WO2022071790A1 - System and method for text processing - Google Patents

System and method for text processing

Info

Publication number
WO2022071790A1
WO2022071790A1 PCT/MY2020/050177 MY2020050177W WO2022071790A1 WO 2022071790 A1 WO2022071790 A1 WO 2022071790A1 MY 2020050177 W MY2020050177 W MY 2020050177W WO 2022071790 A1 WO2022071790 A1 WO 2022071790A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
text content
conversations
reply
questions
Prior art date
Application number
PCT/MY2020/050177
Other languages
French (fr)
Inventor
Mohammad Arshi SALOOT
Duc Nghia PHAM
Original Assignee
Mimos Berhad
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mimos Berhad
Publication of WO2022071790A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/332: Query formulation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri

Definitions

  • the present invention relates broadly to the field of text processing. More particularly, the present invention relates to a system and method for processing text for generating text content that reflects a user’s opinion.
  • chatbots have been in use for chatting with a user or customer in providing solutions to issues raised.
  • conversations are one-sided, wherein the user is provided with a set of options and when the user selects one, a preset sentence or paragraph is provided as a reply.
  • the user is allowed to enter queries in the form of questions or sentences and the chatbots would extract keywords and reply to the user with the information related to the extracted keywords, wherein the information is extracted from a database based on the keywords.
  • United States Patent No. US 8,655,889 B2 discloses an autonomous blog engine capable of autonomous generation of a blog, wherein whenever a picture is captured by a mobile phone, the mobile application determines a place of interest captured in the picture. Based on the determined place of interest, one or more pre-stored knowledge items including information on the place of interest are pulled from a database, and a blog entry on the place of interest is autonomously compiled and published along with the captured picture.
  • the present disclosure proposes a system and method for text processing.
  • the system comprises a display unit, an input unit, a storage unit, a processing unit and a publishing unit.
  • the display unit presents a plurality of multi-option questions and corresponding options to a user, wherein one or more options are selectable as an answer to each question.
  • the input unit receives a user reply with respect to each question, wherein the user reply includes one or more options selected by the user.
  • the storage unit stores a user log, wherein the user log includes one or more conversations participated by the user.
  • the processing unit processes the user reply and generates a text content based on the user reply and the user log.
  • the processing unit includes a sentence conversion module, an extraction module, a weighing module, a comparison module and a content generation module.
  • the sentence conversion module converts each question and corresponding user reply into one or more declarative sentences based on a linguistic knowledge base and a linguistic ontological database.
  • the extraction module extracts one or more keywords from each declarative sentence and extracts one or more conversations from the user log based on the keywords using robotic process automation (RPA).
  • RPA: robotic process automation
  • the weighing module weighs the declarative sentences and the conversations based on keywords present in the declarative sentences and the conversations, respectively.
  • the comparison module compares weights of each declarative sentence and the corresponding conversations to determine the most similar conversation.
  • the content generation module generates the text content based on the most similar conversation.
  • the display unit presents the generated text content to the user and the input unit receives a user selection with respect to the generated text content.
  • the publishing unit publishes the generated text content as the user’s opinion based on the user selection, wherein the generated text content is published if the user selection includes an approval for publishing the generated text content.
  • the method comprises the steps of: presenting a plurality of multi-option questions and corresponding options to a user, wherein one or more options are selectable as an answer to each question, receiving a user reply with respect to each question, wherein the user reply includes one or more options selected by the user, processing the user reply for generating a text content based on the user reply and a user log, wherein one or more conversations participated by the user are stored in a storage unit as the user log, presenting the generated text content to the user, receiving a user selection with respect to the generated text content, and publishing the generated text content as the user’s opinion based on the user selection using a publishing unit, wherein the generated text content is published if said user selection includes an approval for publishing said generated text content.
  • By converting the questions and corresponding user replies into the declarative sentences, the present invention is able to understand the pattern of an actual opinion with respect to the questions. Since the final text is created based on the conversations closest to the declarative sentences, the present invention is capable of generating text content in a simple and effective manner, wherein the text content reflects the user’s opinion with respect to a specific topic or query.
  • FIGURE 1 shows a block diagram of the system for text processing, in accordance with an exemplary embodiment of the present invention.
  • FIGURE 2 shows a flow diagram of the method for text processing, in accordance with an exemplary embodiment of the present invention.
  • FIGURE 1 shows a block representation of the system for text processing, in accordance with an exemplary embodiment of the present invention.
  • the system (10) comprises a display unit (11), an input unit (12), a storage unit (13), a processing unit (14) and a publishing unit (15).
  • the display unit (11) presents a plurality of multi-option questions and corresponding options to a user, wherein one or more options are selectable as an answer to each question.
  • the input unit (12) receives a user reply with respect to each question, wherein the user reply includes one or more options selected by the user.
  • the display unit (11) and the input unit (12) are integrated into a user device such as a smartphone, tablet computer, laptop computer, desktop computer or any other computing device capable of executing a mobile application or a web application.
  • the user device may be in the form of an automated teller machine (ATM), kiosk, point of sale (POS) device and the like.
  • ATM: automated teller machine
  • POS: point of sale
  • the storage unit (13) stores a user log, wherein the user log includes one or more conversations participated by the user.
  • the storage unit (13) is a remote database wirelessly connected to the user device.
  • the storage unit (13) is a local memory device residing in the user device.
  • the conversations may include, but are not limited to, textual data, audio data, still image data, clip art data and moving image data.
  • the system (10) may be connected to the user’s email account, social media account, messaging account and storage folders within the user device for accumulating the user conversations through these means.
  • the processing unit (14) processes the user reply to each question and generates a text content based on the user reply and the user log.
  • the processing unit (14) includes a sentence conversion module (16), an extraction module (17), a weighing module (18), a comparison module (19) and a content generation module (20).
  • the sentence conversion module (16) converts each question and corresponding user reply into one or more declarative sentences based on commonsense knowledge bases, such as ConceptNet from MIT Media Lab, and linguistic ontological databases, such as WordNet provided by Princeton University and DBpedia from OpenLink.
  • the sentence conversion module (16) converts the questions and user replies into the declarative sentences by identifying different parts of the questions and user replies. Additionally, the sentence conversion module (16) may also identify a type of each identified part of the questions and user replies, wherein the type of parts includes noun, pronoun, verb, adverb, adjective, conjunction or auxiliary verb. Furthermore, the sentence conversion module (16) generates one or more synonyms, hyponyms and hypernyms for each identified part of the questions and user replies.
  • the extraction module (17) extracts one or more keywords from each declarative sentence and extracts one or more conversations from the user log based on the keywords using robotic process automation (RPA).
  • GUI: graphical user interface
  • the weighing module (18) weighs the declarative sentences and the extracted conversations based on keywords present in the declarative sentences and the conversations, respectively.
  • Each generated declarative sentence receives a weight, which is determined based on the distance between the keywords and their synonyms, hyponyms and hypernyms in the knowledge base.
  • the weight of each generated sentence is inversely related to the number of iterations.
  • the table below shows the generated sentences for a two-choice question: Should governments spend more on health or on education? Sentences No. 1 and 2 are generated in the first iteration; thus, they have the highest weightage. However, no user content is similar to these sentences. The next sentences, No. 3 and 4, have a very low similarity to the user’s content but have a weightage of 0.5.
  • the final answer generating module uses Σ (weight × similarity score) for each question option: education and health. In this example, the user believes that the government should spend more on education because the health option gained a total score of 0.19, and the education option obtained a score of 0.24.
  • the comparison module (19) compares weights of each declarative sentence and the corresponding conversations to determine the most similar conversation.
  • the content generation module (20) generates the text content based on the most similar conversation.
  • the comparison module (19) determines a conversation as the most similar conversation, if a difference between weights of the conversation and the corresponding declarative sentence is less than a threshold.
  • the comparison module (19) decreases the threshold by a predetermined value and then repeats the comparison process.
  • the comparison module (19) may determine the most similar conversation by comparing the weights of the conversations, wherein the conversation with the highest weight is determined as the most similar conversation.
  • the display unit (11) presents the generated text content to the user, and the input unit (12) receives a user selection with respect to the presented text content.
  • a publishing unit (15) publishes the presented text content as the user’s opinion based on the user selection, wherein the presented text content is published if the user selection includes an approval for publishing the presented text content.
  • the publishing unit (15) publishes the presented text content in a web page.
  • the processing unit (14) stops the publishing unit (15) from publishing the presented text content.
  • the present invention identifies a pattern of a potential opinion that may actually be provided by the user with respect to the questions. Since the final text is created based on the conversations closest to the declarative sentences, the present invention is capable of generating text content in a simple and effective manner, wherein the text content reflects the user’s opinion with respect to a specific topic or query.
  • FIGURE 2 shows a flow diagram of the method for text processing, in accordance with an exemplary embodiment of the present invention.
  • the method (100) comprises the steps of: presenting, at a display unit, a plurality of multi-option questions and corresponding options to a user (101), wherein one or more options are selectable as an answer to each question, receiving, at an input unit, a user reply with respect to each question (102), wherein the user reply includes one or more options selected by the user, processing, at a processing unit, the user reply for generating a text content based on the user reply and a user log (103), wherein one or more conversations participated by the user are stored in a storage unit as the user log, presenting, at the display unit, the generated text content to the user (104), receiving, at the input unit, a user selection with respect to the created text content (105), and publishing the generated text content as the user’s opinion based on the user selection using a publishing unit (106), wherein the generated text content is published if the user selection includes an
  • the conversations may include, but are not limited to, textual data, audio data, still image data, clip art data and moving image data.
  • the conversations may be accumulated from the user’s email account, social media account, messaging account and storage folders within a user device including the display unit and the input unit.
  • Each question and corresponding user reply are converted into one or more declarative sentences using a sentence conversion module of the processing unit based on commonsense knowledge bases (e.g. ConceptNet) and linguistic ontological databases (e.g. WordNet).
  • different parts of the questions and user replies and type of the parts of the questions and user replies are identified, wherein the type of parts includes noun, pronoun, verb, adverb, adjective, conjunction or auxiliary verb.
  • one or more synonyms, hyponyms and hypernyms for each identified part of the questions and user replies are generated by the sentence conversion module.
  • One or more keywords are extracted from each declarative sentence using an extraction module of the processing unit. Furthermore, one or more conversations are extracted from the user log by the extraction module based on the keywords using robotic process automation (RPA). Each of the declarative sentences and the corresponding extracted conversations is weighed using a weighing module of the processing unit based on keywords present in the declarative sentences and the conversations, respectively.
  • Weights of each declarative sentence and the corresponding conversations are compared using a comparison module of the processing unit to determine the most similar conversation.
  • the text content is generated by a content generation module of the processing unit based on the most similar conversation.
  • a conversation is determined as the most similar conversation, if a difference between weights of the conversation and the corresponding declarative sentence is less than a threshold.
  • the threshold is decreased by a predetermined value and then the comparison process is repeated to determine the most similar conversation.
  • the most similar conversation may also be determined by comparing the weights of the conversations, wherein the conversation with the highest weight is determined as the most similar conversation.
  • By converting the questions and corresponding user replies into the declarative sentences, the present invention identifies a pattern of a potential opinion that may actually be provided by the user with respect to the questions. Since the final text is created based on the conversations closest to the declarative sentences, the present invention is capable of generating text content in a simple and effective manner, wherein the text content reflects the user’s opinion with respect to a specific topic or query.
  • the terminology used herein is for the purpose of describing particular example embodiments only and is not intended to be limiting. As used herein, the singular forms “a”, “an” and “the” may be intended to include the plural forms as well, unless the context clearly indicates otherwise.

Abstract

The present invention relates to a system and method for text processing. The system (10) comprises a display unit (11) for presenting a plurality of multi-option questions and corresponding options to a user, wherein multiple options are selectable as an answer to each question. An input unit (12) receives a user reply with respect to each question, wherein the user reply includes one or more options selected by the user. A storage unit (13) stores a user log including conversations participated by the user. A processing unit (14) processes the user reply and generates a text content based on the user reply and the user log. The display unit (11) presents the generated text content to the user, and the input unit (12) receives a user selection with respect to the created content. A publishing unit (15) publishes the created content as the user's opinion based on the user selection.

Description

SYSTEM AND METHOD FOR TEXT PROCESSING
FIELD OF THE DISCLOSURE
The present invention relates broadly to the field of text processing. More particularly, the present invention relates to a system and method for processing text for generating text content that reflects a user’s opinion.
BACKGROUND
Developments have been made to enable computer software applications to impersonate a human in real-time written conversation. For example, chatbots have been in use for chatting with a user or customer in providing solutions to issues raised. Mostly, such conversations are one-sided, wherein the user is provided with a set of options and when the user selects one, a preset sentence or paragraph is provided as a reply. In some cases, the user is allowed to enter queries in the form of questions or sentences and the chatbots would extract keywords and reply to the user with the information related to the extracted keywords, wherein the information is extracted from a database based on the keywords.
Even though such developments are useful to some extent in quickly addressing customer queries, it would be obvious to the customer that the queries are being dealt with by machines, which makes the customer uncomfortable. United States Patent No. US 8,655,889 B2 discloses an autonomous blog engine capable of autonomous generation of a blog, wherein whenever a picture is captured by a mobile phone, the mobile application determines a place of interest captured in the picture. Based on the determined place of interest, one or more pre-stored knowledge items including information on the place of interest are pulled from a database, and a blog entry on the place of interest is autonomously compiled and published along with the captured picture. However, this is a mere compilation of pre-existing information, and the opinion of the user cannot be expressed in this approach. The technical paper titled “Towards Automatic Generation of Product Reviews from Aspect-Sentiment Scores”, Zang et al., discloses an improved method of generating a user review. This approach introduces a hierarchical structure with aligned attention in a Long Short-Term Memory (LSTM) decoder for generating descriptive Chinese reviews from aspect-sentiment scores representing users’ opinions. Scores for different aspects of a product, i.e. a car, are received from a user and reviews are generated from pre-stored review contents based on the scores. Even though this approach enables automatic generation of text content that expresses the user’s opinion, it is very limited as it requires pre-written text content from the same field of technology.
Hence, there is still a need in the art for a system and method for processing text for creating text content in a simple and effective manner without a need for structured text information from the same field of technology, wherein the text content reflects the user’s opinion with respect to a specific topic or query.
SUMMARY
The present disclosure proposes a system and method for text processing. The system comprises a display unit, an input unit, a storage unit, a processing unit and a publishing unit. The display unit presents a plurality of multi-option questions and corresponding options to a user, wherein one or more options are selectable as an answer to each question. The input unit receives a user reply with respect to each question, wherein the user reply includes one or more options selected by the user. The storage unit stores a user log, wherein the user log includes one or more conversations participated by the user. The processing unit processes the user reply and generates a text content based on the user reply and the user log.
In one aspect of the present invention, the processing unit includes a sentence conversion module, an extraction module, a weighing module, a comparison module and a content generation module. The sentence conversion module converts each question and corresponding user reply into one or more declarative sentences based on a linguistic knowledge base and a linguistic ontological database. The extraction module extracts one or more keywords from each declarative sentence and extracts one or more conversations from the user log based on the keywords using robotic process automation (RPA).
The weighing module weighs the declarative sentences and the conversations based on keywords present in the declarative sentences and the conversations, respectively. The comparison module compares weights of each declarative sentence and the corresponding conversations to determine the most similar conversation. The content generation module generates the text content based on the most similar conversation.
The display unit presents the generated text content to the user and the input unit receives a user selection with respect to the generated text content. The publishing unit publishes the generated text content as the user’s opinion based on the user selection, wherein the generated text content is published if the user selection includes an approval for publishing the generated text content.
In another aspect of the present invention, the method comprises the steps of: presenting a plurality of multi-option questions and corresponding options to a user, wherein one or more options are selectable as an answer to each question, receiving a user reply with respect to each question, wherein the user reply includes one or more options selected by the user, processing the user reply for generating a text content based on the user reply and a user log, wherein one or more conversations participated by the user are stored in a storage unit as the user log, presenting the generated text content to the user, receiving a user selection with respect to the generated text content, and publishing the generated text content as the user’s opinion based on the user selection using a publishing unit, wherein the generated text content is published if said user selection includes an approval for publishing said generated text content.
By converting the questions and corresponding user replies into the declarative sentences, the present invention is able to understand the pattern of an actual opinion with respect to the questions. Since the final text is created based on the conversations closest to the declarative sentences, the present invention is capable of generating text content in a simple and effective manner, wherein the text content reflects the user’s opinion with respect to a specific topic or query.
Various objects, features, aspects and advantages of the inventive subject matter will become more apparent from the following detailed description of preferred embodiments, along with the accompanying drawing figures in which like numerals represent like components.
BRIEF DESCRIPTION OF THE ACCOMPANYING DRAWINGS
In the figures, similar components and/or features may have the same reference numerals. Further, various components of the same type may be distinguished by following the reference numerals with a second numeral that distinguishes among the similar components. If only the first reference numeral is used in the specification, the description is applicable to any one of the similar components having the same first reference numeral irrespective of the second reference numeral.
FIGURE 1 shows a block diagram of the system for text processing, in accordance with an exemplary embodiment of the present invention.
FIGURE 2 shows a flow diagram of the method for text processing, in accordance with an exemplary embodiment of the present invention.
DETAILED DESCRIPTION
In accordance with the present disclosure, there is provided a system and method for text processing, which will now be described with reference to the embodiments shown in the accompanying drawings. The embodiments do not limit the scope and ambit of the disclosure. The description relates purely to the embodiments and suggested applications thereof.
The embodiments herein and the various features and advantageous details thereof are explained with reference to the non-limiting embodiment in the following description. Descriptions of well-known components and processes are omitted so as to not unnecessarily obscure the embodiments herein. The examples used herein are intended merely to facilitate an understanding of ways in which the embodiments herein may be practiced and to further enable those of skill in the art to practice the embodiment herein. Accordingly, the description should not be construed as limiting the scope of the embodiment herein.
The description hereinafter, of the specific embodiment will so fully reveal the general nature of the embodiments herein that others can, by applying current knowledge, readily modify or adapt or perform both for various applications such specific embodiment without departing from the generic concept, and, therefore, such adaptations and modifications should and are intended to be comprehended within the meaning and range of equivalents of the disclosed embodiments. It is to be understood that the phraseology or terminology employed herein is for the purpose of description and not of limitation.
FIGURE 1 shows a block representation of the system for text processing, in accordance with an exemplary embodiment of the present invention. The system (10) comprises a display unit (11), an input unit (12), a storage unit (13), a processing unit (14) and a publishing unit (15). The display unit (11) presents a plurality of multi-option questions and corresponding options to a user, wherein one or more options are selectable as an answer to each question.
The input unit (12) receives a user reply with respect to each question, wherein the user reply includes one or more options selected by the user. In a preferred embodiment, the display unit (11) and the input unit (12) are integrated into a user device such as a smartphone, tablet computer, laptop computer, desktop computer or any other computing device capable of executing a mobile application or a web application. Alternatively, the user device may be in the form of an automated teller machine (ATM), kiosk, point of sale (POS) device and the like.
The storage unit (13) stores a user log, wherein the user log includes one or more conversations participated by the user. Preferably, the storage unit (13) is a remote database wirelessly connected to the user device. Alternatively, the storage unit (13) is a local memory device residing in the user device. The conversations may include, but are not limited to, textual data, audio data, still image data, clip art data and moving image data. The system (10) may be connected to the user’s email account, social media account, messaging account and storage folders within the user device for accumulating the user conversations through these means.
The processing unit (14) processes the user reply to each question and generates a text content based on the user reply and the user log. Preferably, the processing unit (14) includes a sentence conversion module (16), an extraction module (17), a weighing module (18), a comparison module (19) and a content generation module (20). The sentence conversion module (16) converts each question and corresponding user reply into one or more declarative sentences based on commonsense knowledge bases, such as ConceptNet from MIT Media Lab, and linguistic ontological databases, such as WordNet provided by Princeton University and DBpedia from OpenLink.
The sentence conversion module (16) converts the questions and user replies into the declarative sentences by identifying different parts of the questions and user replies. Additionally, the sentence conversion module (16) may also identify a type of each identified part of the questions and user replies, wherein the type of parts includes noun, pronoun, verb, adverb, adjective, conjunction or auxiliary verb. Furthermore, the sentence conversion module (16) generates one or more synonyms, hyponyms and hypernyms for each identified part of the questions and user replies.
Suppose the question “Do you enjoy cooking food?” and the options “a) Yes, I enjoy; b) No, I don’t; c) Sometimes” are presented to the user. If the user selects option ‘a’, the question and the user reply are converted into the declarative sentence “I enjoy cooking food”. Similarly, the declarative sentences for options ‘b’ and ‘c’ may be “I don’t enjoy cooking food” and “I sometimes enjoy cooking food”, respectively.
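The conversion in this example can be illustrated with a minimal sketch. It is not the disclosed sentence conversion module (16), which relies on knowledge bases and part-of-speech identification; it is a toy rule that only handles questions of the form “Do you ...?”. The function name `to_declarative` and the rule of treating any option other than yes/no as an adverb are assumptions made for illustration.

```python
import re


def to_declarative(question: str, option: str) -> str:
    """Combine a 'Do you ...?' question and a selected option into a statement.

    Hypothetical helper: a hand-written rule, not the knowledge-base driven
    conversion described in the text.
    """
    match = re.fullmatch(r"Do you (.+)\?", question.strip())
    if match is None:
        raise ValueError("this toy rule only handles 'Do you ...?' questions")
    predicate = match.group(1)  # e.g. "enjoy cooking food"

    # Use only the first word of the option ("yes", "no", "sometimes", ...).
    first_word = re.split(r"[,\s]+", option.strip().lower(), maxsplit=1)[0]
    if first_word == "yes":
        return f"I {predicate}"
    if first_word == "no":
        return f"I don't {predicate}"
    # Assumption: any other option is an adverb placed before the verb.
    return f"I {first_word} {predicate}"


question = "Do you enjoy cooking food?"
for option in ("Yes, I enjoy", "No, I don't", "Sometimes"):
    print(to_declarative(question, option))
```

A real implementation would need to cover other question forms and would draw the rewriting rules from the linguistic resources named above.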
The extraction module (17) extracts one or more keywords from each declarative sentence and extracts one or more conversations from the user log based on the keywords using robotic process automation (RPA). RPA, sometimes referred to as software robotics, is a software automation tool capable of developing a set of actions by observing a user performing those actions in a graphical user interface (GUI), and then automatically repeating those actions directly in the GUI.
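As a rough illustration of the two extraction steps, the sketch below pulls keywords out of a declarative sentence and then selects the conversations that mention any of them. It deliberately replaces RPA with plain in-memory string matching, and the stop-word list, the function names and the sample log are all assumptions rather than part of the disclosure.

```python
import re

# Assumed, deliberately tiny stop-word list; a real system would use a fuller one.
STOPWORDS = {"i", "a", "an", "the", "do", "don't", "to", "of", "and", "or",
             "on", "in", "is", "it", "you"}


def _tokens(text: str) -> list:
    """Lower-case word tokens of a text."""
    return re.findall(r"[a-z']+", text.lower())


def extract_keywords(sentence: str) -> list:
    """Keep the tokens of a declarative sentence that are not stop words."""
    return [token for token in _tokens(sentence) if token not in STOPWORDS]


def find_conversations(user_log: list, keywords: list) -> list:
    """Return the conversations that contain at least one keyword."""
    wanted = set(keywords)
    return [entry for entry in user_log if wanted & set(_tokens(entry))]


# Hypothetical user log for illustration only.
user_log = ["Cooking pasta tonight!", "Traffic was awful this morning"]
keywords = extract_keywords("I enjoy cooking food")
print(keywords)
print(find_conversations(user_log, keywords))
```

Exact token matching is the simplest possible choice here; matching on synonyms, hyponyms and hypernyms, as the sentence conversion module generates them, would retrieve more conversations.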
The weighing module (18) weighs the declarative sentences and the extracted conversations based on keywords present in the declarative sentences and the conversations, respectively. Each generated declarative sentence receives a weight, which is determined based on the distance between the keywords and their synonyms, hyponyms and hypernyms in the knowledge base. The weight is generated using the equation weight = 1/x, where x is the number of times that the comparison module (19) has rejected the generated sentences.
In other words, the weight of each generated sentence is inversely related to the number of iterations. For example, the table below shows the generated sentences for a two-choice question: Should governments spend more on health or on education? Sentences No. 1 and 2 are generated in the first iteration; thus, they have the highest weightage. However, no user content is similar to these sentences. The next sentences, No. 3 and 4, have a very low similarity to the user’s content but have a weightage of 0.5. After generating enough declarative sentences, the final answer generating module uses Σ (weight × similarity score) for each question option: education and health. In this example, the user believes that the government should spend more on education because the health option gained a total score of 0.19, and the education option obtained a score of 0.24.
Table 1. Example declarative sentences and the corresponding weights and similarity scores
Figure imgf000009_0001
Table 1 continued
Figure imgf000010_0001
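The scoring described above, weight = 1/x combined with Σ (weight × similarity score) per option, can be sketched in Python as follows. Two assumptions are made here: x is treated as the 1-based iteration in which a sentence was generated, which matches the weights of 1 and 0.5 quoted for the first and second iterations, and the similarity values are purely illustrative numbers that are not taken from Table 1. The function names are hypothetical.

```python
def iteration_weight(iteration):
    """weight = 1/x, with x assumed to be the 1-based generation iteration."""
    if iteration < 1:
        raise ValueError("iteration must be at least 1")
    return 1.0 / iteration


def option_scores(sentences):
    """Sum weight * similarity per option.

    `sentences` is an iterable of (option, iteration, similarity) tuples.
    """
    totals = {}
    for option, iteration, similarity in sentences:
        totals[option] = totals.get(option, 0.0) + iteration_weight(iteration) * similarity
    return totals


# Illustrative values only (NOT the contents of Table 1).
generated = [
    ("health", 1, 0.0),
    ("education", 1, 0.0),
    ("health", 2, 0.10),
    ("education", 2, 0.20),
    ("health", 3, 0.30),
    ("education", 3, 0.30),
]
scores = option_scores(generated)
best = max(scores, key=scores.get)
print(scores, best)
```

With these made-up inputs the education option accumulates the larger total, mirroring the kind of comparison the example in the description makes between its two options.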
The comparison module (19) compares weights of each declarative sentence and the corresponding conversations to determine the most similar conversation. The content generation module (20) generates the text content based on the most similar conversation. Preferably, the comparison module (19) determines a conversation as the most similar conversation, if a difference between weights of the conversation and the corresponding declarative sentence is less than a threshold.
If none of the extracted conversations is determined as the most similar conversation, the comparison module (19) decreases the threshold by a predetermined value and then repeats the comparison process. Alternatively, the comparison module (19) may determine the most similar conversation by comparing the weights of the conversations, wherein the conversation with the highest weight is determined as the most similar conversation.
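The two selection strategies can be sketched in Python as below. The function names and the sample weights are hypothetical. The sketch assumes that, when several conversations fall under the threshold, the one with the smallest weight difference is chosen, which the description does not specify, and it leaves out the iterative threshold adjustment.

```python
def most_similar(sentence_weight, conversation_weights, threshold):
    """Return the conversation whose weight is closest to the sentence weight,
    provided the difference is below the threshold; otherwise return None."""
    if not conversation_weights:
        return None
    best = min(
        conversation_weights,
        key=lambda name: abs(conversation_weights[name] - sentence_weight),
    )
    if abs(conversation_weights[best] - sentence_weight) < threshold:
        return best
    return None


def highest_weight(conversation_weights):
    """Alternative strategy: pick the conversation with the highest weight."""
    if not conversation_weights:
        return None
    return max(conversation_weights, key=conversation_weights.get)


weights = {"c1": 0.9, "c2": 0.55, "c3": 0.2}  # illustrative weights
print(most_similar(0.5, weights, threshold=0.1))
print(most_similar(0.5, weights, threshold=0.01))
print(highest_weight(weights))
```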
The display unit (11) presents the generated text content to the user, and the input unit (12) receives a user selection with respect to the presented text content. A publishing unit (15) publishes the presented text content as the user’s opinion based on the user selection, wherein the presented text content is published if the user selection includes an approval for publishing the presented text content. Preferably, the publishing unit (15) publishes the presented text content in a web page.
If the user selection includes a refusal to publish the presented text content, the processing unit (14) stops the publishing unit (15) from publishing the presented text content. By converting the questions and corresponding user replies into the declarative sentences, the present invention identifies a pattern of a potential opinion that may actually be provided by the user with respect to the questions. Since the final text is created based on the conversations closest to the declarative sentences, the present invention is capable of generating text content in a simple and effective manner, wherein the text content reflects the user’s opinion with respect to a specific topic or query.
FIGURE 2 shows a flow diagram of the method for text processing, in accordance with an exemplary embodiment of the present invention. The method (100) comprises the steps of: presenting, at a display unit, a plurality of multi-option questions and corresponding options to a user (101), wherein one or more options are selectable as an answer to each question; receiving, at an input unit, a user reply with respect to each question (102), wherein the user reply includes one or more options selected by the user; processing, at a processing unit, the user reply for generating a text content based on the user reply and a user log (103), wherein one or more conversations participated by the user are stored in a storage unit as the user log; presenting, at the display unit, the generated text content to the user (104); receiving, at the input unit, a user selection with respect to the created text content (105); and publishing the generated text content as the user’s opinion based on the user selection using a publishing unit (106), wherein the generated text content is published if the user selection includes an approval for publishing the created text content.
Preferably, the conversations may include, but are not limited to, textual data, audio data, still image data, clip art data and moving image data. Furthermore, the conversations may be accumulated from the user’s email account, social media account, messaging account and storage folders within a user device including the display unit and the input unit.
Each question and corresponding user reply are converted into one or more declarative sentences using a sentence conversion module of the processing unit based on commonsense knowledge-bases (e.g. ConceptNet) and linguistic ontological databases (e.g. WordNet). During the conversion process, different parts of the questions and user replies and the type of each part are identified, wherein the type of parts includes noun, pronoun, verb, adverb, adjective, conjunction or auxiliary verb. Furthermore, one or more synonyms, hyponyms and hypernyms for each identified part of the questions and user replies are generated by the sentence conversion module.
One or more keywords are extracted from each declarative sentence using an extraction module of the processing unit. Furthermore, one or more conversations are extracted from the user log by the extraction module based on the keywords using robotic process automation (RPA). Each of the declarative sentences and the corresponding extracted conversations is weighed using a weighing module of the processing unit based on keywords present in the declarative sentences and the conversations, respectively.
Weights of each declarative sentence and the corresponding conversations are compared using a comparison module of the processing unit to determine the most similar conversation. The text content is generated by a content generation module of the processing unit based on the most similar conversation. Preferably, a conversation is determined as the most similar conversation, if a difference between weights of the conversation and the corresponding declarative sentence is less than a threshold.
If none of the extracted conversations is determined as the most similar conversation, the threshold is decreased by a predetermined value and then the comparison process is repeated to determine the most similar conversation. Alternatively, the most similar conversation may also be determined by comparing the weights of the conversations, wherein the conversation with the highest weight is determined as the most similar conversation.
By converting the questions and corresponding user replies into the declarative sentences, the present invention identifies a pattern of a potential opinion that may actually be provided by the user with respect to the questions. Since the final text is created based on the conversations closest to the declarative sentences, the present invention is capable of generating text content in a simple and effective manner, wherein the text content reflects the user’s opinion with respect to a specific topic or query. The terminology used herein is for the purpose of describing particular example embodiments only and is not intended to be limiting. As used herein, the singular forms "a", "an" and "the" may be intended to include the plural forms as well, unless the context clearly indicates otherwise.
The terms "comprises," "comprising," “including,” and “having,” are inclusive and therefore specify the presence of stated features, integers, steps, operations, elements, or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, or groups thereof.
The use of the expression “at least” or “at least one” suggests the use of one or more elements, as the use may be in one of the embodiments to achieve one or more of the desired objects or results.
While the foregoing describes various embodiments of the invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof. The scope of the invention is determined by the claims that follow. The invention is not limited to the described embodiments, versions or examples, which are included to enable a person having ordinary skill in the art to make and use the invention when combined with information and knowledge available to the person having ordinary skill in the art.

Claims

1. A system (10) for text processing, comprising:
i. at least one display unit (11) for presenting a plurality of multi-option questions and corresponding options to a user, wherein one or more options are selectable as an answer to each question;
ii. at least one input unit (12) for receiving a user reply with respect to each question, wherein said user reply includes one or more options selected by said user;
iii. at least one storage unit (13) for storing user log, wherein said user log includes one or more conversations participated by said user;
iv. at least one processing unit (14) for processing said user reply and for generating a text content based on said user reply and said user log, wherein said display unit (11) presents said generated text content to said user and said input unit (12) receives a user selection with respect to said created text content; and
v. at least one publishing unit (15) for publishing said created text content as said user’s opinion based on said user selection, wherein said created text content is published if said user selection includes an approval for publishing said created text content,
characterized in that said processing unit (14) includes:
- at least one sentence conversion module (16) for converting each question and corresponding user reply into one or more declarative sentences based on at least one of a commonsense knowledge-base and a linguistic ontological database;
- at least one extraction module (17) for extracting one or more keywords from each declarative sentence and for extracting one or more conversations from said user log based on said keywords using robotic process automation, RPA;
- at least one weighing module (18) for weighing said declarative sentences and said conversations based on keywords present in said declarative sentences and said conversations, respectively;
- at least one comparison module (19) for comparing weights of each declarative sentence and said corresponding conversations to determine the most similar conversation; and
- at least one content generation module (20) for generating said text content based on the most similar conversation.

2. The system (10) of claim 1, wherein said sentence conversion module (16) converts said questions and user replies into said declarative sentences by identifying different parts of said questions and user replies.

3. The system (10) of claim 2, wherein said sentence conversion module (16) identifies a type of each identified part of said questions and user replies.

4. The system (10) of claim 3, wherein said type of identified parts includes noun, pronoun, verb, adverb, adjective, conjunction or auxiliary verb.

5. The system (10) of claim 2, wherein said sentence conversion module (16) generates one or more synonyms, hyponyms and hypernyms for each identified part of said questions and user replies.

6. The system (10) of claim 1, wherein said conversations include at least one of textual data, audio data, still image data, clip art data and moving image data.

7. A method (100) for text processing, comprising the steps of:
i. presenting, at at least one display unit, a plurality of multi-option questions and corresponding options to a user (101), wherein one or more options are selectable as an answer to each question;
ii. receiving, at at least one input unit, a user reply with respect to each question (102), wherein said user reply includes one or more options selected by said user;
iii. processing, at at least one processing unit, said user reply for generating a text content based on said user reply and a user log (103), wherein one or more conversations participated by said user are stored in a storage unit as said user log;
iv. presenting, at said display unit, said generated text content to said user (104);
v. receiving, at said input unit, a user selection with respect to said created text content (105); and
vi. publishing said created text content as said user’s opinion based on said user selection using at least one publishing unit (106), wherein said created text content is published if said user selection includes an approval for publishing said created text content,
characterized in that said step of processing said user reply includes:
- converting, at a sentence conversion module of said processing unit, each question and corresponding user reply into one or more declarative sentences based on at least one of a commonsense knowledge-base and a linguistic ontological database;
- extracting, at an extraction module of said processing unit, one or more keywords from each declarative sentence;
- extracting, at said extraction module, one or more conversations from said user log based on said keywords using robotic process automation, RPA;
- weighing said declarative sentences and said conversations based on keywords present in said declarative sentences and said conversations, respectively, using a weighing module of said processing unit;
- comparing weights of each declarative sentence and said corresponding conversations using a comparison module of said processing unit to determine the most similar conversation; and
- generating said text content based on the most similar conversation using a content generation module of said processing unit.

8. The method (100) of claim 7, wherein said step of converting said questions and corresponding user replies into one or more declarative sentences includes:
- identifying different parts of said questions and user replies;
- identifying a type of each identified part of said questions and user replies; and
- generating one or more synonyms, hyponyms and hypernyms for each identified part of said questions and user replies.

9. The method (100) of claim 8, wherein said type of identified parts includes noun, pronoun, verb, adverb, adjective, conjunction or auxiliary verb.

10. The method (100) of claim 7, wherein said conversations include at least one of textual data, audio data, still image data, clip art data and moving image data.
PCT/MY2020/050177 2020-09-30 2020-11-30 System and method for text processing WO2022071790A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
MYPI2020005114 2020-09-30
MYPI2020005114 2020-09-30

Publications (1)

Publication Number Publication Date
WO2022071790A1 true WO2022071790A1 (en) 2022-04-07

Family

ID=80951615

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/MY2020/050177 WO2022071790A1 (en) 2020-09-30 2020-11-30 System and method for text processing

Country Status (1)

Country Link
WO (1) WO2022071790A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116860952A (en) * 2023-09-04 2023-10-10 富璟科技(深圳)有限公司 RPA intelligent response processing method and system based on artificial intelligence

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5884248A (en) * 1996-04-10 1999-03-16 Casio Computer Co., Ltd. Build message communication system utilizing data tables containing message defining data and corresponding codes
KR20070102267A (en) * 2006-04-14 2007-10-18 학교법인 포항공과대학교 Dialog management system, and method of managing dialog using example-based dialog modeling technique
KR20160147303A (en) * 2015-06-15 2016-12-23 포항공과대학교 산학협력단 Method for dialog management based on multi-user using memory capacity and apparatus for performing the method
KR20190090636A (en) * 2018-01-25 2019-08-02 경희대학교 산학협력단 Method for automatically editing pattern of document
JP2019194759A (en) * 2018-05-01 2019-11-07 国立研究開発法人情報通信研究機構 Dialogue system reinforcement device and computer program

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116860952A (en) * 2023-09-04 2023-10-10 富璟科技(深圳)有限公司 RPA intelligent response processing method and system based on artificial intelligence
CN116860952B (en) * 2023-09-04 2023-11-03 富璟科技(深圳)有限公司 RPA intelligent response processing method and system based on artificial intelligence

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20956400

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20956400

Country of ref document: EP

Kind code of ref document: A1