WO2021235225A1 - Information processing device and information processing method - Google Patents

Information processing device and information processing method

Info

Publication number
WO2021235225A1
Authority
WO
WIPO (PCT)
Prior art keywords
dialogue
knowledge
sentence
response
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2021/017336
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
文規 本間
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Group Corp
Original Assignee
Sony Group Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Group Corp
Priority to JP2022524370A (patent JP7718413B2, ja)
Publication of WO2021235225A1 (ja)
Anticipated expiration
Priority to JP2025122968A (JP2025137717A, ja)
Legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/80 Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
    • G06F16/83 Querying
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/55 Rule-based translation
    • G06F40/56 Natural language generation

Definitions

  • This disclosure relates to an information processing device and an information processing method.
  • dialogue systems for responding appropriately to users in response to utterances received from users and user actions are widely used.
  • In the dialogue system, it is required to respond to the user's utterance with appropriate speed and content.
  • the knowledge base is a database of knowledge accumulated in a format that can be used by a computer.
  • the knowledge base is searched according to the user's utterance, and the knowledge obtained from the knowledge base is converted into the response sentence.
  • When the scale of the knowledge base is expanded, it becomes difficult to respond appropriately to the user's utterance, because the search speed of the knowledge base decreases, which delays the response, and because the design of the search rules becomes more complex.
  • The information processing apparatus includes a knowledge object generation unit that extracts a plurality of terms from a semi-structured document and generates a knowledge graph showing the relationships between the extracted terms, and a dialogue pair generation unit that generates, based on the knowledge graph, a dialogue repository including a plurality of dialogue pairs.
  • Each of the plurality of dialogue pairs is a set of a question sentence and a response sentence to the question sentence. The first term included in the question sentence and the second term included in the response sentence are two terms in a connected relationship in the knowledge graph.
  • The information processing apparatus includes an acquisition unit that acquires a question sentence from a user, and an inference unit. The inference unit selects, according to the acquired question sentence from the user, a dialogue pair that pairs a question sentence including a first term extracted from a semi-structured document with a response sentence including a second term that is extracted from the semi-structured document and is connected to the first term. It then outputs the response sentence of the selected dialogue pair as the response to the question sentence from the user.
  • Embodiment 1-1 Outline of information processing according to the embodiment 1-2.
  • FIGS. 1 to 4 are diagrams showing an outline of information processing executed by the information processing system 1 according to the embodiment.
  • FIG. 3 is a diagram showing an outline of input / output data in information processing according to the embodiment.
  • FIG. 4 is a diagram showing an outline of processing when a knowledge base KB specialized for dialogue is not generated, which is different from the information processing according to the embodiment.
  • the information processing includes a process S1 for generating a dialogue repository DR from a semi-structured document D1 and a process S2 for generating a response R to a question (utterance sentence U) from a user.
  • the process S1 for generating the dialogue repository DR is executed prior to the process S2 for generating the response R to the question from the user.
  • As the semi-structured document D1 (data set) used to generate the dialogue repository DR, contract documents such as contracts, legal documents, medical books such as home medicine, and text documents such as textbooks can be used as appropriate.
  • The process S1 for generating the dialogue repository DR is a process for automatically generating the dialogue repository DR, which includes a plurality of dialogue pairs DP, from the semi-structured document D1. More specifically, the process S1 includes a process S11 for generating a "dialogue-specific" knowledge base KB from the semi-structured document D1 and a process S12 for generating a group of dialogue pairs DP (dialogue pair group) from the generated knowledge base KB.
  • The process S11 for generating the knowledge base KB is a process for automatically generating a knowledge object such as a knowledge graph, that is, the knowledge base KB, from the semi-structured document D1. More specifically, in the process S11, as shown in FIG. 1, a knowledge graph (knowledge object) is created by extracting a plurality of terms from the semi-structured document D1 using the model M1 and converting them into knowledge objects. That is, the knowledge graph is the knowledge structure of the semi-structured document D1 and shows the relationships between the plurality of terms extracted from it.
  • Extraction of terms from the semi-structured document D1 and knowledge objectification can be realized by, for example, pattern matching (model M1) for the semi-structured document D1.
  • the extraction of terms from the semi-structured document D1 and the knowledge objectization can be realized by using a machine learning model (model M1).
  • One example of such a machine learning model is Seq2seq using RNN-LSTM (Recurrent Neural Network, Long Short-Term Memory), trained using the data extracted by pattern matching from the semi-structured document D1 as teacher data.
  • DNN: Deep Neural Network
  • the process S12 for generating a dialogue pair group is a process for automatically generating a dialogue repository DR including a plurality of dialogue pair DPs based on the generated knowledge graph, that is, the knowledge base KB.
  • The dialogue pairs DP are generated automatically, based on a template (model M2), from the knowledge base KB constructed as a knowledge graph.
  • Each of the plurality of dialogue pairs DP is a set (QA pair, QA sentence) of an utterance sentence Q1, Q2, Q3 (Q sentence) as a question sentence and a response sentence A1, A2, A3 (A sentence) to that question sentence.
  • The process S1 for generating the dialogue repository DR includes a process of using a DNN (model M3) to convert at least the question sentences (utterance sentences Q1, Q2, Q3) of the dialogue pairs DP into sentence vectors (utterance sentence vectors QV1, QV2, QV3, respectively), that is, sentence vectorization.
  • the generated dialogue repository DR is registered as a dialogue repository of the QA inference model (model M5).
  • the process S2 for generating the response R to the question from the user includes the process S21 for acquiring the question (utterance text, utterance sentence U) from the user.
  • the process S21 for acquiring a question from the user includes a process of acquiring the output of the DNN (model M4) corresponding to the input of the utterance sentence U from the user as the utterance sentence vector UV.
  • the process S2 for generating the response R includes the process S22 for selecting the dialogue pair DP to be output as the response R (response text, response sentence) from the dialogue repository DR to the user.
  • The process S22 for selecting the dialogue pair DP is performed based on a comparison between the sentence vector of the user's utterance (utterance sentence vector UV) and the sentence vectors of the Q sentences of the dialogue repository generated in advance (utterance sentence vectors QV1, QV2, QV3). More specifically, among the utterance sentence vectors QV1, QV2, and QV3 of the dialogue repository DR, the one that is closest to the utterance sentence vector UV and whose similarity is equal to or greater than a threshold value (that is, whose difference is within the range defined by the threshold value) is searched for, and the response sentence of the corresponding dialogue pair DP is selected. In the example shown in FIG. 1, the response sentence A1 corresponding to the utterance sentence vector QV1 is selected.
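The selection in process S22 can be sketched as a nearest-neighbor lookup with a threshold. This is only an illustration: the text does not say which similarity measure is used, so cosine similarity, the function names, and the threshold value 0.8 are all assumptions, and the sentence vectors are assumed to be plain lists of floats already produced by models M3 and M4.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_response(uv, repository, threshold=0.8):
    """Return the response whose question vector is closest to uv.

    repository: list of (question_vector, response_sentence) tuples,
    standing in for the dialogue repository DR (hypothetical layout).
    Returns None when even the closest vector is below the threshold.
    """
    best_score, best_response = -1.0, None
    for qv, response in repository:
        score = cosine_similarity(uv, qv)
        if score > best_score:
            best_score, best_response = score, response
    return best_response if best_score >= threshold else None


# Toy two-dimensional vectors, for illustration only.
repository = [([1.0, 0.0], "A1"), ([0.0, 1.0], "A2")]
print(select_response([0.9, 0.1], repository))
```

Because the search only compares vectors, no search query or search rule has to be designed for the knowledge base at response time.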
  • the process S2 for generating the response R includes the process S23 for outputting the selected response R.
  • In the process S23 for outputting the response R, the response sentence A1 selected as described above is output and presented to the user.
  • The response R to the user can be uniquely determined based on the result of comparing the sentence vectors of the user's utterance (utterance sentence U) and the question sentences (utterance sentences Q1, Q2, Q3) of the dialogue pairs DP.
  • a QA dialogue statement that automatically embodies a part not explicitly described in the semi-structured document D1 (or knowledge base KB).
  • The process S1 for generating the dialogue repository DR generates a dialogue pair DP1 of the knowledge in the contract from the data set DS1, which includes the semi-structured document D1 such as a contract, and also generates, from the data set DS2 containing other general knowledge, a dialogue pair DP2 that further contains that general knowledge.
  • the written word / knowledge converter C1 exemplified in FIG. 2 is an example of the model M1 exemplified in FIG.
  • the knowledge / dialogue pair converter C2 exemplified in FIG. 2 is an example of the model M2 (or the model M2 and the model M3) exemplified in FIG.
  • the utterance text UT illustrated in FIG. 2 is an example of a text indicating the utterance sentence U illustrated in FIG.
  • the response text RT illustrated in FIG. 2 is an example of a text indicating the response sentence A1 exemplified in FIG.
  • Complementation of the dialogue pair DP is performed by, for example, using a persona to complement the dialogue pair DP in which specific and unnecessary information equivalent to the personal information is removed without touching the personal information.
  • the complement of the dialogue pair DP is done by complementing the dialogue on general knowledge not stated in the contract.
  • The question sentence and the response sentence of a dialogue pair DP each contain a term extracted from the semi-structured document D1, and these two terms are connected in the knowledge graph. As illustrated in FIG. 3, assume that "$ (term)" and "$ (explanatory text)" are connected in the knowledge base KB. In this case, a set of the question sentence (SRC) "What is $ (term)?" and the response sentence (TGT) "It is $ (explanatory text)." is generated as a dialogue pair DP. A more specific explanation is given below.
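The template-based generation of dialogue pairs can be sketched as follows. This is a minimal sketch under stated assumptions: the knowledge graph is reduced to a list of connected (term, explanatory text) pairs, the names `DialoguePair` and `generate_dialogue_pairs` are hypothetical, and the sample edge is invented for illustration. Only the two template strings are taken from the text.

```python
from typing import NamedTuple


class DialoguePair(NamedTuple):
    question: str  # SRC: question sentence
    response: str  # TGT: response sentence


# Templates taken from the "What is $ (term)?" example in the text.
SRC_TEMPLATE = "What is {term}?"
TGT_TEMPLATE = "It is {explanation}."


def generate_dialogue_pairs(edges):
    """Turn connected (term, explanation) pairs into dialogue pairs.

    edges: iterable of (term, explanation) tuples, a simplified stand-in
    for two connected nodes of the knowledge graph.
    """
    return [
        DialoguePair(
            SRC_TEMPLATE.format(term=term),
            TGT_TEMPLATE.format(explanation=explanation),
        )
        for term, explanation in edges
    ]


# Invented sample edge, not from the source document.
pairs = generate_dialogue_pairs(
    [("a grace period", "the period during which payment may be delayed")]
)
print(pairs[0].question)
print(pairs[0].response)
```

Each knowledge classification would presumably have its own templates; a single template pair is used here to keep the sketch short.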
  • SRC question sentence
  • TGT response sentence
  • the example of user utterance (Q sentence) illustrated in FIG. 1 is "What is the ceremony of the Enthronism Reishoden?".
  • the response example (Sentence A) is "The coronation ceremony is a ceremony in which the coronation emperor, who is the center of the coronation ceremony, declares the coronation inside and outside Japan.”
  • Here, the "ceremony of the Enthronism Reishoden" and the "ceremony in which the coronation emperor, who is the center of the coronation ceremony, declares the coronation inside and outside Japan" are an example of two terms connected in the knowledge graph.
  • A dialogue pair DP is generated in the same manner.
  • The case where the knowledge base KB specialized for dialogue is not generated is the case where the data set DS1', such as the agreement, is converted into the knowledge base KB' by the written word / knowledge converter C1'.
  • The user utterance (utterance text UT') is converted into the search query SQ' by the utterance / query converter C3', and the knowledge base KB' is searched with the search query SQ'.
  • The knowledge (search result SR') obtained by the search is converted into a response sentence by the result / response converter C4', and the response sentence (response text RT') is output.
  • the search speed decreases as the scale (data size) of the knowledge base increases.
  • the slowdown in search speed causes a delay in the agent response, which requires immediacy.
  • When the knowledge base is large, the design cost of the search rules is high. In other words, a designer who is familiar with both the knowledge and the dialogue needs to implement or update the knowledge base search rules, which increases the human resources required to convert a huge amount of knowledge into the knowledge base.
  • FIG. 5 is a block diagram showing an example of the functional configuration of the response generation device 10 of the information processing system 1 according to the embodiment.
  • FIG. 6 is a block diagram showing an example of the functional configuration of the dialogue repository generation device 20 of the information processing system 1 according to the embodiment.
  • the response generation device 10 and the dialogue repository generation device 20 are examples of the information processing devices according to the present disclosure, respectively.
  • the information processing system 1 includes a response generation device 10, a dialogue repository generation device 20, and an external server 30.
  • the response generation device 10, the dialogue repository generation device 20, and the external server 30 are communicably connected via the network N.
  • the process of generating a response to a question from the user is executed by the response generation device 10 shown in FIG. 5 and the dialogue repository generation device 20 shown in FIG. .
  • the process of generating the dialogue repository from the semi-structured document is executed by the dialogue repository generation device 20 shown in FIG.
  • the response generation device 10 is an information processing terminal (information processing device) that executes the response generation process according to the present disclosure.
  • the voice recognition and voice response processing performed by the response generation device 10 may be referred to as an agent function. Further, the response generation device 10 may be referred to as an agent device.
  • The response generation device 10 may be realized by various smart devices having an information processing function, or may be realized as a dedicated device. That is, the response generation device 10 is not limited to a general-purpose computer and may also be a smartphone, a tablet terminal, a smart speaker, a wearable device such as a watch-type or eyeglass-type terminal, a smart home appliance such as a television, an air conditioner, or a refrigerator, a smart vehicle such as an automobile, a drone, a home-use robot, or the like.
  • the response generation device 10 as various devices having an information processing function functions as the response generation device 10 according to the present disclosure by executing a program (application) for realizing the response generation processing.
  • the response generation device 10 includes a sensor 11, an input unit 12, a communication unit 13, a storage unit 14, an acquisition unit 15, a response generation unit 16, and an output unit 17.
  • the sensor 11 detects various information.
  • the sensor 11 includes a microphone that collects voice spoken by the user, a camera that acquires the user's behavior as an image, and a vital sensor that detects vital signs such as the user's body temperature, respiration, and heart rate.
  • the input unit 12 is a device for receiving various operations from the user.
  • the input unit 12 is realized by a keyboard, a mouse, a touch panel, or the like.
  • the communication unit 13 is connected to the network N by wire or wirelessly, and transmits / receives information to / from the dialogue repository generation device 20 and the like via the network N.
  • the storage unit 14 is realized by, for example, a semiconductor memory element such as a RAM (Random Access Memory) or a flash memory (Flash Memory), or a storage device such as a hard disk or an optical disk.
  • the storage unit 14 has, for example, a user information table 141, model information 142, and a control program 143.
  • the user information table 141 stores information about a user who uses the response generation device 10.
  • the model information 142 stores the model (function) used for the response generation process and the parameters.
  • the control program 143 stores a program (application) for realizing the response generation process.
  • the acquisition unit 15 is a processing unit that acquires various types of information. In addition, the acquisition unit 15 may acquire the context information when the input information is input together with the input information.
  • the acquisition unit 15 includes a detection unit 151, a registration unit 152, and a reception unit 153.
  • the detection unit 151 detects various information via the sensor 11. For example, the detection unit 151 detects the voice spoken by the user via a microphone, which is an example of the sensor 11. Further, the detection unit 151 detects the image information used as the context information via the camera which is an example of the sensor 11. Further, the detection unit 151 detects vital signs such as the user's body temperature, respiration, and heart rate via the vital sensor, which is an example of the sensor 11.
  • the registration unit 152 accepts registration from the user via the input unit 12.
  • the registration unit 152 accepts registration of a user profile (attribute information) of a user who uses the response generation device 10 via a touch panel or a keyboard.
  • the attribute information is used, for example, to select a persona.
  • the receiving unit 153 receives various information. For example, the receiving unit 153 receives the response statement selected in the dialogue repository generation process from the dialogue repository generation device 20.
  • the response generation unit 16 executes a process (response generation process) for generating a response to a question from the user.
  • the response generation unit 16 includes a signal processing unit 161, a voice recognition unit 162, a response unit 163, and a transmission unit 164.
  • the signal processing unit 161 performs signal processing related to input information such as a user's utterance voice, an image, and a vital sign from the sensor 11.
  • the voice recognition unit 162 recognizes the signal processed by the signal processing unit 161 as voice.
  • the response unit 163 executes a process of acquiring a question sentence (utterance text, utterance sentence) from the user in the process of generating a response to the question from the user.
  • the response unit 163 converts the user's uttered voice recognized by the voice recognition unit 162 into a sentence vector. Further, the response unit 163 executes a process of outputting the selected response statement. The response unit 163 generates a response to the user based on the response statement from the dialogue repository generation device 20 acquired by the acquisition unit 15. For example, the response unit 163 converts the acquired response sentence into voice data.
  • The transmission unit 164 transmits the sentence vector of the user's uttered voice generated by the response unit 163 to the communication unit 13. The transmission unit 164 also transmits the voice data converted by the response unit 163 to the output unit 17.
  • the output unit 17 is a mechanism for outputting various information.
  • the output unit 17 is, for example, a speaker.
  • the output unit 17 outputs the voice data transmitted from the response generation unit 16 by voice.
  • the output unit 17 may be a display that displays the response text acquired by the acquisition unit 15.
  • the dialogue repository generation device 20 is an information processing device that executes the dialogue repository generation process according to the present disclosure.
  • the dialogue repository generation device 20 may be, for example, a general-purpose computer or a server.
  • the dialogue repository generation device 20 functions as the dialogue repository generation device 20 according to the present disclosure by executing a program (application) for realizing the dialogue repository generation process.
  • the dialogue repository generation device 20 includes a communication unit 21, a storage unit 22, an acquisition unit 23, and a dialogue repository generation unit 24.
  • The communication unit 21 is connected to the network N by wire or wirelessly, and transmits / receives information to / from the response generation device 10 and the external server 30 via the network N.
  • the storage unit 22 is realized by, for example, a semiconductor memory element such as a RAM (Random Access Memory) or a flash memory (Flash Memory), or a storage device such as a hard disk or an optical disk.
  • The storage unit 22 has, for example, a user information table 221, model information 222, a dialogue repository 223, and a control program 224.
  • the user information table 221 is information acquired from the storage unit 14 of the response generation device 10, and stores information about the user who uses the response generation device 10.
  • the model information 222 stores the model (function) used for the dialogue repository generation process and the parameters.
  • the model information 222 stores the pattern used in the process of extracting the knowledge structure of the semi-structured document and generating the knowledge base.
  • Model information 222 stores a template used in the process of generating dialogue pairs from the knowledge base.
  • the dialogue repository 223 stores the generated dialogue pairs.
  • The control program 224 stores a program (application) for realizing the dialogue repository generation process.
  • the storage unit 22 may store a semi-structured document.
  • the acquisition unit 23 is a processing unit that acquires various types of information.
  • the acquisition unit 23 includes a reception unit 231.
  • the receiving unit 231 acquires text information such as a semi-structured document to be converted into a knowledge object and general knowledge from the external server 30. Further, the receiving unit 231 acquires information about the user who uses the response generating device 10, the user's utterance text (question sentence, Q sentence), context information, and the like from the response generating device 10.
  • the dialogue repository generation unit 24 executes a process of generating a dialogue repository from a semi-structured document (dialogue repository generation process).
  • the dialogue repository generation unit 24 includes a knowledge object generation unit 241, a dialogue pair generation unit 242, an inference unit 243, and a transmission unit 244.
  • the knowledge object generation unit 241 executes a process of generating a knowledge base.
  • the dialogue pair generation unit 242 executes a process of generating a dialogue repository including a dialogue pair group from the generated knowledge base.
  • the inference unit 243 vectorizes at least the question sentence of the dialogue pair.
  • In the process of generating a response to the question from the user, the inference unit 243 executes a process of selecting, from the dialogue repository 223, a dialogue pair to be output to the user as a response sentence, and a process of outputting the selected response sentence.
  • the transmission unit 244 transmits, for example, the response text selected in the dialogue repository generation process to the response generation device 10 via the communication unit 21.
  • the external server 30 is a device including storage such as a non-volatile memory for storing a semi-structured document (data set) to be converted into a knowledge object.
  • the external server 30 or the storage such as the non-volatile memory may be configured to be directly connected to the dialogue repository generation device 20 without going through the network N.
  • the external server 30 may store the dialogue repository 223 generated by the dialogue repository generation device 20.
  • the response generation device 10, the dialogue repository generation device 20, and the external server 30 may be realized as one device. Further, the response generation device 10, the dialogue repository generation device 20, and the external server 30 may be realized by a plurality of devices, respectively.
  • FIG. 7 is a diagram for explaining a typical logical structure of a semi-structured document used for pattern matching when generating a knowledge graph from a semi-structured document.
  • FIG. 7 illustrates the case of contracts, but other semi-structured documents have a similar logical structure.
  • the semi-structured document has a logical structure such as term definition ST1, link definition ST2, conditional statement ST3, exception reason ST4, and range ST5.
  • the term definition ST1 is a logical structure indicating a term such as "A is B.”
  • the link definition ST2 is a logical structure showing a correlation (link) such as "B as A”.
  • The conditional statement ST3 is a logical structure indicating a condition, such as "A, B.", "A, B, C.", or "A, B. In this case, C.".
  • the exception reason ST4 is a logical structure indicating an exception reason, for example, "In the case of A, B. However, in the case of C, D.”.
  • the range ST5 is a logical structure indicating a range, for example, "A, but includes B.” or "A, but excludes B.”.
  • FIG. 8 is a diagram for explaining a typical reference notation of a semi-structured document used for pattern matching when generating a knowledge graph from a semi-structured document.
  • FIG. 8 illustrates the case of a conditional statement, but other logical structures also have reference notations.
  • the logical structure of the conditional statement ST3 has reference notations such as parenthesized notation E1, compound sentence notation E2, simple sentence notation E3, and nested notation E4.
  • the parenthesized notation E1 is a reference notation expressed using parentheses, for example, "B (limited to the case of A).”
  • the compound sentence notation E2 is a reference notation expressed using a compound sentence, for example, "A. In this case, B.”.
  • the simple sentence notation E3 is a reference notation expressed by using a simple sentence such as "In the case of A, B.”
  • Nested notation E4 is a reference notation expressed using nesting, for example, "1) In the following cases, B.a) A.b) A'.”.
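The idea that one logical structure (the conditional statement ST3) has several reference notations can be sketched by normalizing three of them into the same (condition, consequence) form. The regular expressions below are assumptions based on the English renderings quoted above; the patterns actually applied to the original documents are not given in the text. The nested notation E4 is left out to keep the sketch short.

```python
import re

# Hypothetical English patterns for three reference notations of ST3.
CONDITIONAL_NOTATIONS = [
    # E1: "B (limited to the case of A)."
    ("E1", re.compile(
        r"^(?P<then>.+?)\s*\(limited to the case of (?P<cond>.+?)\)\.?$")),
    # E2: "A. In this case, B."
    ("E2", re.compile(
        r"^(?P<cond>.+?)\.\s+In this case,\s*(?P<then>.+?)\.?$")),
    # E3: "In the case of A, B."
    ("E3", re.compile(
        r"^In the case of (?P<cond>.+?),\s*(?P<then>.+?)\.?$")),
]


def normalize_conditional(sentence):
    """Return (notation, condition, consequence), or None if no match."""
    text = sentence.strip()
    for name, pattern in CONDITIONAL_NOTATIONS:
        match = pattern.match(text)
        if match:
            return name, match.group("cond"), match.group("then")
    return None


# Invented sample sentences, one per notation.
for s in (
    "the fee is refunded (limited to the case of cancellation).",
    "The contract is cancelled. In this case, the fee is refunded.",
    "In the case of cancellation, the fee is refunded.",
):
    print(normalize_conditional(s))
```

Whatever the surface notation, the same pair of condition and consequence can then be stored as one kind of connection in the knowledge graph.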
  • FIG. 9 is a diagram for explaining a specific example of making a knowledge object of semi-structured knowledge.
  • FIG. 9 relates to the terms and conditions as a semi-structured document (semi-structured knowledge).
  • FIGS. 10 to 16 are diagrams for explaining specific examples of knowledge objectization for each knowledge classification of FIG. 9, respectively.
  • the classifications, patterns, templates, etc. shown in FIGS. 9 to 16 are examples, and can be changed, added, and deleted as appropriate.
  • the knowledge classification (large) and knowledge classification (small) illustrated in FIG. 9 are examples of nodes in the knowledge graph, respectively.
  • the knowledge classification (large) includes the items of "term", "summary", "instruction", "payment obligation", "right", "prohibition", "act", "reference", "scope", "due date", "human", "calculation formula", "condition", "applicable" and "other".
  • the knowledge classification (small) of "term” includes the items of “another name”, “definition” and “replacement”.
  • the knowledge classification (small) of “Summary” includes the items of “Summary of Terms", “Summary of Articles” and “Summary of Items”.
  • the knowledge classification (small) of "instruction” includes the items of "payment instruction”, “submission instruction”, “change instruction”, “notification instruction”, “designation instruction” and “billing instruction”.
  • the knowledge classification (small) of "payment obligation” includes the items of "payment", “no payment” and "possible payment”.
  • the knowledge classification (small) of "right" includes the items of "claim right", "selection right", "change right", "cancellation right", "cancellation right", "cancellation right", "prepayment right", "repayment right", "resurrection right", "lending", "recovery right", "inquiry right", "acceptance permission" and "repayment right".
  • the knowledge classification (small) of “prohibition” includes the items of "revival prohibition”, “cancellation prohibition”, “change prohibition”, “countermeasure prohibition”, “use prohibition” and “publication prohibition”.
  • the knowledge classification (small) of "act” is “confirmation”, “delivery”, “discount”, “continuation”, “notification”, “handling”, “processing”, “refund”, “registration”, “setting”.
  • the “reference” knowledge classification (small) includes the “as follows” item.
  • the knowledge classification (small) of “range” includes the items of "include” and “limit”.
  • the knowledge classification (small) of "due date" includes the items of "expiration date", "at the time of liability occurrence", "grace period", "registration period", "renewal date", "base date" and "effective date".
  • the "human” knowledge classification (small) includes the “recipient”, "contractor” and “agent” items.
  • the knowledge classification (small) of "calculation formula" includes the items of "metal definition", "calculation method", "minimum amount", "deduction", "transfer", "allocation", "ratio", "rounding down" and "reduction".
  • the “condition” knowledge classification (small) includes the “amount”, “payment” and “notification” items.
  • the “applicable” knowledge classification (small) includes the “applicable”, “mutatis mutandis” and “not applicable” items.
  • the “Other” knowledge classification (small) includes items such as “responsibility", "jurisdiction court”, “invalidity”, “extinction”, “no dividend” and “no refund”.
  • FIG. 10 exemplifies knowledge objectization and dialogue pairing related to the "term” node.
  • FIG. 10 shows an example of a pattern for extracting the knowledge classification “term” from a semi-structured document.
  • the pattern of "another name” is, for example, "$ (term)” (so-called “$ (another name)”).
  • the pattern of "definition” is, for example, "$ (definition) (hereinafter referred to as” $ (term) ")", “$ (term) (referred to as $ (definition). The same shall apply hereinafter)" and "$ (Term) is $ (definition).”
  • the pattern of "replacement” is, for example, "$ (term)” is read as “$ (replacement)”.
  • a semi-structured document is a document described using a certain logical structure and reference notation.
  • each knowledge classification corresponds to each node (node) of the knowledge graph.
  • the pattern (model information 222) according to the logical structure and the reference notation is defined for each knowledge classification (node).
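  • The pattern-based extraction described above can be sketched as follows. This is only a minimal illustration, assuming that the "definition" pattern "$ (term) is $ (definition)." is rendered as an English regular expression; the function name `extract_definition`, the regular expression, and the sample sentence are hypothetical and are not taken from the disclosure.

```python
import re

# Hypothetical English rendering of the "definition" pattern
# "$(term) is $(definition)." as a regular expression.
DEFINITION_PATTERN = re.compile(r'^"(?P<term>[^"]+)" is (?P<definition>.+)\.$')


def extract_definition(sentence):
    """Return a (term, relation, value) triple if the sentence matches, else None."""
    match = DEFINITION_PATTERN.match(sentence)
    if match is None:
        return None
    # The triple corresponds to two graph nodes joined by a "definition" edge.
    return (match.group("term"), "definition", match.group("definition"))


# Made-up sample sentence, used only to exercise the pattern.
triple = extract_definition('"Paid premium" is the total of the premiums paid so far.')
print(triple)
```

  • Each extracted triple can then be stored as two connected nodes of the knowledge graph; a sentence that matches no pattern simply yields no knowledge object in this sketch.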
  • FIG. 9 illustrates an example of a “QA type” which is a type of dialogue pair generated from the value (term) of each knowledge classification, and an example of a dialogue pair generated for each knowledge classification.
  • the underlined portion indicates a part of the template (model information 222) used for generating the dialogue pair.
  • the dialogue pair generation unit 242 generates a dialogue pair by, for example, inserting a "term" and an "another name" connected in the knowledge graph into the template (model information 222). Specifically, the dialogue pair generation unit 242 inserts "cervical syndrome" into the template and generates the Q sentence "What is cervical syndrome?". Similarly, the dialogue pair generation unit 242 inserts "so-called 'whiplash'" into the template and generates the A sentence "It is so-called 'whiplash'.".
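  • The template insertion described above can be sketched as follows. This is a minimal sketch under stated assumptions: the dictionary-based graph, the template strings, and the function name `generate_dialogue_pairs` are hypothetical, and the exact wording of the A sentence is an assumed English rendering.

```python
# Knowledge graph edges stored as (term, relation) -> connected term.
knowledge_graph = {("cervical syndrome", "another name"): "whiplash"}

# Hypothetical Q/A templates keyed by knowledge classification (small).
TEMPLATES = {
    "another name": ("What is {term}?", 'It is so-called "{value}".'),
}


def generate_dialogue_pairs(graph, templates):
    """Insert connected terms into the templates to build (Q, A) pairs."""
    pairs = []
    for (term, relation), value in graph.items():
        if relation not in templates:
            continue  # no template defined for this classification
        q_template, a_template = templates[relation]
        pairs.append((q_template.format(term=term), a_template.format(value=value)))
    return pairs


pairs = generate_dialogue_pairs(knowledge_graph, TEMPLATES)
for question, answer in pairs:
    print(question, "->", answer)
```

  • In this sketch the first term appears in the Q sentence and the connected second term appears in the A sentence, which mirrors the connection relationship in the knowledge graph.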
  • FIG. 11 exemplifies knowledge objectization and dialogue pairing related to the "overview” node.
  • FIG. 11 shows an example of a target for extracting the knowledge classification “outline” from a semi-structured document.
  • FIG. 12 illustrates knowledge objecting and dialogue pairing for the "instruction” node and the “conditional instruction” node.
  • FIG. 13 illustrates knowledge objecting and dialogue pairing for the “Payment Obligations” node and the “Conditional Payment Obligations” node.
  • FIG. 14 illustrates knowledge objecting and dialogue pairing for the "rights" node and the “conditional rights” node.
  • FIG. 15 illustrates knowledge objecting and dialogue pairing for "prohibited” and “conditionally prohibited” nodes.
  • the knowledge object generation unit 241 can generate a knowledge graph by performing pattern matching according to the pattern of the object to be extracted, in the same manner as described above. For example, when a conditional clause is extracted as shown in FIGS. 12 to 15, a knowledge classification (node) of "conditional XX" is generated. For example, the "conditional rights" node illustrated in FIG. 12 is connected to the description (term) of the conditional clause and to the "rights" node related to the description of the main clause. Further, the dialogue pair generation unit 242 can generate a dialogue pair by inserting the terms connected in the knowledge graph into the template (model information 222), in the same manner as described above.
  • FIG. 17 is a diagram showing an example of how to deal with irregular cases in the generation of dialogue pairs from the knowledge base.
  • the expression may differ between the contract and the FAQ attached to the contract.
  • the illustrated example is a case where the terms and conditions say "daily hospitalization benefit amount" but the FAQ says "guarantee amount", and where the terms and conditions say "when making a request for this article" but the FAQ says "when canceling the contract".
  • the terms extracted from the contract can be paired in dialogue using a template. Therefore, the dialogue pair generation unit 242 additionally generates a question sentence using the FAQ term (guaranteed amount, cancellation) as a question sentence of a wording variation.
  • the question sentence using the FAQ term can be generated as a dialogue pair for the same response sentence as the question sentence using the term extracted from the agreement, for example.
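  • The generation of wording variations described above can be sketched as follows. This is a minimal sketch, assuming a simple lookup table from contract terms to FAQ wordings; the names `FAQ_WORDINGS` and `add_wording_variations`, the question template, and the placeholder response sentence are hypothetical.

```python
# Hypothetical mapping from a contract term to FAQ wordings of the same concept.
FAQ_WORDINGS = {"daily hospitalization benefit amount": ["guarantee amount"]}


def add_wording_variations(entries, faq_wordings):
    """Expand (term, question_template, answer) entries into dialogue pairs."""
    pairs = []
    for term, question_template, answer in entries:
        for wording in [term] + faq_wordings.get(term, []):
            # Every wording variation shares the same response sentence.
            pairs.append((question_template.format(term=wording), answer))
    return pairs


entries = [(
    "daily hospitalization benefit amount",
    "What is the {term}?",
    "<response sentence generated from the contract>",  # placeholder, not real content
)]
variation_pairs = add_wording_variations(entries, FAQ_WORDINGS)
for question, answer in variation_pairs:
    print(question, "->", answer)
```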
  • as shown in FIG. 17, there may be cases where the terms and conditions describe cases where payment is made, but do not describe cases where payment is not made.
  • the dialogue pair generation unit 242 additionally generates a question sentence regarding a case other than the case described in the contract as a question sentence of a negative case.
  • Negative case responses can be generated from the connection relationships in the knowledge graph, as in the case of using templates.
  • the dialogue pair generation unit 242 can generate a QA dialogue sentence that automatically makes concrete a part that is not explicitly described in the semi-structured document (or knowledge base).
  • FIG. 18 is a diagram for explaining a specific example of automatically complementing the contents not explicitly described in the contract by defining a persona.
  • FIG. 18 shows an example of the correspondence between the user's utterance (question sentence), the response sentence automatically generated from the contract knowledge as the response sentence corresponding to the question sentence, and the response sentence automatically generated by defining the persona according to the contract example.
  • FIG. 19 is a diagram for explaining a specific example in which technical terms described in the contract are automatically complemented by replacing them with specific example sentences using numerical values calculated from persona parameters.
  • the persona includes, as example items, "insured person", "gender", "1 hospitalization payment limit", "hospitalization benefit daily amount", "insurance period" and "insurance premium payment period".
  • the underlined portion is a dialogue QA sentence automatically generated as described above from the knowledge of the contract.
  • the double underlined part is a sentence embodied by using a persona, and is a complementary dialogue QA sentence.
  • the persona parameters are stored in, for example, the user information table 221.
  • in response to the user's utterance (question) "Who will be paid?", the dialogue pair generation unit 242 automatically generates the response sentence "It is paid to the policyholder." from the contract knowledge.
  • the dialogue pair generation unit 242 automatically generates a response sentence with the expression "For example, if the customer is the policyholder, the customer will be paid" according to the persona of the policyholder.
  • for the term "paid premium", the dialogue pair generation unit 242 generates a concrete response sentence based on the defined persona parameters and the method of calculating the paid premium (calculation formula) described in the contract, and automatically adds the concrete QA dialogue sentence. That is, the dialogue pair generation unit 242 calculates the amount of premium paid according to the defined persona parameters.
  • the dialogue pair generation unit 242 automatically adds a specific response sentence such as "For example, if the customer pays a monthly premium of 4,610 yen for 5 years, the total premium paid will be ... 280,000 yen. This 280,000 yen is the paid insurance premium."
  • paid insurance premium and the calculation method are connected in the knowledge graph.
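  • The calculation from persona parameters can be sketched as follows. This is a minimal sketch, assuming that the calculation method is "monthly premium x 12 x number of payment years"; the class `Persona`, its field names, and the response wording produced by `concrete_response` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Persona:
    # Hypothetical persona parameters; values come from the contract example.
    monthly_premium_yen: int
    payment_years: int


def paid_premium(persona, months_per_year=12):
    """Assumed calculation method: monthly premium x months per year x years."""
    return persona.monthly_premium_yen * months_per_year * persona.payment_years


def concrete_response(persona):
    """Fill the calculated amount into a concrete example sentence."""
    total = paid_premium(persona)
    return (
        f"For example, if the customer pays a monthly premium of "
        f"{persona.monthly_premium_yen:,} yen for {persona.payment_years} years, "
        f"the total premium paid will be {total:,} yen."
    )


persona = Persona(monthly_premium_yen=4610, payment_years=5)
print(concrete_response(persona))
```

  • With a monthly premium of 4,610 yen paid for 5 years, this assumed formula gives exactly 276,600 yen, which is consistent with the approximate figure of 280,000 yen in the example response if that figure is rounded.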
  • the persona parameters shown in FIG. 19 are examples, and are not limited to these.
  • the defined persona may be two or more.
  • the defined persona can prepare a variation of typical parameter differences for the items of the contract example presented in the pamphlet corresponding to the contract, for example.
  • the variation of the parameter difference can be determined by plotting the profiles of all the contractors and extracting representative parameters by clustering for the plot.
  • the dialogue pair generation unit 242 can generate an easy-to-understand response sentence that is closer to the way a human would explain it, compared with the case where the clause knowledge is directly converted into a dialogue QA sentence by using the template, and can add it to the dialogue repository.
  • FIGS. 20 and 21 are diagrams for explaining a specific example of automatically supplementing the contents not explicitly described in the contract by using other knowledge of the semi-structured document.
  • FIG. 20 shows an example of correspondence between a user's utterance (question sentence), a response sentence automatically generated from the agreement knowledge as a response sentence corresponding to the question sentence, and a response sentence automatically generated using general knowledge.
  • the underlined portion is a dialogue QA sentence automatically generated as described above from the knowledge of the contract.
  • the double underlined part is a sentence embodied by using general knowledge, and is a complementary dialogue QA sentence.
  • the knowledge object generation unit 241 graphs terms not described in the terms and conditions by using other general knowledge in semi-structured documents, such as an ontology or Wikipedia.
  • in the illustrated example, it is obtained that bone marrow transplantation is "a treatment in which a donor's normal bone marrow cells are intravenously injected into a patient with an intractable blood disease such as leukemia or aplastic anemia and transplanted into the bone marrow". In this way, QA dialogue sentences (dialogue pairs) embodied using general knowledge can be automatically added.
  • knowledge described on a website on the Internet via network N can also be used.
  • FIG. 22 is a diagram for explaining a specific example of automatically supplementing the contents not explicitly described in the contract by using the updated content of the contract.
  • the underlined portion indicates a portion that has been changed due to the update of the contract.
  • the double underlined part is a sentence embodied by using the updated contents of the contract, and is a supplementary dialogue QA sentence.
  • for a contract that can be renewed and carried over by the policyholder, the dialogue pair generation unit 242 turns the difference between the contract before and after the update into a knowledge base.
  • the dialogue pair generation unit 242 generates a dialogue pair for each of the terms and conditions before and after the update, automatically generates, for the old version before the update, a dialogue pair indicating that there is a new version after the update, and adds it to the dialogue repository.
  • the illustrated example is one in which the definition of the term "paid premium (equivalent amount)", which is "monthly premium equivalent amount x 12 x contract" in the contract at the time of contracting, is updated to "monthly premium equivalent amount x 11 x contract" in the latest contract.
  • the dialogue pair generation unit 242 automatically generates a response sentence that complements the updated content of the contract, such as "For example, 10,000 yen is the paid insurance premium. In the latest contract updated in April 2019, it was changed from 12 months to 11 months. ...". In addition, the contents may be updated only for a part of the contract. In such a case, the dialogue pair generation unit 242 generates dialogue pairs for the contents of both versions before and after the update. At this time, the dialogue pair generation unit 242 inserts a phrase such as "in the new clause" as illustrated. In this way, the dialogue pair can be complemented by using the updated contents of the contract.
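  • The handling of contract versions can be sketched as follows. This is a minimal sketch, assuming that each dialogue pair is tagged with the version it was generated from; the function name `versioned_dialogue_pairs`, the dictionary layout, and the question and answer wording are hypothetical.

```python
def versioned_dialogue_pairs(question, old_answer, new_answer, update_note):
    """Return dialogue pairs for the contract versions before and after an update."""
    return [
        # The old-version answer is complemented with a note about the update.
        {"version": "old", "question": question, "answer": f"{old_answer} {update_note}"},
        # The new-version answer is marked with an introductory phrase.
        {"version": "new", "question": question, "answer": f"In the new clause, {new_answer}"},
    ]


# Hypothetical wording, loosely following the 12-month to 11-month example.
pairs_by_version = versioned_dialogue_pairs(
    question="How is the paid premium calculated?",
    old_answer="It is calculated using 12 months per year.",
    new_answer="it is calculated using 11 months per year.",
    update_note="In the latest contract, this was changed from 12 months to 11 months.",
)
for pair in pairs_by_version:
    print(pair["version"], ":", pair["answer"])
```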
  • FIGS. 23 and 24 are diagrams for explaining the selection of the dialogue pair using the context information, respectively.
  • FIG. 23 illustrates a case where an insurance policy or other contract, a legal document such as a law compendium, a medical document such as a home medical guide, management rules or other rules, or a grammar book or other textbook is used as appropriate as the semi-structured document D1.
  • in addition to the user's utterance sentence U, the response unit 163 of the response generation device 10 may further convert context information UC1 such as a user profile (attribute information) and context information UC2 such as vital signs and images obtained by the sensor 11 into a sentence vector.
  • the sentence vector output from the response generation device 10 to the dialogue repository generation device 20 is a document vector in which the user's utterance sentence U and the context information UC1 and UC2 are combined, as shown in FIG.
  • as the image, in the case of medical information, an image of a related document such as a medical certificate or a notebook, a symptom image obtained by photographing a symptom to be investigated, or the like can be used as appropriate.
  • the inference unit 243 can narrow down the dialogue pair DP to be output as a response sentence to the user according to the similarity of the sentence vectors from the response unit 163 of the response generation device 10. Specifically, the inference unit 243 preferentially selects the dialogue pair DP regarding the contract according to the user profile indicating the policyholder of the insurance.
  • the inference unit 243 preferentially selects the dialogue pair DPB generated by using the persona B defined according to the user profile among the plurality of personas A, B, C, ..., X.
  • the dialogue pair DPB can be selected by the sentence vector in which the user's utterance sentence U is supplemented by the context information UC1 and UC2. Therefore, the utterance burden of the user can be reduced.
  • FIG. 24 illustrates a case where a medical document such as home medicine is used as a data set DS1 of a semi-structured document.
  • the image / disease name converter C5 of the response unit 163 converts the context information UC21 indicating the symptom image I10 into a disease name based on the medical information data set DS3, and generates the image text UCT1 "water warts".
  • the vital sensor / symptom converter C6 of the response unit 163 converts the body temperature "36.8 degrees" of the context information UC22 indicating the output of the vital sensor into a symptom based on the vital sign data set DS4, and generates the sensor text UCT2 "normal temperature".
  • the inference unit 243 selects a response sentence (response text RT) to the user based on the sentence vector of the utterance text UT combined with the context information UC21 and UC22, which makes it possible to realize a mechanism resembling a pseudo diagnosis.
  • the technique according to the embodiment can be applied even when the data set is medical information, and a system can be realized that automatically and simply diagnoses the corresponding symptom from medical knowledge by taking the user's utterance, an image, and vital sensor information as input.
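  • The conversion of context information into text and its combination with the utterance can be sketched as follows. This is a minimal sketch under stated assumptions: the 37.5 degree threshold, the labels "normal temperature" and "fever", the function names, and the sample utterance are hypothetical, and the image text is passed in as a plain string standing in for the output of the image / disease name converter.

```python
def vital_to_text(body_temperature_c, fever_threshold_c=37.5):
    """Stand-in for the vital sensor / symptom converter.

    The threshold is an illustrative assumption, not a value from the disclosure.
    """
    return "normal temperature" if body_temperature_c < fever_threshold_c else "fever"


def build_utterance_text(utterance, image_text=None, sensor_text=None):
    """Combine the user's utterance with the text converted from context information."""
    parts = [utterance, image_text, sensor_text]
    return " ".join(part for part in parts if part)


sensor_text = vital_to_text(36.8)
utterance_text = build_utterance_text(
    "What should I do about this?",  # hypothetical user utterance
    image_text="water warts",         # assumed output of the image converter
    sensor_text=sensor_text,
)
print(utterance_text)
```

  • The combined text would then be passed to the sentence / vector converter, so that the context narrows down the dialogue pair without the user having to state it explicitly.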
  • FIG. 25 is a flowchart showing an example of the dialogue repository generation process as information processing according to the embodiment.
  • the knowledge object generation unit 241 opens an arbitrary semi-structured document (S101).
  • the knowledge object generation unit 241 inputs each sentence of the document into the written language / knowledge converter and generates a knowledge base (S102).
  • the written word / knowledge converter is an example of a model stored in the model information 222 as described above with reference to FIGS. 1 to 3, 6 and the like.
  • the written word / knowledge converter is, for example, a DNN such as Seq2seq whose parameters have been learned, using data extracted from the semi-structured document by pattern matching, so as to output a knowledge object in response to the input of a sentence of the semi-structured document.
  • a written word / knowledge converter is a function designed to output a knowledge object in response to input of a sentence in a semi-structured document by pattern matching.
  • the dialogue pair generation unit 242 searches the knowledge base for the corresponding knowledge, inputs it to the knowledge / dialogue converter, and generates pairs of a Q sentence and an A sentence, that is, dialogue pairs (S104).
  • the knowledge / dialogue converter is an example of a model stored in the model information 222 as described above with reference to FIGS. 1 to 3, 6 and the like.
  • the knowledge / dialogue converter can sequentially generate dialogue pairs for each term by inserting the knowledge retrieved from the knowledge base into the template.
  • the dialogue pair generation unit 242 extracts, from the generated QA sentences, that is, the dialogue pairs, a term to which an example sentence outside the contract knowledge can be added (S105).
  • when a term to which an example sentence outside the contract knowledge can be added is extracted in the process of S105, the dialogue pair generation unit 242 adds a QA sentence that extends the knowledge related to the term, that is, complements the dialogue pair (S106).
  • the flow of FIG. 25 repeats the processes of S104 to S106 until all the knowledge bases are converted into QA sentences (S107: No), and ends when all the knowledge bases are converted into QA sentences (S107: Yes).
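  • The overall flow of FIG. 25 can be sketched as follows. This is a simplified sketch, not the disclosed implementation: the three converter functions are toy stand-ins, the complement step looks up the term of the knowledge object directly rather than extracting it from the generated QA sentence, and all sample strings are placeholders.

```python
def to_knowledge(sentence):
    """Toy stand-in for the written word / knowledge converter (S102)."""
    if " is " not in sentence:
        return None  # no pattern matched, so no knowledge object
    term, value = sentence.rstrip(".").split(" is ", 1)
    return (term, value)


def to_dialogue_pair(knowledge):
    """Toy stand-in for the knowledge / dialogue converter (S104)."""
    term, value = knowledge
    return (f"What is {term}?", f"It is {value}.")


def complement(knowledge, extra_examples):
    """Add a QA sentence when an example outside the document exists (S105, S106)."""
    term, _ = knowledge
    if term not in extra_examples:
        return []
    return [(f"Can you give an example of {term}?", f"For example, {extra_examples[term]}.")]


def generate_dialogue_repository(sentences, extra_examples):
    knowledge_base = [k for k in map(to_knowledge, sentences) if k is not None]
    repository = []
    for knowledge in knowledge_base:  # repeat until all knowledge is converted (S107)
        repository.append(to_dialogue_pair(knowledge))
        repository.extend(complement(knowledge, extra_examples))
    return repository


repository = generate_dialogue_repository(
    ["term A is value A.", "This sentence matches no pattern"],
    {"term A": "example A"},
)
for question, answer in repository:
    print(question, "->", answer)
```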
  • FIG. 26 is a diagram for explaining the generation of a sentence vector.
  • the dialogue pair generation unit 242 converts a Q sentence into a sentence vector using a sentence / vector converter (models M3 and M4).
  • as the sentence / vector converter, for example, BERT (Bidirectional Encoder Representations from Transformers), which is a trained large-scale model (machine learning model), can be used.
  • the text / vector converter is an example of a model stored in the model information 222.
  • the means of sentence vectorization is not limited to this, and other machine learning models such as Word2vec and ELMo may be used.
  • FIG. 27 shows an example of data of a QA sentence (dialogue pair) generated from a semi-structured document (knowledge) by the dialogue repository generation process described with reference to FIG. 25.
  • the QA sentence (dialogue pair) can be automatically generated from the semi-structured document (knowledge).
  • FIG. 28 is a flowchart showing an example of response generation processing as information processing according to the embodiment.
  • the flow of FIG. 28 generates a response to a user's utterance using the data of a QA sentence (dialogue pair) generated from a semi-structured document (knowledge) by the dialogue repository generation process as shown in FIG. 27.
  • the response unit 163 inputs the image data obtained by the sensor 11 into the image / text converter and converts it into text data (S201).
  • the image / text converter is an example of a model stored in the model information 142 as described above with reference to FIGS. 23 and 24.
  • the image / sentence converter is a machine learning model in which parameters are determined so as to output text data in response to input of image data.
  • the response unit 163 inputs the sensing data such as the vital sensor obtained by the sensor 11 into the sensing log / text converter and converts it into text data (S202).
  • the sensing log / text converter is an example of a model stored in the model information 142 as described above with reference to FIGS. 23 and 24.
  • the sensing log / sentence converter is a machine learning model in which parameters are determined so as to output text data in response to input of sensing data.
  • the response unit 163 converts other context information such as a user profile into text data (S203). After that, the response unit 163 combines the text data of the arbitrary user utterance and the converted text data as described above with reference to FIGS. 23 and 24, and uses a sentence / vector converter to generate a sentence vector.
  • This sentence / vector converter is an example of a model stored in the model information 142.
  • various machine learning models such as BERT, Word2vec, and ELMo can be appropriately used.
  • when the Q sentence generated from the knowledge base has not been converted into a sentence vector in the processes of S104 and S106 of the dialogue repository generation process described above with reference to FIG. 25, the dialogue pair generation unit 242 may input the Q sentence to the sentence / vector converter to generate a sentence vector for search (S205). At this time, the A sentence may be further converted into a sentence vector.
  • the dialogue pair generation unit 242 searches for and selects the Q sentence having the closest vector distance to the sentence vector corresponding to the user's utterance from the response unit 163 among the Q sentences corresponding to the dialogue repository (S206). After that, the dialogue pair generation unit 242 transmits the A sentence corresponding to the selected Q sentence to the response generation device 10 as a response. The response generation device 10 presents the A sentence from the dialogue pair generation unit 242 to the user. After that, the flow of FIG. 28 ends.
  • an appropriate response can be selected from the QA sentence (dialogue pair) automatically generated from the semi-structured document (knowledge).
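  • The selection of the closest Q sentence can be sketched as follows. This is a minimal sketch in which a bag-of-words count vector and cosine distance stand in for the sentence / vector converter and the vector distance; a real system would use a trained model such as BERT. The function names and the sample repository are hypothetical.

```python
import math
from collections import Counter


def sentence_vector(text):
    """Toy sentence vector: word counts (a stand-in for a trained encoder)."""
    return Counter(text.lower().replace("?", "").split())


def cosine_distance(a, b):
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - dot / norm if norm else 1.0


def select_response(user_text, repository):
    """Return the A sentence whose Q sentence is closest to the user's utterance."""
    user_vector = sentence_vector(user_text)
    _, answer = min(
        repository,
        key=lambda pair: cosine_distance(user_vector, sentence_vector(pair[0])),
    )
    return answer


dialogue_repository = [
    ("What is cervical syndrome?", 'It is so-called "whiplash".'),
    ("Who will be paid?", "It is paid to the policyholder."),
]
print(select_response("Who gets paid?", dialogue_repository))
```

  • Because only the precomputed Q sentence vectors are compared, the knowledge base itself does not need to be searched at response time in this sketch.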
  • each machine learning model is a composite function with parameters in which a plurality of functions are synthesized, and is defined by a combination of a plurality of adjustable functions and parameters.
  • the machine learning model may be any parameterized composition function defined by a combination of multiple adjustable functions and parameters.
  • the machine learning model may be a convolutional neural network (CNN) or a fully connected network. It is assumed that the parameters of the trained machine learning model are stored in, for example, the storage unit 14 and the storage unit 22, respectively.
  • FIG. 29 is a block diagram showing an example of the hardware configuration of each device of the information processing system 1 according to the embodiment.
  • Information devices such as the devices (response generation device 10, dialogue repository generation device 20, and external server 30) of the information processing system 1 according to the above-described embodiment are realized by, for example, a computer 1000 having a configuration as shown in FIG. 29.
  • the computer 1000 has a CPU 1100, a RAM 1200, a ROM (Read Only Memory) 1300, an HDD (Hard Disk Drive) 1400, a communication interface 1500, and an input / output interface 1600. Each part of the computer 1000 is communicably connected by the bus 1050.
  • the CPU 1100 operates based on the program stored in the ROM 1300 or the HDD 1400, and controls each part. For example, the CPU 1100 loads a program stored in the ROM 1300 or the HDD 1400 into the RAM 1200, and executes processing corresponding to various programs.
  • the ROM 1300 stores a boot program such as a BIOS (Basic Input Output System) executed by the CPU 1100 when the computer 1000 is started, a program depending on the hardware of the computer 1000, and the like.
  • the HDD 1400 is a computer-readable recording medium that non-temporarily records a program executed by the CPU 1100 and data used by such a program.
  • the HDD 1400 is a recording medium for recording the control programs 143 and 224 according to the present disclosure, which is an example of the program data 1450.
  • the communication interface 1500 is an interface for the computer 1000 to connect to the network N and the external network 1550 (for example, the Internet).
  • the CPU 1100 receives data from another device or transmits data generated by the CPU 1100 to another device via the communication interface 1500.
  • the input / output interface 1600 is an interface for connecting the input / output device 1650 and the computer 1000.
  • the CPU 1100 receives data from an input device such as a keyboard or mouse via the input / output interface 1600. Further, the CPU 1100 transmits data to an output device such as a display, a speaker, or a printer via the input / output interface 1600. Further, the input / output interface 1600 may function as a media interface for reading a program or the like recorded on a predetermined recording medium (media).
  • the media is, for example, an optical recording medium such as DVD (Digital Versatile Disc) or PD (Phase change rewritable Disk), a magneto-optical recording medium such as MO (Magneto-Optical disk), a tape medium, a magnetic recording medium, or a semiconductor memory.
  • the CPU 1100 of the computer 1000 realizes each function of the response generator 10 by executing the control program 143 loaded on the RAM 1200.
  • the CPU 1100 of the computer 1000 realizes each function of the dialogue repository generation device 20 by executing the control program 224 loaded on the RAM 1200.
  • the HDD 1400 stores the control programs 143 and 224 according to the present disclosure and the data in the storage units 14 and 22.
  • the CPU 1100 reads the program data 1450 from the HDD 1400 and executes the program, but as another example, these programs may be acquired from another device via the external network 1550.
  • the dialogue repository generation device 20 (information processing device) includes a knowledge object generation unit 241 and a dialogue pair generation unit 242.
  • the knowledge object generation unit 241 extracts a plurality of terms from the semi-structured document and generates a knowledge graph showing the relationship between the extracted plurality of terms.
  • the dialogue pair generation unit 242 generates a dialogue repository containing a plurality of dialogue pairs based on the knowledge graph. Each of the plurality of dialogue pairs is a set of a question sentence and a response sentence to the question sentence. The first term included in the question sentence and the second term included in the response sentence are two terms in a connected relationship in the knowledge graph.
  • the dialogue repository generation device 20 (information processing device) can automatically generate a QA sentence (dialogue pair) from the semi-structured document (knowledge). Therefore, even when the scale of the knowledge base is expanded, it is possible to suppress the complexity of the search rule design, and it is possible to appropriately respond to the user's utterance.
  • in the dialogue repository generation device 20 (information processing device), the knowledge object generation unit 241 generates a knowledge graph that further includes terms extracted from other knowledge information of the semi-structured document, and the dialogue pair generation unit 242 complements the dialogue pair with the terms extracted from the other knowledge information of the semi-structured document.
  • the dialogue repository generation device 20 (information processing device) can complement the dialogue pair by using general knowledge not described in the semi-structured document such as the agreement. Therefore, it is possible to generate an appropriate response sentence as a response to the question sentence from the user.
  • the dialogue pair generation unit 242 complements the dialogue pair by using at least one persona.
  • the dialogue repository generation device 20 includes an acquisition unit 23 and an inference unit 243.
  • the acquisition unit 23 acquires the question text from the user.
  • the inference unit 243 selects, according to the acquired question text from the user, a dialogue pair that is a pair of a question sentence including a first term extracted from a semi-structured document and a response sentence including a second term connected to the first term in a knowledge graph showing the relationship between a plurality of terms extracted from the semi-structured document, and uses the response text of the selected dialogue pair as the response text to the question text from the user.
  • the response sentence to the question sentence from the user can be selected from the dialogue repository without searching the knowledge base. Therefore, even when the scale of the knowledge base is expanded, it is possible to suppress the delay in response due to the decrease in the search speed of the knowledge base, so that it is possible to appropriately respond to the user's utterance.
  • the inference unit 243 selects the dialogue pair based on the vector distance of the document vector between the question text of the dialogue pair and the question text from the user.
  • the inference unit 243 selects a dialogue pair based on a document vector generated by combining the question text from the user and the context information corresponding to the question text from the user.
  • the acquisition unit 23 further acquires the context information corresponding to the question text from the user, and the inference unit 243 preferentially selects the dialogue pair corresponding to the acquired context information.
  • the context information includes at least one of an image, vital signs, and a user profile.
  • the information processing method includes extracting a plurality of terms from a semi-structured document and generating a knowledge graph showing the relationship between the extracted terms, and generating a dialogue repository including a plurality of dialogue pairs based on the knowledge graph. Each of the dialogue pairs is a set of a question sentence and a response sentence to the question sentence, and the first term included in the question sentence and the second term included in the response sentence are two terms that are connected in the knowledge graph.
  • the dialogue repository generation device 20 (information processing device) can automatically generate a QA sentence (dialogue pair) from the semi-structured document (knowledge). Therefore, even when the scale of the knowledge base is expanded, it is possible to suppress the complexity of the search rule design, and it is possible to appropriately respond to the user's utterance.
  • The information processing method includes: acquiring a question sentence from the user; selecting, according to the acquired question sentence, a dialogue pair that is a set of a question sentence including a first term extracted from a semi-structured document and a response sentence including a second term connected to the first term in a knowledge graph showing the relationships between a plurality of terms extracted from the semi-structured document; and outputting the response sentence of the selected dialogue pair as the response to the question sentence from the user.
  • The response sentence to a question sentence from the user can be selected from the dialogue repository without searching the knowledge base. Therefore, even when the knowledge base grows, the response delay caused by slower knowledge-base searches can be suppressed, and the user's utterance can be answered appropriately.
  • The present technology can also have the following configurations.
  • (1) An information processing device including: a knowledge object generator that extracts a plurality of terms from a semi-structured document and generates a knowledge graph showing the relationships between the extracted terms; and a dialogue pair generator that generates a dialogue repository containing a plurality of dialogue pairs based on the knowledge graph, wherein each of the plurality of dialogue pairs is a set of a question sentence and a response sentence to the question sentence, and the first term included in the question sentence and the second term included in the response sentence are two terms connected in the knowledge graph.
  • (2) The information processing device according to (1) above, wherein the knowledge object generator generates the knowledge graph so that it further includes terms extracted from other knowledge information of the semi-structured document, and the dialogue pair generator complements the dialogue pairs using the terms extracted from that other knowledge information.
  • (3) The information processing apparatus according to (1) or (2) above, wherein the dialogue pair generation unit complements the dialogue pair by using at least one persona.
  • (4) An information processing device including an inference unit that selects, according to the acquired question sentence from the user, a dialogue pair that is a set with a response sentence including the term, and outputs the response sentence of the selected dialogue pair as the response sentence to the question sentence from the user.
  • (5) The information processing device selects the dialogue pair based on the distance between the document vector of the dialogue pair's question sentence and the document vector of the question sentence from the user.
  • (6) The information processing device according to (4) or (5) above, wherein the inference unit selects the dialogue pair based on a document vector generated by combining the question sentence from the user with the context information corresponding to that question sentence.
  • (7) The information processing device according to any one of (4) to (6) above, wherein the acquisition unit further acquires the context information corresponding to the question sentence from the user, and the inference unit preferentially selects the dialogue pair corresponding to the acquired context information.
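
The flow summarized in the configurations above (knowledge graph, dialogue pairs generated from connected terms, and selection by document vector distance) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the sample triples, the question and response templates, the function names, the bag-of-words document vector, and the cosine distance are simple stand-ins for whatever extraction, generation, and embedding methods an actual implementation would use. Context information is modeled here as plain text appended to the user's question before vectorization.

```python
import math
import re
from collections import Counter

# Hypothetical (first term, relation, second term) triples standing in for
# terms extracted from a semi-structured document.
TRIPLES = [
    ("Tokyo Tower", "height", "333 meters"),
    ("Tokyo Tower", "location", "Minato, Tokyo"),
    ("Mount Fuji", "height", "3776 meters"),
]


def build_knowledge_graph(triples):
    """Map each first term to its (relation, connected second term) edges."""
    graph = {}
    for first_term, relation, second_term in triples:
        graph.setdefault(first_term, []).append((relation, second_term))
    return graph


def generate_dialogue_repository(graph):
    """Create one (question, response) pair per edge of the graph.

    The question contains the first term and the response contains the
    second term, so the two terms of every pair are connected in the graph.
    """
    repository = []
    for first_term, edges in graph.items():
        for relation, second_term in edges:
            question = f"What is the {relation} of {first_term}?"
            response = f"The {relation} of {first_term} is {second_term}."
            repository.append((question, response))
    return repository


def document_vector(text):
    """Bag-of-words counts over lowercase alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_distance(a, b):
    """1 - cosine similarity; 1.0 when either vector is empty."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - dot / norm if norm else 1.0


def select_response(user_question, repository, context=""):
    """Return the response of the pair whose question is nearest to the query.

    The query vector combines the user's question with optional context text.
    The knowledge graph itself is not searched at this stage.
    """
    query = document_vector(f"{user_question} {context}")
    best_pair = min(
        repository,
        key=lambda pair: cosine_distance(query, document_vector(pair[0])),
    )
    return best_pair[1]


if __name__ == "__main__":
    repo = generate_dialogue_repository(build_knowledge_graph(TRIPLES))
    print(select_response("Tell me the height of Mount Fuji", repo))
    print(select_response("What is the height?", repo, context="Mount Fuji"))
```

In this sketch the repository is built once, ahead of time, and answering a question only compares vectors against the stored question sentences, which mirrors the point made above that the knowledge base need not be searched at response time.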

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
PCT/JP2021/017336 2020-05-21 2021-05-06 Information processing device and information processing method Ceased WO2021235225A1 (ja)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2022524370A JP7718413B2 (ja) 2020-05-21 2021-05-06 Information processing device and information processing method
JP2025122968A JP2025137717A (ja) 2020-05-21 2025-07-23 Information processing device and information processing method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2020089205 2020-05-21
JP2020-089205 2020-05-21

Publications (1)

Publication Number Publication Date
WO2021235225A1 (ja) 2021-11-25

Family

ID=78708843

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/017336 Ceased WO2021235225A1 (ja) 2020-05-21 2021-05-06 Information processing device and information processing method

Country Status (2)

Country Link
JP (2) JP7718413B2
WO (1) WO2021235225A1

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024147259A1 (ja) * 2023-01-06 2024-07-11 株式会社日立製作所 Life support system, life support device, and life support method
JP2025102693A (ja) * 2023-12-05 2025-07-08 エルジー マネジメント デベロップメント インスティテュート カンパニー リミテッド Goal-oriented dialogue method and system therefor
JP2025102692A (ja) * 2023-12-05 2025-07-08 エルジー マネジメント デベロップメント インスティテュート カンパニー リミテッド Method for executing tasks based on the context of a goal-oriented dialogue, and system therefor
JP2025127928A (ja) * 2024-02-21 2025-09-02 SB Intuitions株式会社 Dialogue system, program, and control method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
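JP2007011775A (ja) * 2005-06-30 2007-01-18 Nippon Telegr & Teleph Corp <Ntt> Dictionary creation device, dictionary creation method, program, and recording medium
JP2019036210A (ja) * 2017-08-18 2019-03-07 株式会社三井住友銀行 FAQ registration support method using machine learning, and computer system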
US20200042649A1 (en) * 2018-08-02 2020-02-06 International Business Machines Corporation Implicit dialog approach operating a conversational access interface to web content
US20200065389A1 (en) * 2017-10-10 2020-02-27 Tencent Technology (Shenzhen) Company Limited Semantic analysis method and apparatus, and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10922342B2 (en) * 2018-06-11 2021-02-16 Stratifyd, Inc. Schemaless systems and methods for automatically building and utilizing a chatbot knowledge base or the like
CN110083690B (zh) * 2019-04-10 2022-05-03 华侨大学 Method and system for spoken Chinese training for foreign learners based on intelligent question answering

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
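JP2007011775A (ja) * 2005-06-30 2007-01-18 Nippon Telegr & Teleph Corp <Ntt> Dictionary creation device, dictionary creation method, program, and recording medium
JP2019036210A (ja) * 2017-08-18 2019-03-07 株式会社三井住友銀行 FAQ registration support method using machine learning, and computer system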
US20200065389A1 (en) * 2017-10-10 2020-02-27 Tencent Technology (Shenzhen) Company Limited Semantic analysis method and apparatus, and storage medium
US20200042649A1 (en) * 2018-08-02 2020-02-06 International Business Machines Corporation Implicit dialog approach operating a conversational access interface to web content

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024147259A1 (ja) * 2023-01-06 2024-07-11 株式会社日立製作所 Life support system, life support device, and life support method
JP2024097643A (ja) * 2023-01-06 2024-07-19 株式会社日立製作所 Life support system, life support device, and life support method
JP7815156B2 (ja) 2023-01-06 2026-02-17 株式会社日立製作所 Life support system, life support device, and life support method
JP2025102693A (ja) * 2023-12-05 2025-07-08 エルジー マネジメント デベロップメント インスティテュート カンパニー リミテッド Goal-oriented dialogue method and system therefor
JP2025102692A (ja) * 2023-12-05 2025-07-08 エルジー マネジメント デベロップメント インスティテュート カンパニー リミテッド Method for executing tasks based on the context of a goal-oriented dialogue, and system therefor
JP2025127928A (ja) * 2024-02-21 2025-09-02 SB Intuitions株式会社 Dialogue system, program, and control method
JP7827761B2 (ja) 2024-02-21 2026-03-10 SB Intuitions株式会社 Dialogue system, program, and control method

Also Published As

Publication number Publication date
JP2025137717A (ja) 2025-09-19
JP7718413B2 (ja) 2025-08-05
JPWO2021235225A1 2021-11-25

Similar Documents

Publication Publication Date Title
WO2021235225A1 (ja) Information processing device and information processing method
JP7087938B2 (ja) Question generation device, question generation method, and program
US20240387025A1 (en) Low-Latency Conversational Artificial Intelligence (Ai) Architecture With A Parallelized In-Depth Analysis Feedback Loop
US10719668B2 (en) System for machine translation
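JP7315065B2 (ja) Question generation device, question generation method, and program
JP7230576B2 (ja) Generation device, learning device, generation method, and program
CN117574921A (zh) Chinese-language dental intelligent diagnosis and treatment question answering method, electronic device, and storage medium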
Zhao et al. Enhancing aspect-based sentiment analysis with BERT-driven context generation and quality filtering
Singh et al. Deep learning model for interpretability and explainability of aspect-level sentiment analysis based on social media
Marik et al. A hybrid deep feature selection framework for emotion recognition from human speeches
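CN110263167B (zh) Medical entity classification model generation method, apparatus, device, and readable storage medium
JP2020527804A (ja) Mapping of coded medical vocabularies
WO2020240870A1 (ja) Parameter learning device, parameter learning method, and computer-readable recording medium
CN113779190A (zh) Event causality identification method, apparatus, electronic device, and storage medium
CN118410156B (zh) Medical question answering method and apparatus based on a large language model
Tang et al. MECG: modality-enhanced convolutional graph for unbalanced multimodal representations
JP7411149B2 (ja) Learning device, estimation device, learning method, estimation method, and program
CN113723463B (zh) Sentiment classification method and apparatus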
Haney Patents for NLP software: An empirical review
CN112750042B (zh) Data processing method, apparatus, and electronic device
CN114398854A (zh) E-book tag generation method, apparatus, and electronic device
Bandlamudi et al. Building conversational artifacts to enable digital assistant for APIs and RPAs
KR102885899B1 (ko) CRM data analysis method and computer program using a large language model
Kulkarni et al. Deep reinforcement-based conversational AI agent in healthcare system
Quach et al. Fuzzy ontology modeling by utilizing fuzzy set and fuzzy description logic

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21809170

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022524370

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21809170

Country of ref document: EP

Kind code of ref document: A1