EP3652664A1 - Method for conducting a human-computer dialogue - Google Patents

Method for conducting a human-computer dialogue

Info

Publication number
EP3652664A1
Authority
EP
European Patent Office
Prior art keywords
user
information
meta
intention
self
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP17780297.2A
Other languages
German (de)
English (en)
Inventor
Philipp Heltewig
Sascha Poggemann
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cognigy GmbH
Original Assignee
Cognigy GmbH
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cognigy GmbH filed Critical Cognigy GmbH
Publication of EP3652664A1
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/226 Validation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06F40/35 Discourse or dialogue representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/55 Rule-based translation
    • G06F40/56 Natural language generation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/225 Feedback of the input speech

Definitions

  • The present method relates to an implementation on a computer, which may also be a mobile phone, smartphone or other portable device with a digital computing unit, and allows the user to conduct a natural, everyday conversation with a computer system. Through text or voice input, various steps for classification and subsequent language understanding are performed, whereupon a context-based conversation between the user and the computer system is conducted in the respective use case.
  • Navigation devices in vehicles are usually able to give a driver who has entered a destination driving instructions in spoken language. But they do not know whether the user has understood the instructions. Nor can these systems understand the user; they depend on his destination input and from the outset only allow destinations that are stored in the system's internal database.
  • US Pat. No. 6,964,023 B2 describes a system for multimodal user conversation that is capable of converting spoken speech into text, resolving ambiguities and recognizing the mood of the user, for example by means of face recognition. If necessary, the system asks the user questions to clarify his instructions, and keeps asking until it has detected the user's intent. The system then executes corresponding instructions, for example turning on the radio.
  • The recognition process is based on face and speech recognition. An error probability is determined for each branch of the decision tree, both by comparison with a threshold and by linear regression. The decision trees are trained on word-dependent properties.
  • WO 02/073331 A2 provides a system architecture for commercial transactions in natural language, in which users can inform themselves about the products of a catalog.
  • The system accesses a product database and a database of user preferences and asks clarifying questions.
  • The database of user preferences has an update function, but a self-learning function is missing.
  • US 8,600,747 B2 describes a method for managing a dialogue, which may also include spoken language, for example in the context of a telephone call.
  • A so-called dialogue motivator is used, which among other things provides error handling.
  • The system can ask the user questions.
  • It is not self-learning and primarily serves to transfer users to human dialogue partners.
  • Another system comprises speech recognition and an audio interface as well as a browser with access to content, and determines activation phrases.
  • By applying a latent semantic analysis, the similarity between the user utterances and the activation phrases is determined.
  • US 8,543,407 B1 describes a system which uses statistical modeling and formal logic to evaluate conversational input regarding data contents such as commands and dictations. For this purpose, a command dictionary and a dynamic grammar are used for identification, the clarification of ambiguity and the extraction of information.
  • WO 2014/047270 A1 describes a method and an apparatus for determining the intent in an IVR (interactive voice response) system using natural language.
  • US 8,849,652 B2 describes a system that comprises voice-based and non-voice-based interfaces and takes into account user context and previous information.
  • US Pat. No. 9,502,027 B1 describes a method for processing the output of a speech recognizer.
  • During parsing, it is first examined whether the input is a command. If so, the command is executed; if not, a query is asked to clarify the ambiguity.
  • The process can neither recognize intentions nor lead to a natural dialogue.
  • The object of the invention is thus to enable the computer system to emulate human communication better, and not only to capture and evaluate the language, or even only its words and grammar.
  • The software should know some basic rules of strategic conversation so that it cannot easily be deceived or tricked.
  • The invention solves this problem by a method for dialogue between human and computer using a self-learning system, in which the sentence structure is analyzed, the user inputs are repeatedly split into factual content and meta-information about the user, and the user intention is determined and classified anew.
  • The user is identified if possible, for example by a unique user ID. If the user has previously connected to the system, he is recognized and the conversation resumes. If the user logs on again after an expiration time, the system generates a new instance based on the stored context information and lets the user start a new dialogue. The user can connect via any interface, such as chat or voice, and with appropriate identification and authentication the corresponding user context is loaded.
  • Such a transfer can be carried out in a manner similar to the handover of mobile radio signals between individual radio cells: across channels and across interfaces, thus permitting the transfer of the dialogue between the individual channels.
  • The system retains the context, the saved dialog status and the dialog history.
  • Receiving and then analyzing the user inputs with respect to sentence structure and syntax is done by a natural language parser that outputs grammatical information about a sentence, such as identified subjects, further information about each word, and a tree structure, also called a dependency tree, which denotes the relations between the individual words.
  • This structure tree is then further analyzed in the "Dependency Tree Parsing" processing step, whereby subject-predicate-object structures are generated as far as possible.
  • The processing of dependency trees and the structural information extracted from them are combined with other existing meta-information in order to determine the meaning of a user input.
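As a minimal illustration of the "Dependency Tree Parsing" step above, the following sketch extracts subject-predicate-object structures from a parsed sentence. The token layout and the relation labels ("root", "nsubj", "obj") are assumptions borrowed from common dependency-parsing conventions, not the patent's actual data model.

```python
from dataclasses import dataclass

@dataclass
class Token:
    text: str
    pos: str     # part-of-speech tag from POS tagging
    head: int    # index of the governing token (-1 for the root)
    deprel: str  # dependency relation to the head

def extract_spo(tokens):
    """Collect (subject, predicate, object) triples where possible."""
    triples = []
    for i, tok in enumerate(tokens):
        if tok.deprel == "root" and tok.pos == "VERB":
            subj = next((t.text for t in tokens
                         if t.head == i and t.deprel == "nsubj"), None)
            obj = next((t.text for t in tokens
                        if t.head == i and t.deprel == "obj"), None)
            if subj and obj:
                triples.append((subj, tok.text, obj))
    return triples

# "I want a pizza": subject "I", predicate "want", object "pizza"
sentence = [
    Token("I", "PRON", 1, "nsubj"),
    Token("want", "VERB", -1, "root"),
    Token("a", "DET", 3, "det"),
    Token("pizza", "NOUN", 1, "obj"),
]
print(extract_spo(sentence))  # [('I', 'want', 'pizza')]
```

A production parser would of course supply much richer per-word information; the point here is only how a dependency tree yields subject-predicate-object structures.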
  • This so-called input object, which contains the additional information, makes it possible to extract the corresponding meta-information from the input and to buffer it in the conversation-dependent context.
  • The intention of the user, also referred to as "intent", is to be understood in the following as a more far-reaching term than a mere request; the request, referred to as a "query", represents a subset of the user intention.
  • This knowledge base is provided by a database of goal intentions, relevant context information, and external data sources.
  • The classification is carried out with a classification algorithm for clarifying intent and meaning.
  • The processing logic, also referred to as flow, forms the basis for structured dialogue management.
  • The system is able to formulate queries to the user in order to confirm or refute the validity of the recognition.
  • This type of feedback and the storage of user feedback creates a cycle in which the system is able to improve its own behavior, i.e. to learn independently, in the sense that after a certain number of queries no further questions need to be asked of any user. The system has then learned to interpret and assign a previously unknown user request that did not match the reference intent.
  • Depending on the conversation state, intents that are meaningful for the conversation are specifically activated or deactivated.
  • An intent detection dependent on the conversation status is carried out, with a conversation-state-dependent delimitation of permitted, tolerated or meaningful intents. This achieves a considerable improvement of the status-dependent intent detection. Paired with the inquiry algorithm described above, this creates a procedure in which initially only a few structural elements need to be defined.
  • The intention recognition is done by statistical comparison.
  • The "Intent Clarification" processing step formulates a query to the user, so that any ambiguity of the user input is resolved by the query and by a renewed pass through the processing steps described above, yielding a clearer statement on the intention and meaning of the user input.
  • The "Intent Clarification" thus clarifies whether the system has recognized the correct intent or not.
  • Each reference intent has one or more explicit queries associated with it.
  • A script includes content from the meta-information and the context in a dynamically generated query.
  • The query is thereby adapted to the user input, producing a much more natural query or other feedback to the user.
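The dynamically generated query described in the preceding points can be sketched as a template that pulls values from the meta-information and the conversation context. The template syntax and the field names (user_name, detected_item) are illustrative assumptions.

```python
def render_query(template: str, context: dict, meta: dict) -> str:
    """Fill a clarification-query template from context and meta-information."""
    return template.format(**{**context, **meta})

context = {"user_name": "Anna"}    # conversation-dependent context
meta = {"detected_item": "pizza"}  # meta-information from the current input
template = "{user_name}, would you like a {detected_item}?"
print(render_query(template, context, meta))  # Anna, would you like a pizza?
```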
  • After asking the user, the computer system waits for a response. If no response is given, the system asks again or aborts the user's request. If an answer comes, the computer system checks whether it was provided by the same user, authenticates it, and the analysis and processing proceed again. It is thereby ensured that the system waits for further user input in a valid state or that it actively invites the user to continue the conversation by asking questions.
  • A context-based, structured dialogue management suitable for the application is made possible by means of a logical decision matrix, a flow, a decision tree, a neural network or another implementation.
  • The decisive part of the "reasoning engine" is the graphical representation of a structure for dialogue management.
  • Logical processing elements such as conditions ("if", "else", "then", "switch case") and user interaction elements such as "say" or "say-once" are available.
  • Conversational context determined from previous information can be used as a basis for the dialogue.
  • This reasoning engine allows the meta-information from previous user inputs to be used.
  • Previously executed processing steps provide the necessary meta-information of one or more user inputs for processing in the reasoning engine.
  • The intent discovery occurs before processing in the flow.
  • A query to the user can be made directly from the intent detection.
  • Queries to the user can also be formulated directly from the flow.
  • The intent usually does not already contain all specifications; these are then queried by the flow.
  • The system thus already enables a partially dynamic generation of responses to the context or from stored meta-information.
  • The queries are thus created from the context and output to the user.
  • The flow generates the four language levels "fact level", "appeal level", "relationship level" and "self-revelation level".
  • The reception of user input is preceded by a conversion of spoken natural speech into text.
  • Such a conversion of spoken natural speech into text is known in the art.
  • This preceding step is used to gain further meta-information about the dialogue partner.
  • Meta-information about the user is captured during the conversion of spoken language into text and provided separately to the algorithm of the intention recognition.
  • As meta-information, the parameters haste, emotion, language, dialect and person, or combinations or parts thereof, can be detected.
  • This meta-information obtained from the natural speech of the user can serve as raw data for the intention recognition as well as for the subsequent logical processing, to which the parameters intonation, urgency and reassurance are delivered. For example, the intonation can sometimes indicate whether a complaint or a purchase request is present. In the case of urgency, it may be useful to lower the thresholds for statistical relevance.
  • Moreover, an adaptation to the user's behavior is possible, for example between a purchase process and other concerns.
  • A further embodiment of the invention provides that the generation of the text response to the user is followed by a conversion of text into spoken natural language in order to achieve a unified dialogue. In such cases, it is necessary to uniquely authenticate each user at all times to ensure that the system responds to the same user who made the request, so that the dialogue continues with the same person.
  • In one embodiment, the system therefore allocates an identification record to each user, allowing access to user-dependent context information. In this way, it is also possible to conduct an interface-independent dialogue; the context and the current conversation and dialogue status persist and are always consistent. When the user terminates the conversation by disconnecting from the system, it is first checked whether a technical error has occurred, and the context is saved.
  • Speech interaction can thereby be combined with real on-screen interaction, e.g. by displaying graphics, or with a device such as a robot, an "Internet of Things" device, or even a smartphone.
  • The user interface is preferably designed to serve multiple channels.
  • Control commands can also be processed directly from the meta-information as data records, thereby influencing the conversation process from the context and using the flow elements, which control the conversation with the user depending on the context.
  • For example, a connected robot can greet the user when it detects movement. Through face recognition, such a robot could try to uniquely identify a user and his companions. The system would then be able, based on existing context information, to create a dynamically generated greeting using the name and other context information and to convey it to the user via the robot in the form of speech.
  • The self-learning system is supported by an administrator, who is involved when certain keywords occur or in the event of an excessive number of queries.
  • The flow and manageable context information ensure that the system is partially self-learning but does not change its actual behavior, since it still operates on the basis of predefined structures.
  • The administrator can only intervene in a controlling manner and introduce new elements for extended or improved user dialogues.
  • Further embodiments of the invention relate to the integration and querying of external databases.
  • For conversation-relevant data such as context information or meta-information, corresponding linked databases are queried.
  • These may be, for example, manufacturers' product databases or databases of goods and their storage locations in a shopping center. If these databases are in a foreign language, a translation program can also be used.
  • The generation of outputs via a markup language has access to the meta-information and the context and can thus include that information in the feedback and in the general processing.
  • Fig. 1 shows an overview block flow diagram of the entire process, including the options.
  • Fig. 1 shows the user input 1 as text.
  • The user can make this input either through spoken language, which requires a downstream program to translate from speech to text, or via a smartphone, a computer or another device.
  • The conversion can also be integrated by means of voice control in a smartphone, computer or other device.
  • An upstream automatic translation is possible.
  • For unique user identification, some kind of authentication or user identification is also required.
  • The user input, as well as the entire subsequent dialogue, can also be mixed across different input channels.
  • In the first processing step, "Synonym Mapper" 2, a synonym recognition takes place, in which the text is checked for synonymous phrases specified by the administrator. If such synonyms are found, the Synonym Mapper 2 creates metadata that replaces the original text with these synonyms without deleting the original data. This meta-record of the user's original sentence is successively extended in the subsequent processing steps.
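A minimal sketch of the Synonym Mapper step, under the assumption of a simple word-level dictionary: administrator-defined synonyms are mapped onto canonical keywords, and the normalized form is stored as metadata next to the untouched original input. The dictionary content is a made-up example.

```python
# administrator-maintained dictionary: surface form -> canonical keyword
SYNONYMS = {
    "snack": "FOOD_ITEM",
    "bite": "FOOD_ITEM",
}

def map_synonyms(text: str) -> dict:
    """Return a meta-record holding both the original and the mapped text."""
    mapped = " ".join(SYNONYMS.get(w.lower(), w) for w in text.split())
    return {"original": text, "mapped": mapped}  # original is never deleted

record = map_synonyms("I want a snack")
print(record["mapped"])    # I want a FOOD_ITEM
print(record["original"])  # I want a snack
```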
  • NLP: Natural Language Processing
  • The NLP parser assigns words and punctuation of a text to parts of speech, which is also referred to as "POS tagging", and the dependencies are determined in order to output the grammatical and syntactic relationships between the sentence components as data.
  • Specially trained machine learning models can be used for the generic identification of sentence structures.
  • The data generated by the parser are analyzed to identify the subject, the object, their relationship to each other, and their properties.
  • The sentence type is determined, i.e. for example whether it is a question or a statement. These data are added as metadata to the record of the previous step.
  • The "Intent Mapper" 5 first accepts all valid intents defined by the administrator.
  • The scoring algorithm assigns a similarity value W by making the following comparisons:
  • If the user input matches an example sentence exactly, W gets the value 1. This also applies to the lemmatized form of S and to the form of S in which synonyms have been replaced.
  • Property comparisons are made, each yielding a similarity value W between 0 and 1. These properties include subjects, relationships and objects, as well as their lemmas, question words (who/what/where), circumstance, whether it is a question, conditions, negative adverbs and possibly other aspects.
  • The scoring algorithm then checks for certain cases whether the value 0 must be assigned, for example if the example sentence is a question but the user input is not, or if subject, relationship and object do not match. In addition, if a negation is detected, the system can assign W the value 0.
  • The scoring values W of the different pairs P are then collected, stored in a table and sorted by size. This table is added to the metadata record.
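The scoring just described can be sketched as follows: each pair P of user input and example sentence receives a similarity value W in [0, 1], certain hard rules force W to 0, and the resulting table is sorted by size. The concrete feature set and the equal weighting are assumptions for demonstration, not the patent's actual algorithm.

```python
def score_pair(user: dict, example: dict) -> float:
    """Similarity value W for one pair P of user input and example sentence."""
    # exact match yields W = 1 (meant to also cover lemmatized and
    # synonym-replaced forms, which are omitted here for brevity)
    if user["text"] == example["text"]:
        return 1.0
    # hard zero rule: question/statement mismatch forces W = 0
    if user["is_question"] != example["is_question"]:
        return 0.0
    # averaged property comparisons over subject, relationship and object
    props = ("subject", "relation", "object")
    hits = sum(user.get(p) == example.get(p) for p in props)
    return hits / len(props)

def rank(user: dict, examples: list) -> list:
    """Score all pairs, then sort the table by size (highest W first)."""
    return sorted(((score_pair(user, ex), ex["text"]) for ex in examples),
                  reverse=True)

user = {"text": "I want a bite", "is_question": False,
        "subject": "I", "relation": "want", "object": "FOOD_ITEM"}
examples = [
    {"text": "I want a FOOD_ITEM", "is_question": False,
     "subject": "I", "relation": "want", "object": "FOOD_ITEM"},
    {"text": "Where is the station?", "is_question": True,
     "subject": "station", "relation": "is", "object": "here"},
]
print(rank(user, examples)[0])  # (1.0, 'I want a FOOD_ITEM')
```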
  • This calculation is performed and the result for the user input S is compared with all active intents and thus with a certain number of example sentences A.
  • The example sentence with the highest relevance is selected.
  • The intent assignment then takes place.
  • It is then decided whether an inquiry to the user is required, and if so, in what form.
  • The administrator sets two threshold values between 0 and 1.
  • The first value is the query threshold RT and the second is the validity threshold CT; CT is higher than RT.
  • The two thresholds RT and CT divide the range of scoring values W into three ranges. If the first threshold RT is not reached, the program explains to the user that it did not understand him and politely asks him, with request 7, to rephrase his input.
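The three ranges created by the two thresholds can be captured in a few lines; the values of RT and CT here are examples an administrator might choose, the description only requires that both lie between 0 and 1 with CT above RT.

```python
RT = 0.4  # query threshold (example value set by the administrator)
CT = 0.8  # validity threshold, CT > RT

def decide(w: float) -> str:
    """Map a scoring value W onto one of the three ranges."""
    if w >= CT:
        return "accept"          # intent considered valid, continue with flow
    if w >= RT:
        return "ask_user"        # formulate a confirmation query
    return "not_understood"      # politely ask the user to rephrase

print(decide(0.9), decide(0.6), decide(0.1))
# accept ask_user not_understood
```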
  • The program takes advantage of its learning system and its databases.
  • The learning system comes after the Intent Mapper 5 and is connected to a database 100, which consists of three parts:
  • The learning system may also be connected to the APIs 104.
  • The part for user data 101 contains data about the respective user and data of all users from previous user inputs, and has a learning database for each user individually as well as a common learning database for all users together.
  • The respective learning databases contain the already processed data of previous dialogues, including related meta-information, and a counter that indicates how often each phrase containing a recognized intent has occurred for this user and for all other users.
  • The learning system acts as follows:
  • After the scoring value W has been determined, it is first checked in query 6 whether CT was reached or exceeded. If so, the program continues with the next processing step 8 without involving the learning system. If W is between RT and CT, it is checked in the user data 101, in the examination of previous confirmations 10, whether the user has previously confirmed such an input sentence in a comparable context. If this is the case, no query is formulated to the user; rather, the system "assumes" that the user again means the suspected input and increments the counter for that detected sentence by 1. The program then likewise continues with flow 8.
  • Otherwise, the program takes the example sentence A belonging to the highest scoring value W, formulates a confirmation question in query 11, and asks the user whether this is his intention.
  • If the user denies this, the program notes that the intention has not been understood and, if other example sentences whose score W was lower than the previous one have also reached RT, asks the user whether those example sentences better reflect his intention. Otherwise, the program continues with flow 8 but sets a "flag" in the input object indicating that the user's intent was not properly determined, so that flow 8 can respond to the input and continue the user dialogue.
  • The learning system checks for each of the affirmed queries whether the user's sentence has already been stored as a learning sentence in the learning database LDB.
  • If so, this phrase means the intent, and the counter for that user is incremented by 1.
  • It is also checked whether this sentence has already been confirmed as the intent in previous queries of other users and stored as a learning sentence in the learning database LDB. If such a learning sentence already exists, its counter is increased by 1; if not, it is created as a learning sentence in the learning database LDB. Thereafter, in the step "Examination for example sentences" 15, it is checked whether the learned sentence is suitable as a new example sentence for further cases.
  • If so, the program removes the learning sentence from the learning database in the "Modification LDB" step 16 and in future uses it as an example sentence, so that a direct recognition takes place.
  • The new example sentence is marked as such to allow the administrator to retrieve it and, if necessary, delete it manually. This avoids abuse.
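The counter mechanics of the learning database LDB, including the promotion of a learning sentence to an example sentence, can be sketched like this. The promotion threshold LT is only introduced later in the text; its value and the data structures here are assumptions.

```python
LT = 3  # promotion threshold (example value)

class LearningDB:
    def __init__(self):
        self.learning = {}  # learning sentences: sentence -> (intent, counter)
        self.examples = {}  # promoted example sentences for direct recognition

    def confirm(self, sentence: str, intent: str) -> None:
        """Register one confirmed query for a sentence/intent pair."""
        _, n = self.learning.get(sentence, (intent, 0))
        n += 1
        if n >= LT:
            # promote: remove from the learning DB and keep as a (flagged)
            # example sentence the administrator can review or delete
            self.learning.pop(sentence, None)
            self.examples[sentence] = intent
        else:
            self.learning[sentence] = (intent, n)

db = LearningDB()
for _ in range(3):
    db.confirm("where do I get a snack", "order_food")
print(db.examples)  # {'where do I get a snack': 'order_food'}
```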
  • Flow 8 is a decision tree that makes it possible to create meaningful statements during a conversation with a user.
  • The flow 8 uses the stored context data 102 and the status of the conversation to conduct a human-like dialogue.
  • The context is used to store information about everything related to the current user and the dialogue with him, for example the name of the user after he has given it.
  • The context data 102 are stored and are then available for future user conversations.
  • The status determines where one is in the conversation, for example at the beginning, at the end, or in a specific processing step.
  • The status information is crucial for adequate answers, since identical user inputs can have completely different meanings depending on the current dialogue status.
  • The flow 8 has directly configured access to external data sources, external data 103, and external APIs 104, and can include these in the context and the conversation.
  • The flow 8 consists of a number of different nodes that determine the logical structure of a conversation and generate the desired outputs. These are: a start node that determines the beginning of a flow; function nodes that perform a specific function, for example outputting a statement to a user or making an entry in a database; and logic nodes that structure the sequence within the flow, for example if/then/else branches.
  • The flow 8 can use a scripting language activated at runtime.
  • Such a runtime-activated scripting language typically consists of a markup language tagging portions of text or data. Furthermore, this scripting language can form corresponding requests to databases with external data 103 and allow direct queries to these data sources from within a flow node: synchronously, in the sense that each request must be answered before flow 8 continues processing, or asynchronously, in the sense that the processing of further flow steps continues immediately.
  • Each flow starts with a start node, followed by any number of child nodes. All nodes are executed in sequence, starting from the start node, as they are reached. Logic nodes leading to branches control the process and ensure that only relevant nodes are traversed during a conversation. Thus, some nodes may be passed repeatedly or not at all in the course of a dialogue. This constellation results in a free but structured dialogue.
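The node traversal just described (start node, function nodes such as "say", branching logic nodes) can be sketched as a small recursive interpreter; the node schema and the context fields are assumptions for illustration.

```python
def run_flow(node: dict, context: dict, out: list) -> None:
    """Execute a flow node and, depending on its type, its children."""
    kind = node["type"]
    if kind == "start":                       # start node: run all children
        for child in node.get("children", []):
            run_flow(child, context, out)
    elif kind == "say":                       # function node: user output
        out.append(node["text"].format(**context))
    elif kind == "if":                        # logic node: branch on context
        branch = "then" if node["cond"](context) else "else"
        for child in node.get(branch, []):
            run_flow(child, context, out)

flow = {"type": "start", "children": [
    {"type": "if",
     "cond": lambda c: c.get("intent") == "order_food",
     "then": [{"type": "say", "text": "You can get a {item} next door."}],
     "else": [{"type": "say", "text": "How can I help?"}]},
]}

out = []
run_flow(flow, {"intent": "order_food", "item": "pizza"}, out)
print(out)  # ['You can get a pizza next door.']
```

Because branch conditions read the context, identical node definitions can produce different paths on repeated passes, which matches the "free but structured dialogue" described above.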
  • It is particularly advantageous to use a markup language within the flow 8 that allows the meta-information on user input, the context and the status to be called or changed directly within the decision tree. Access is possible, e.g., via different nodes or via free programming.
  • This text input 1 starts the system.
  • In the Synonym Mapper 2, the term "snack" is searched for synonyms or generic terms; for example, the term "pizza" is found for it and loaded from the stored dictionary. There, a keyword with a unique identifier, which can be used later in the flow, and a freely definable number of associated synonyms are created.
  • In NLP 3, the input is examined word by word, along with the relationships between the words. In this case, it is recognized that there are two half-sentences, the second of which is a question addressing a location. The subject of the first half-sentence is the user himself; the object is a food item. A number was found: the word "one". The verb could indicate a relationship state that contains an intention, but the questioning half-sentence could also be an intention.
  • In the step "Keyphrase Mapping" 4, it is first checked whether something is negated; this is not the case here.
  • The categorization is specified: the items are a food and a location.
  • Food and location information is loaded and pairs are formed. For example, there are sandwiches, pizza and chips with corresponding example sentences A.
  • The example sentences are loaded because the subject follows only from the first half-sentence, which must be clarified first. For each of these example sentences A, a scoring value W is assigned. These scoring values can be roughly proportional to the typical sales of the local providers.
  • In the step "Checking the Intent Score" 6, it is checked whether the highest scoring value of the example sentences exceeds the value CT. This is not the case here. Then it is checked whether this highest value is at least above RT; this is the case. The system then checks in step 13 whether there is already a confirmation from a previous user input for this user and for such an input. If so, no further confirmation is sought; the user input is regarded as confirmed and the system proceeds to the flow 8 as the next step.
  • Otherwise, the program initiates inquiry 11, which is associated with the example sentence of the highest scoring value W. In the present example this is the sentence "Would you like a pizza?". The program expects a yes/no answer. If the answer is "yes", the program checks for the second half-sentence whether there is an example sentence with a value of 1 for it; in this example this is assumed, and no further inquiry is necessary. The program then also continues with the flow, but first saves the sentence in its learning database LDB, which is part of the user database 101, so that it can recognize the question next time.
  • LDB: learning database
  • The program checks whether it already knows the question about the snack from other user queries. If so, it would increment the counter in the learning database for all users by 1; in the present example, however, the sentence must first be newly created in the learning database. For this reason it is also checked whether the threshold value LT for transfer to the database of reference or example sentences for intents has been exceeded.
  • The program then proceeds to flow 8, which generates a response telling the user where to get a pizza, possibly in the form of an enumeration if it knows several suitable places. Possibly it also asks for the desired toppings and immediately places a complete order, which the system can then forward to a digital ordering system.
  • By configuration, the flow 8 can access further application-specific external data 103 or programming interfaces (APIs) 104.
  • A second user asks: "I also want a snack but not pizza".
  • The first steps are analogous, but there is no second half-sentence with a question about a location, and there is also a negation, which is identified in the Keyphrase Mapping 4 if it has not already been identified in the preceding step NLP 3. Since a negation of pizza is included, the example sentence with the pizza is given the scoring value 0.
  • In the following step "Intent Mapping" 5, the user is therefore only asked "Would you like a sandwich?". If the user affirms this, the procedure continues analogously to the previous case. Only in the flow 8 does the program determine that no location was asked for, and it asks further questions about it; possibly it also offers the user a corresponding direct order.
  • The learning database then contains two different entries for the question about the snack.

Abstract

The present invention relates to a method for conducting a human-computer dialogue using a machine-learning system, comprising the steps of: receiving user input as natural-language text; associating synonyms and keywords as well as word associations; analyzing the user input with respect to sentence structure and syntax and recognizing them; assigning key phrases; determining and classifying a user intention; checking whether confirmation is required; and logical processing, in a reasoning and response-preparation stage, by forming a decision on the further course of the dialogue, a formulation, or a dialogue history, and, where applicable, generating a response. To determine the user intention, the statistical relevance of the assignment to a target intention is determined while all meta-information is retained. The classification is carried out by a classification algorithm in which decision tree, logic, and language understanding are combined via meta-information, and feedback for the machine-learning system is generated from previous user queries. When a threshold value of the statistical relevance is crossed, a query is put to the user as to whether the detected intention has been understood correctly; in that case a user response is received and then analyzed with respect to sentence structure and syntax, the user input is again divided into factual content and meta-information about the user, and the user intention is further determined and classified. The determination result is passed to the machine-learning system, which processes recommendations and/or automatic system decisions to improve the intention determination on the basis of a plurality of user interactions, and which retains said recommendations and/or automatic system decisions for later use. When the threshold value of the statistical relevance is exceeded, generation of the natural-language text response to the user is initiated; otherwise, a further query is made.
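The threshold-gated loop summarized in the abstract (classify the input against target intentions, answer directly when the statistical relevance is high enough, otherwise confirm the detected intention with the user or fall back to a further query) can be sketched roughly as follows. All names, the keyword-overlap score, and the `THRESHOLD` value are invented for illustration; the patented classification algorithm (decision tree, logic, and language understanding combined via meta-information) is far richer than this stand-in.

```python
# Illustrative sketch of the threshold-gated intent-determination loop.
# Scoring heuristic and constants are assumptions, not from the patent.

THRESHOLD = 0.8  # assumed statistical-relevance threshold

def classify_intent(utterance: str, intents: dict) -> tuple:
    """Score each target intention by naive keyword overlap and return
    the best intention together with its relevance score."""
    tokens = set(utterance.lower().split())
    scores = {
        name: len(tokens & set(keywords)) / max(len(keywords), 1)
        for name, keywords in intents.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]

def dialogue_step(utterance: str, intents: dict, confirm) -> str:
    """One pass of the loop: high relevance -> generate the response;
    low relevance -> ask whether the detected intention was understood
    correctly; if not confirmed -> make a further query."""
    intent, relevance = classify_intent(utterance, intents)
    if relevance >= THRESHOLD:
        return f"answer:{intent}"      # response generation is initiated
    if confirm(intent):                # "did you mean <intent>?"
        return f"answer:{intent}"
    return "further-query"             # otherwise a further query is made

intents = {
    "book_flight": ["book", "flight", "fly", "ticket"],
    "weather": ["weather", "rain", "forecast", "temperature"],
}
print(dialogue_step("please book a flight ticket to fly home",
                    intents, lambda i: False))  # → answer:book_flight
```

In a real system the confirmation callback would itself be a dialogue turn whose reply is parsed again, and the outcome would be fed back to the machine-learning system as training signal, as the abstract describes.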
EP17780297.2A 2017-07-14 2017-07-14 Method for conducting human-computer dialogue Pending EP3652664A1 (fr)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/DE2017/100587 WO2019011356A1 (fr) 2017-07-14 2017-07-14 Method for conducting human-computer dialogue

Publications (1)

Publication Number Publication Date
EP3652664A1 true EP3652664A1 (fr) 2020-05-20

Family

ID=60021849

Family Applications (1)

Application Number Title Priority Date Filing Date
EP17780297.2A Pending EP3652664A1 (fr) 2017-07-14 2017-07-14 Method for conducting human-computer dialogue

Country Status (3)

Country Link
US (1) US11315560B2 (fr)
EP (1) EP3652664A1 (fr)
WO (1) WO2019011356A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115238060A (zh) * 2022-09-20 2022-10-25 支付宝(杭州)信息技术有限公司 Human-computer interaction method and apparatus, medium, and computing device

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112020707A (zh) * 2018-01-05 2020-12-01 国立大学法人九州工业大学 Label assignment device, label assignment method, and program
EP3811245A4 (fr) 2018-06-19 2022-03-09 Ellipsis Health, Inc. Systèmes et procédés d'évaluation de santé mentale
US20190385711A1 (en) 2018-06-19 2019-12-19 Ellipsis Health, Inc. Systems and methods for mental health assessment
US11409961B2 (en) * 2018-10-10 2022-08-09 Verint Americas Inc. System for minimizing repetition in intelligent virtual assistant conversations
US11361163B2 (en) * 2018-12-17 2022-06-14 MuyVentive, LLC System and method to represent conversational flows as graph embeddings and to conduct classification and clustering based on such embeddings
US11797768B2 (en) 2018-12-17 2023-10-24 MuyVentive, LLC System and method to represent conversational flows as graph embeddings and to conduct classification and clustering based on such embeddings
CN113168500A (zh) * 2019-01-22 2021-07-23 索尼集团公司 Information processing device, information processing method, and program
CN109840276A (zh) * 2019-02-12 2019-06-04 北京健康有益科技有限公司 Intelligent dialogue method, apparatus, and storage medium based on text intention recognition
US11288459B2 (en) * 2019-08-01 2022-03-29 International Business Machines Corporation Adapting conversation flow based on cognitive interaction
US11705114B1 (en) 2019-08-08 2023-07-18 State Farm Mutual Automobile Insurance Company Systems and methods for parsing multiple intents in natural language speech
US11126793B2 (en) * 2019-10-04 2021-09-21 Omilia Natural Language Solutions Ltd. Unsupervised induction of user intents from conversational customer service corpora
CN110795532A (zh) * 2019-10-18 2020-02-14 珠海格力电器股份有限公司 Voice information processing method and apparatus, intelligent terminal, and storage medium
CN110689891A (zh) * 2019-11-20 2020-01-14 广东奥园奥买家电子商务有限公司 Voice interaction method and device based on a public display apparatus
JP7405660B2 (ja) * 2020-03-19 2023-12-26 Lineヤフー株式会社 Output device, output method, and output program
CN111405128B (zh) * 2020-03-24 2022-02-18 中国—东盟信息港股份有限公司 Call quality inspection system based on speech-to-text conversion
US20210319189A1 (en) * 2020-04-08 2021-10-14 Rajiv Trehan Multilingual concierge systems and method thereof
CN111666372B (zh) * 2020-04-29 2023-08-18 百度在线网络技术(北京)有限公司 Method, apparatus, electronic device, and readable storage medium for parsing query terms
FR3111210B1 (fr) * 2020-06-04 2022-07-08 Thales Sa Bidirectional human-machine communication
CA3125124A1 (fr) * 2020-07-24 2022-01-24 Comcast Cable Communications, Llc Systems and methods for training voice command models
CN112102840A (zh) * 2020-09-09 2020-12-18 中移(杭州)信息技术有限公司 Semantic recognition method, apparatus, terminal, and storage medium
CN112216278A (zh) * 2020-09-25 2021-01-12 威盛电子股份有限公司 Speech recognition system, command generation system, and speech recognition method thereof
CN112115249B (zh) * 2020-09-27 2023-11-14 支付宝(杭州)信息技术有限公司 Method and apparatus for statistical analysis of user intentions and display of results
CN112487142B (zh) * 2020-11-27 2022-08-09 易联众信息技术股份有限公司 Conversational intelligent interaction method and system based on natural language processing
US20230008868A1 (en) * 2021-07-08 2023-01-12 Nippon Telegraph And Telephone Corporation User authentication device, user authentication method, and user authentication computer program
CN113806484B (zh) * 2021-09-18 2022-08-05 橙色云互联网设计有限公司 Interaction method, apparatus, and storage medium for user demand information
CN114244795B (zh) * 2021-12-16 2024-02-09 北京百度网讯科技有限公司 Information push method, apparatus, device, and medium
CN114676689A (zh) * 2022-03-09 2022-06-28 青岛海尔科技有限公司 Sentence text recognition method and apparatus, storage medium, and electronic apparatus

Family Cites Families (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6766320B1 (en) 2000-08-24 2004-07-20 Microsoft Corporation Search engine with natural language-based robust parsing for user query and relevance feedback learning
US6964023B2 (en) 2001-02-05 2005-11-08 International Business Machines Corporation System and method for multi-modal focus detection, referential ambiguity resolution and mood classification using multi-modal input
WO2002073331A2 (fr) 2001-02-20 2002-09-19 Semantic Edge Gmbh Context-dependent, knowledge-based interactive natural-language environment for dynamic and flexible product, service, and information search and presentation applications
US7167832B2 (en) 2001-10-15 2007-01-23 At&T Corp. Method for dialog management
US20070136067A1 (en) 2003-11-10 2007-06-14 Scholl Holger R Audio dialogue system and voice browsing method
US7949529B2 (en) 2005-08-29 2011-05-24 Voicebox Technologies, Inc. Mobile systems and methods of supporting natural language human-machine interactions
US9318108B2 (en) * 2010-01-18 2016-04-19 Apple Inc. Intelligent automated assistant
US9811935B2 (en) * 2007-04-26 2017-11-07 Ford Global Technologies, Llc Emotive advisory system and method
US8578330B2 (en) * 2007-06-11 2013-11-05 Sap Ag Enhanced widget composition platform
US8165886B1 (en) 2007-10-04 2012-04-24 Great Northern Research LLC Speech interface system and method for control and interaction with applications on a computing system
US8219407B1 (en) 2007-12-27 2012-07-10 Great Northern Research, LLC Method for processing the output of a speech recognizer
US20110125734A1 (en) * 2009-11-23 2011-05-26 International Business Machines Corporation Questions and answers generation
US10276170B2 (en) * 2010-01-18 2019-04-30 Apple Inc. Intelligent automated assistant
DE202011111062U1 (de) * 2010-01-25 2019-02-19 Newvaluexchange Ltd. Apparatus and system for a digital conversation management platform
US9262612B2 (en) * 2011-03-21 2016-02-16 Apple Inc. Device access using voice authentication
US9105268B2 (en) 2012-09-19 2015-08-11 24/7 Customer, Inc. Method and apparatus for predicting intent in IVR using natural language queries
US9235978B1 (en) * 2013-01-16 2016-01-12 Domo, Inc. Automated suggested alerts based on natural language and user profile analysis
US9594542B2 (en) * 2013-06-20 2017-03-14 Viv Labs, Inc. Dynamically evolving cognitive architecture system based on training by third-party developers
US10474961B2 (en) * 2013-06-20 2019-11-12 Viv Labs, Inc. Dynamically evolving cognitive architecture system based on prompting for additional user input
US9633317B2 (en) * 2013-06-20 2017-04-25 Viv Labs, Inc. Dynamically evolving cognitive architecture system based on a natural language intent interpreter
US10741182B2 (en) * 2014-02-18 2020-08-11 Lenovo (Singapore) Pte. Ltd. Voice input correction using non-audio based input
US10579396B2 (en) * 2014-04-09 2020-03-03 Nice-Systems Ltd. System and automated method for configuring a predictive model and deploying it on a target platform
US10726831B2 (en) * 2014-05-20 2020-07-28 Amazon Technologies, Inc. Context interpretation in natural language processing using previous dialog acts
US10170123B2 (en) * 2014-05-30 2019-01-01 Apple Inc. Intelligent assistant for home automation
US9338493B2 (en) * 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
WO2016138067A1 (fr) * 2015-02-24 2016-09-01 Cloudlock, Inc. System and method for securing an enterprise computing environment
CN106844400A (zh) 2015-12-07 2017-06-13 南京中兴新软件有限责任公司 Intelligent response method and device
US10733982B2 (en) * 2018-01-08 2020-08-04 Apple Inc. Multi-directional dialog
US10741176B2 (en) * 2018-01-31 2020-08-11 International Business Machines Corporation Customizing responses to users in automated dialogue systems

Also Published As

Publication number Publication date
US20200234700A1 (en) 2020-07-23
WO2019011356A1 (fr) 2019-01-17
US11315560B2 (en) 2022-04-26

Similar Documents

Publication Publication Date Title
EP3652664A1 (fr) Method for conducting human-computer dialogue
DE69814114T2 (de) Natural-language understanding method and apparatus for voice control of an application
DE60130880T2 (de) Web-based speech recognition using scripting and semantic objects
JP7098875B2 (ja) Conference support system, conference support device, conference support method, and program
DE69906540T2 (de) Multimodal user interface
DE112016004863T5 (de) Parameter collection and automatic dialog generation in dialog systems
DE102017122358A1 (de) Conditional provision of access by an interactive assistant module
DE112016003626T5 (de) Natural-language interface to databases
CN110175229B (zh) Method and system for online training based on natural language
EP3100174A1 (fr) Automatic method for semantic recognition and for measuring the unambiguity of text
DE102013003055A1 (de) Method and apparatus for performing natural-language searches
DE112018006345T5 (de) Retrieving supporting evidence for complex answers
EP1950672A1 (fr) Method and data-processing system for controlled queries of information stored in a structured manner
DE102018113034A1 (de) Voice recognition system and voice recognition method for analyzing a command having multiple intents
AT6920U1 (de) Method for generating natural language in computer dialog systems
DE112020005268T5 (de) Automatic generation of schema annotation files for converting natural-language queries into a structured query language
EP2962296B1 (fr) Speech analysis based on a word selection, and speech analysis device
EP1599866A1 (fr) Language-processing system and method for classifying sequences of acoustic and/or written characters into words or lexical entries
CN112579757A (zh) Intelligent question answering method and apparatus, computer-readable storage medium, and electronic device
DE60214850T2 (de) Pattern-processing system specific to a user group
DE19849855C1 (de) Method for the automatic generation of a textual utterance from a meaning representation by a computer system
Chen et al. How GPT-3 responds to different publics on climate change and Black Lives Matter: A critical appraisal of equity in conversational AI
DE102019218918A1 (de) Dialog system, electronic device, and method for controlling the dialog system
DE60119643T2 (de) Homophone selection in speech recognition
EP3576084B1 (fr) Efficient dialog design

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20200214

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: EXAMINATION IS IN PROGRESS

17Q First examination report despatched

Effective date: 20221209