EP3652664A1 - Method for conducting a human-computer dialogue - Google Patents
Method for conducting a human-computer dialogue
- Publication number
- EP3652664A1 (application EP17780297.2A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- user
- information
- meta
- intention
- self
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/226—Validation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
- G06F40/35—Discourse or dialogue representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/55—Rule-based translation
- G06F40/56—Natural language generation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/225—Feedback of the input speech
Definitions
- the present method relates to an implementation on a computer, which may also be a mobile phone, smartphone or other portable device with a digital computing unit, that allows the user to conduct a natural, everyday conversation with a computer system. Through text or voice input, various steps for classification and subsequent speech comprehension are performed, whereupon a context-based conversation between the user and the computer system takes place in the respective use case.
- navigation devices in vehicles are usually able to give spoken driving instructions to a driver who has entered a destination, but they do not know whether the user understood the instructions. Nor can these systems understand the user; they depend on his destination input and, from the start, only allow destinations that are stored in their built-in database.
- US Pat. No. 6,964,023 B2 describes a system for multimodal user conversations that is capable of converting spoken speech into text, resolving ambiguities and recognizing the mood of the user, for example by means of face recognition. If necessary, the system asks the user questions to clarify his instructions, and it keeps asking until it has detected the user's intent. It then executes corresponding instructions, for example turning on the radio.
- the recognition process is based on face and speech recognition. An error probability is determined for each branch of the decision tree, both by comparison with a threshold and by linear regression. The decision trees are trained for word-dependent properties.
- WO 02/073331 A2 provides a system architecture for commercial transactions in natural language in which users can inform themselves about products of a catalog.
- the system accesses a database for products and a database of user preferences and asks questions for clarification.
- the database of user preferences has an update function, but a self-learning function is missing.
- US 8,600,747 B2 describes a method for managing a dialogue, which may also include spoken language, such as in the context of a telephone call.
- a so-called dialogue motivator is used, which provides error handling, among other functions.
- the system can ask questions to the user.
- it is not self-learning and primarily serves to transfer users to human dialogue partners.
- another system comprises speech recognition, an audio interface and a browser with access to content, and determines activation phrases.
- by applying a latent semantic analysis, the similarity between the user utterances and the activation phrases is determined.
- US 8,543,407 B1 describes a system which uses statistical modeling and formal logic to evaluate conversational input regarding data contents such as commands and dictations. For this purpose, a command dictionary and a dynamic grammar are used for identification, the clarification of ambiguity and the extraction of content.
- WO 2014/047270 A1 describes a method and an apparatus for determining the intent in an IVR (interactive voice response) system using natural language.
- US 8,849,652 B2 describes a system that includes voice-based and non-voice-based interfaces and uses user context, previous information and other data.
- US Pat. No. 9,502,027 B1 describes a method for processing the output of a speech recognition.
- during parsing, it is first examined whether the input is a command. If so, the command is executed; if not, a query is asked to clarify the ambiguity.
- the process can neither recognize intentions nor conduct a natural dialogue.
- the object of the invention is thus to enable the computer system to emulate human communication better, rather than merely capturing and evaluating the language, or even only its words and grammar, and thus to conduct a natural dialogue with the user.
- the software should know some basic rules of strategic conversation practice so that it cannot easily be deceived or tricked.
- the invention solves this problem by a method for dialogue between human and computer using a self-learning system, in which the sentence structure is analyzed, the user inputs are repeatedly split into factual content and metadata about the user, and the user intention is repeatedly determined and classified.
- the user is identified if possible, for example by a unique user ID. If the user has previously connected to the system, he is recognized and the conversation resumes. If the user logs on again after an expiration time, the system generates a new instance based on the stored context information and lets the user start a new dialog. The user can connect via any interface, such as chat or voice, and with appropriate identification and authentication the appropriate user context is loaded.
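The identification and session-resumption behaviour described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `SessionStore` class, the `connect` method and the 15-minute expiration time are all assumptions.

```python
import time

SESSION_TTL = 15 * 60  # assumed expiration time in seconds

class SessionStore:
    """Sketch of the identification step: contexts are keyed by a unique
    user ID and survive across interfaces (chat, voice, ...)."""

    def __init__(self):
        self._contexts = {}  # user_id -> (context dict, last-seen timestamp)

    def connect(self, user_id, now=None):
        now = time.time() if now is None else now
        ctx, last_seen = self._contexts.get(user_id, ({}, None))
        # Known user within the expiration time: resume the conversation.
        # Otherwise: a new instance, but built on the stored context.
        resumed = last_seen is not None and now - last_seen <= SESSION_TTL
        self._contexts[user_id] = (ctx, now)
        return ctx, resumed
```

A reconnect within the TTL resumes the dialogue; after the TTL a fresh dialog starts, still backed by the stored context.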
- such a handover can work much like the forwarding of mobile radio signals between individual radio cells, across channels and interfaces, thus permitting the transfer of a dialogue between the individual channels.
- the system retains the context, the saved dialog status and the dialog history.
- the user inputs are received and then analyzed for sentence structure; this recognition is done by a natural language parser that outputs grammatical information about a sentence, such as identified subjects, further information about each word, and a tree structure, also called a dependency tree, which denotes the relations between the individual words.
- This structure tree is then further analyzed in the "Dependency Tree Parsing" processing step, whereby subject-predicate-object structures are generated as far as possible.
- the dependency trees and the structural information extracted from them are combined with other existing meta-information in order to determine the value of a user input.
- this so-called input object, which contains this additional information, makes it possible to extract the corresponding meta-information from the input and to buffer it in the conversation-dependent context.
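One possible shape of such an input object is sketched below. The class name, field names and the `extract_meta` helper are illustrative assumptions; the patent only specifies that the object carries the accumulated meta-information and that this information is buffered in the conversation-dependent context.

```python
from dataclasses import dataclass, field

@dataclass
class InputObject:
    """Hypothetical input object: the raw text plus the meta-information
    accumulated by the preceding processing steps."""
    raw_text: str
    synonym_text: str = ""                     # text after synonym replacement
    pos_tags: list = field(default_factory=list)
    dependency_tree: dict = field(default_factory=dict)
    spo: dict = field(default_factory=dict)    # subject-predicate-object structures
    meta: dict = field(default_factory=dict)   # e.g. emotion, urgency, language

def extract_meta(obj: InputObject, context: dict) -> None:
    # Buffer the meta-information in the conversation-dependent context.
    context.update(obj.meta)
```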
- the intention of the user, also referred to as "intent", is to be understood in the following as a more far-reaching term than a mere request; the request, referred to as a "query", represents a subset of the user intention.
- This knowledge base is provided by a database of goal intentions, relevant context information, and external data sources.
- the classification is carried out with a classification algorithm for clarifying intent and meaning.
- the processing logic, also referred to as flow, forms the basis for a structured dialogue.
- the system is able to formulate queries to the user in order to confirm or disprove the validity of the recognition.
- this type of feedback and the storage of user feedback create a cycle in which the system is able to improve its own behavior, i.e. to learn independently, in the sense that after a certain number of queries no further questions need to be asked of any user. The system has then learned to interpret and assign a previously unknown user request that did not match any reference intent.
- depending on the conversation state, intents that are meaningful for the conversation are activated or deactivated in a targeted manner.
- an intent detection dependent on the conversation status is carried out with a conversation-state-dependent delimitation of permitted, tolerated or meaningful intents. This achieves a substantial improvement of the status-dependent intent detection. Paired with the inquiry algorithm described above, this creates a procedure in which initially only a few structural elements are required.
- the intention recognition is done by statistical comparison.
- the "Intent Clarification" processing step formulates a query to the user, so that any ambiguity of the user input can be resolved by the query and by re-running the processing steps described above, yielding a clearer statement of the intention and meaning of the user input.
- the "Intent Clarification” clarifies whether the system has recognized the correct intent or not.
- Each reference intent has one or more explicit queries associated with it.
- a script includes content from the meta-information and the context in a dynamically generated query.
- the query is adapted to the user input and generates a much more natural query or other feedback to the user.
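The dynamically generated, script-based query described above can be sketched with a simple template mechanism. Using Python's `string.Template` here is an illustrative choice; the patent only requires that a script can include meta-information and context in the generated query.

```python
import string

def render_query(template: str, context: dict) -> str:
    """Sketch of a dynamically generated query: placeholders in the template
    are filled from the conversation-dependent context; unknown placeholders
    are left untouched (safe_substitute)."""
    return string.Template(template).safe_substitute(context)
```

For example, a confirmation template such as `"Would you like a $item, $name?"` combined with the buffered context yields a much more natural query than a static question.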
- after asking the user, the computer system waits for a response. If no response is given, the system asks again or aborts the user's request; if an answer comes, the computer system checks whether it is provided by the same user, authenticates it, and the analysis and processing proceed again. Before this, it is ensured that the system either waits for further user input in a valid state or actively invites the user into a conversation by asking questions.
- a context-based, structured dialog management suitable for the application is made possible by means of a logical decision matrix, a flow, a decision tree, a neural network or other implementation.
- the decisive part of the “reasoning engine” is the graphical representation of a structure for dialogue management.
- logical processing elements, such as the conditions "if", "else", "then" and "switch case", and user interaction elements, such as "say" and "say-once", are available.
- Conversational context determined from previous information can be used as a basis for discussion.
- this reasoning engine allows the meta-information from previous steps to be used: previously executed processing steps provide the necessary meta-information of one or more user inputs for processing in the reasoning engine.
- the intent discovery occurs before processing in the flow.
- a query to the user can be made directly from the intent detection.
- queries to the user can also be formulated directly from the flow.
- the intent usually does not yet contain all required specifications; these are then queried by the flow.
- the system already enables partially dynamic generation of responses from the context or from stored meta-information.
- the queries are thus created from the context and output to the user.
- the flow generates responses on the four language levels "fact level", "appeal level", "relationship level" and "self-revelation level".
- the reception of user input is preceded by a conversion of spoken natural speech into text.
- a conversion of spoken natural speech into text is known in the art.
- this precedent step is used to gain further meta-information about the dialogue partner.
- meta-information about the user is obtained during the conversion of spoken language into text and provided separately to the intention-recognition algorithm.
- as meta-information, the parameters hurry, emotion, language, dialect and person, or combinations or parts thereof, can be detected.
- this meta-information obtained from the natural speech of the user can serve as raw data for the intention recognition as well as for the subsequent logical processing, to which the parameters intonation, urgency and reassurance are delivered. For example, it can sometimes be concluded from the intonation whether a complaint or a purchase request exists. In the case of urgency, it may be useful to lower the thresholds for statistical relevance. Moreover, the system can adapt to user behavior, for example distinguishing between a purchase process and other interactions.
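The idea of lowering the statistical-relevance thresholds under detected urgency can be sketched as follows; the function name, the 0.8 scaling factor and the structure of the `meta` dictionary are assumptions made for illustration only.

```python
def adjusted_thresholds(rt: float, ct: float, meta: dict):
    """Sketch: if the speech meta-information signals urgency, both the
    query threshold RT and the validity threshold CT are lowered so the
    system accepts an intent with less statistical confidence."""
    if meta.get("urgency") == "high":
        return rt * 0.8, ct * 0.8  # assumed scaling factor
    return rt, ct
```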
- a further embodiment of the invention provides that the generation of the speech-text response to the user is followed by a conversion of text into spoken natural language in order to produce a unified dialogue. In such cases, it is necessary to uniquely authenticate each user at all times to ensure that the system responds to the same user who made the request, so that the dialogue continues with the same person.
- the system therefore allocates an identification record to each user in one embodiment, allowing access to user-dependent context information. In this way, it is also possible to conduct an interface-independent dialogue; the context and the current conversation and dialogue status persist and are always consistent. When the user terminates the conversation by disconnecting from the system, the system first checks whether a technical error has occurred, and the context is saved.
- speech interaction can thereby be combined with real on-screen interaction, e.g. by displaying graphics, or with a device such as a robot, an "Internet of Things" device, or even a smartphone or a computer. The user interface is preferably designed to support multiple channels.
- control commands can also be processed directly from the meta-information as data records, thereby influencing the course of the conversation with the user from the context and via the flow elements, which control the conversation depending on the context.
- a connected robot can greet the user when it detects movement. Through face recognition, such a robot could try to uniquely identify a user and his companions. The system would then be able, based on existing context information, to create a dynamically generated greeting using the name and other context information and to deliver it via the robot in the form of speech.
- the self-learning system is supported by an administrator, who is brought in when certain keywords occur or when the system is overtaxed.
- the flow and manageable context information ensure that the system is partially self-learning, but does not change its actual behavior, since it still operates on the basis of predefined structures.
- the administrator can only intervene in a supervisory manner and introduce new elements for extended or improved user dialogues.
- Further embodiments of the invention relate to the integration and query of external databases.
- conversations-relevant data such as context information or meta-information
- corresponding, linked databases are queried.
- These may be, for example, product databases of manufacturers or goods and their storage location in a shopping center. If these databases are foreign-language, a translation program can also be used.
- the generation of output via a markup language has access to the meta-information and the context and can thus include that information in the feedback and in general processing.
- Fig. 1 shows an overview block flow diagram of the entire process including the options.
- Fig. 1 shows the user input 1 as text.
- the user can make this input either through spoken language, which requires a subsequent program to translate from speech to text, or via a smartphone or a computer or other device.
- the conversion can also be integrated by means of voice control in a smartphone, computer or other device.
- An upstream automatic translation is possible.
- unique user identification also requires some form of authentication.
- the user input, as well as the entire subsequent dialog, can also be mixed via different input channels.
- in the first processing step, "Synonym Mapper" 2, synonym recognition takes place: the text is checked for synonymous phrases given by the administrator. If such synonyms are found, the Synonym Mapper 2 creates metadata that replaces the original text with these synonyms without deleting the original data. This meta-record of the user's original sentence is successively extended in the subsequent processing steps.
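The Synonym Mapper step can be sketched as follows: the original text is preserved and a synonym-replaced copy is stored alongside it as metadata. The `SYNONYMS` table entries mirror the snack/pizza example used later in the description; the function name is an assumption.

```python
import re

# Administrator-defined synonym table (illustrative entries).
SYNONYMS = {"snack": "pizza", "bite": "pizza"}

def synonym_mapper(text: str) -> dict:
    """Sketch of processing step 2: replace known synonyms without
    deleting the original data."""
    replaced = text
    for word, canonical in SYNONYMS.items():
        replaced = re.sub(rf"\b{re.escape(word)}\b", canonical,
                          replaced, flags=re.IGNORECASE)
    # Both versions are kept; later steps extend this meta-record.
    return {"original": text, "synonym_text": replaced}
```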
- in the next processing step, NLP (Natural Language Processing), an NLP parser assigns the words and punctuation of a text to parts of speech, also referred to as "POS tagging", and determines the dependencies in order to output the grammatical and syntactic relationships between the sentence components as data.
- Specially trained machine learning models can be used for the generic identification of sentence structures.
- the data generated by the parser is parsed to identify the subject, object, their relationship to each other, and their properties.
- the record type is determined, i.e. for example whether it is a question or a statement. These data are added as metadata to the record of the previous step.
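A very rough sketch of the record-type determination (question vs. statement) is given below. A real implementation would of course rely on the parser output from the NLP step; the surface heuristics here (trailing question mark, leading question word) are assumptions for illustration.

```python
QUESTION_WORDS = {"who", "what", "where", "when", "why", "how",
                  "is", "are", "do", "does", "can"}

def record_type(text: str) -> str:
    """Sketch: classify an input as 'question' or 'statement'."""
    stripped = text.strip()
    if stripped.endswith("?"):
        return "question"
    first = stripped.split(maxsplit=1)[0].lower() if stripped else ""
    return "question" if first in QUESTION_WORDS else "statement"
```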
- the "Intent Mapper" 5 first accepts all valid intents defined by the administrator.
- the scoring algorithm assigns a similarity value W by making the following comparisons:
- if the user input S exactly matches an example sentence A, W gets the value 1. This also applies to the lemmatized form of S and to the form of S in which synonyms have been replaced.
- property comparisons are made, each giving a similarity value W between 0 and 1. These properties include subjects, relationships and objects, as well as their lemmas, question words (who, what, where), circumstances, whether it is a question, conditions, negation adverbs and possibly other aspects.
- the scoring algorithm then checks for certain cases whether the value 0 must be assigned, for example if the example sentence is a question but the user input is not, or if subject, relationship and object do not match. In addition, if a negation is detected, the system can likewise assign W the value 0.
- the scoring values W of the different pairs P are then collected, stored in a table and sorted by size. This table is added to the metadata record.
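The scoring steps above can be sketched as follows. The concrete property keys, the fraction-of-matches formula and the function names are illustrative assumptions; only the rules themselves (exact match gives W = 1, question/statement mismatch or full subject-relation-object mismatch or a negation mismatch forces W = 0, results collected and sorted) come from the description.

```python
def score_input(user_props: dict, example_props: dict) -> float:
    """Sketch of the scoring algorithm for one pair P = (S, A)."""
    if user_props == example_props:
        return 1.0                               # exact match: W = 1
    if user_props.get("is_question") != example_props.get("is_question"):
        return 0.0                               # question vs. statement: W = 0
    if user_props.get("negated") != example_props.get("negated"):
        return 0.0                               # negation mismatch: W = 0
    keys = ("subject", "relation", "object")
    matches = sum(1 for k in keys if user_props.get(k) == example_props.get(k))
    if matches == 0:
        return 0.0                               # nothing matches: W = 0
    return matches / len(keys)                   # partial similarity (assumed formula)

def rank_examples(user_props: dict, examples: dict) -> list:
    """Collect the scores W for all pairs and sort by size, as described."""
    return sorted(((score_input(user_props, props), name)
                   for name, props in examples.items()), reverse=True)
```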
- this calculation is performed and the result of the calculation for the user input S is compared with all active intents and thus with a certain number of example sentences A.
- the example sentence with the highest relevance is selected.
- the intent assignment takes place.
- it is then determined whether an inquiry to the user is required, and if so, how.
- the administrator sets two threshold values between 0 and 1.
- the first value is the query threshold RT and the second is the validity threshold CT; the threshold CT is higher than RT.
- the two thresholds RT and CT divide the range of scoring values W into three ranges. If the first threshold RT is not reached, the program explains to the user that it did not understand him and politely asks him, in request 7, to rephrase his input.
- in the middle range, the program takes advantage of its learning system and its databases.
- the learning system is inserted after the Intent Mapper 5 and is connected to a database 100, which consists of three parts:
- the learning system may also be connected to the APIs 104.
- the part of the user data 101 contains data about the respective user and data of all users from previous user inputs and has a learning database individually for each user as well as a common learning database for all users together.
- the respective learning databases contain the already processed data of previous dialogs, including the related meta-information, and a counter that indicates how many times each phrase containing a recognized intent has occurred, both for the individual user and for all other users.
- the learning system acts as follows:
- after the scoring value W has been determined, query 6 first checks whether CT was reached or exceeded. If so, the program continues with the next processing step 8 without involving the learning system. If W is between RT and CT, the examination of previous confirmations 10 checks in the user data 101 whether the user has previously confirmed such an input sentence in a comparable context. If so, no query is formulated to the user; rather, the system assumes that the user again means the recognized input, increments the counter for that detected sentence by 1, and continues with flow 8.
- otherwise, the program takes the example sentence A belonging to the highest scoring value W, formulates a confirmation question in query 11, and asks the user whether this is his intention.
- if the user denies, the program notes that the intention has not been understood and, if other example sentences whose score W was lower than the previous one have also reached RT, asks the user whether one of these example sentences matches his intention better. Otherwise, the program continues with flow 8 but sets a "flag" in the input object indicating that the user's intent was not properly determined, so that flow 8 can respond to the input accordingly and continue the user dialog.
- the learning system checks, for each of the confirmed queries, whether the user's sentence has already been stored as a learning sentence in the learning database LDB.
- if so, this phrase is registered as meaning the intent, and its counter for that user is incremented by 1.
- it is also checked whether this sentence has already been confirmed as expressing the intent in previous queries of other users and stored as a learning sentence in the learning database LDB. If such a learning sentence already exists, its counter is increased by 1; if not, it is created as a learning sentence in the learning database LDB. Thereafter, in the step "Examination for example sentences" 15, it is checked whether the learned sentence is suitable as a new example sentence for further cases.
- if so, the program removes the learning sentence from the learning database in the "Modification LDB" step 16 and henceforth uses it as an example sentence so that direct recognition takes place.
- the new example sentence is marked as such to allow the administrator to retrieve this example sentence and, if necessary, manually delete it. This avoids abuse.
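The counter-and-promotion mechanism of the learning database can be sketched as follows. The threshold name LT is taken from the description (it appears later as the "threshold value LT for transfer to the database for example records"); its concrete value and the class design are assumptions.

```python
LT = 3  # assumed promotion threshold

class LearningDB:
    """Sketch of the learning database LDB: confirmed phrasings are
    counted and, once the counter reaches LT, promoted to example
    sentences (and flagged for administrator review)."""

    def __init__(self):
        self.counters = {}      # (sentence, intent) -> confirmation count
        self.examples = set()   # promoted (sentence, intent) pairs

    def confirm(self, sentence: str, intent: str) -> bool:
        """Register one user confirmation; return True on promotion."""
        key = (sentence, intent)
        if key in self.examples:
            return False        # already a regular example sentence
        self.counters[key] = self.counters.get(key, 0) + 1
        if self.counters[key] >= LT:
            # Promote: remove from the learning set, keep as an example.
            del self.counters[key]
            self.examples.add(key)
            return True
        return False
```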
- flow 8 is a decision tree that enables meaningful statements to be generated during a conversation with a user.
- the flow 8 uses the stored context data 102 and the status of the conversation to conduct a human-like dialogue.
- the context is used to store information about everything related to the current user and the dialogue with him, for example the name of the user after he has named it.
- the context data 102 is stored and is then available for future user conversations.
- the status determines where one currently is in the conversation, for example at the beginning, at the end, or in a specific processing step.
- the status information is crucial for adequate answers, since identical user inputs can have completely different meanings depending on the current dialog status.
- the Flow 8 has direct configuration access to external data sources, external data 103, and external APIs 104, and can include these in context and conversation.
- the flow 8 consists of a number of different nodes that determine the logical structure of a conversation and generate the desired outputs. These are: a start node that determines the beginning of a flow; function nodes that perform a specific function, for example outputting a statement to a user or making an entry in a database; and logic nodes that structure the sequence within the flow, for example if/then/else or switch-case branches.
- the flow 8 can use a scripting language activated at runtime, which typically consists of a markup language tagging portions of text or data. Furthermore, this runtime-activated scripting language can form corresponding requests to databases with external data 103 and allows direct queries to these data sources out of a flow node, either synchronously, in the sense that a request must be answered before the flow 8 continues processing, or asynchronously, in the sense that further flow processing continues immediately.
- each flow starts with a start node followed by any number of child nodes. All nodes are executed in sequence, starting from the start node, as they are reached. Logic nodes leading to branches control the process and ensure that only relevant nodes are traversed during a call. Thus, some nodes may be traversed repeatedly or not at all in the course of a dialogue. This constellation results in a free but structured dialogue.
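The node-by-node execution described above can be sketched as follows, with "say" function nodes and "if" logic nodes as named earlier in the description; the dictionary-based node representation and the `run_flow` function are assumptions for illustration.

```python
def run_flow(nodes, context, outputs=None):
    """Sketch of flow execution: nodes run in sequence from the start node;
    logic nodes select a branch, so only relevant nodes are traversed."""
    outputs = [] if outputs is None else outputs
    for node in nodes:
        if node["type"] == "say":
            # Function node: output a statement, filled from the context.
            outputs.append(node["text"].format(**context))
        elif node["type"] == "if":
            # Logic node: only the branch matching the context is executed.
            branch = node["then"] if node["cond"](context) else node.get("else", [])
            run_flow(branch, context, outputs)
    return outputs
```

A small flow with a greeting and one branch illustrates how identical structure yields different dialogues depending on the context.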
- it is particularly advantageous to use a markup language within the flow 8 that allows the meta-information on user input, the context and the status to be called or changed directly within the decision tree. Access is possible, e.g., via different nodes or via free programming.
- This text input 1 starts the system.
- in the Synonym Mapper 2, the term "snack" is searched for synonyms or generic terms; for example, the term "pizza" is found for it and loaded from the stored dictionary. There, a keyword with a unique identifier, which can be used later in the flow, and a freely definable number of associated synonyms are created.
- the NLP step 3 examines the input word by word and the relationships between the words. In this case, it recognizes two half-sentences, the second of which is a question that addresses a location. The subject of the first half-sentence is the user himself; the object is a food item. A number was found: the word "one". The verb could indicate a relational state that contains an intention, but the question sentence could also be an intention.
- in the "Keyphrase Mapping" step 4, it is first checked whether something is negated; this is not the case here.
- the categorization is refined: the items are a food and a location.
- intents matching the food and location information are loaded and pairs are formed. For example, there are sandwiches, pizza and chips with corresponding example sentences A. Only the example sentences for the first half-sentence are loaded, because the subject follows from it and is to be clarified first. For each of these example sentences A, a scoring value W is assigned. These scoring values can be roughly proportional to the typical sales of the local providers.
- in the step "Checking the Intent Score" 6, it is checked whether the highest scoring value of the example sentences exceeds the value CT. This is not the case here. Then it is checked whether this highest value is at least above RT; this is the case. The system then checks in step 13 whether there is already a confirmation from a previous input by this user for such an input. If so, no further confirmation is sought; the user input is regarded as confirmed and the system proceeds to the flow 8 as the next step.
- otherwise, the program initiates inquiry 11, associated with the example sentence with the highest scoring value W. In the present example this is the question "Would you like a pizza?". The program expects a yes/no answer. If the answer is "yes", the program checks, for the second half-sentence, whether there is an example sentence with a value of 1 for it; in this example this is assumed, so no further inquiry is necessary. The program then also continues to the flow, but first saves the sentence in its learning database LDB, which is part of the user database 101, so that it can recognize the question next time.
- LDB learning database
- The program also checks whether it already knows the question about the snack from other users' queries. If this were the case, it would increment the corresponding counter in the learning database by 1 across all users; in the present example, however, the sentence must first be created in the learning database. For the same reason it is also checked whether the threshold value LT for transfer to the database of reference or example sentences for intents has been exceeded.
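The learning-database behaviour just described can be sketched as follows. This is an illustrative sketch under assumptions: the function and field names, the sample sentence and the value of LT are invented for the example; the create-or-increment logic and the promotion once LT is exceeded follow the description above.

```python
# Hypothetical sketch of the learning database LDB: a confirmed sentence is
# either created with counter 1 or its cross-user counter is incremented;
# once the counter exceeds the threshold LT, the sentence is transferred to
# the database of reference/example sentences for intents.

def update_ldb(ldb, example_db, sentence, intent, lt):
    entry = ldb.get(sentence)
    if entry is None:
        # first occurrence: create the record in the learning database
        ldb[sentence] = {"intent": intent, "count": 1}
    else:
        # already known from other users' queries: increment for all users
        entry["count"] += 1
        if entry["count"] > lt and sentence not in example_db:
            # threshold LT exceeded: promote to the example-sentence database
            example_db[sentence] = intent

ldb, example_db = {}, {}
update_ldb(ldb, example_db, "where can I get one?", "order_snack", lt=2)
print(ldb["where can I get one?"]["count"])  # newly created entry
```

Keeping the counter per sentence rather than per user is what lets frequent phrasings graduate into shared example sentences.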
- The program then proceeds to flow 8, which generates a response indicating where to get a pizza, possibly in the form of an enumeration if it knows several suitable places. It may also ask for the desired toppings and immediately place a complete order, which the system can then forward to a digital ordering system.
- Depending on its configuration, flow 8 can also access further application-specific external data 103 or programming interfaces (APIs) 104.
- A second user says "I also want a snack but not pizza".
- The first steps proceed analogously, but there is no second half-sentence with a question about a location, and there is a negation, which is identified in Keyphrase Mapping 4 if it has not already been identified in the preceding step NLP 3. Since the negation refers to pizza, the example sentence with the pizza receives the scoring value 0.
- In the following step "Intent Mapping" 5 the user is therefore only asked "Would you like a sandwich?". If the user affirms this, the procedure continues analogously to the previous case. Only in flow 8 does the program determine that no location was asked about; it then asks further questions on this point and may also offer the user a corresponding direct order.
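The negation handling in this second example can be sketched as follows. The item names and score values are illustrative assumptions; the zeroing of a negated candidate's score and the resulting selection of the remaining best candidate follow the description above.

```python
# Hypothetical sketch of negation handling in Keyphrase Mapping 4:
# an example sentence whose key item is negated in the user input gets
# scoring value 0, so Intent Mapping 5 can only propose the remaining
# candidates (here: "Would you like a sandwich?" instead of pizza).

def apply_negations(scores, negated_items):
    """Zero out the score of every example sentence whose item is negated."""
    return {item: (0.0 if item in negated_items else w)
            for item, w in scores.items()}

scores = {"pizza": 0.6, "sandwich": 0.4, "chips": 0.2}
filtered = apply_negations(scores, negated_items={"pizza"})
best = max(filtered, key=filtered.get)
print(best)  # only the non-negated candidates remain in play
```

Because the scores of non-negated items are untouched, the rest of the pipeline (threshold check, inquiry, flow 8) runs unchanged on the filtered candidate set.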
- The learning database then contains two different sentences for the question about a snack.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Machine Translation (AREA)
Abstract
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/DE2017/100587 WO2019011356A1 (fr) | 2017-07-14 | 2017-07-14 | Procédé de conduite de dialogue homme-ordinateur |
Publications (1)
Publication Number | Publication Date |
---|---|
EP3652664A1 (fr) | 2020-05-20 |
Family
ID=60021849
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP17780297.2A Pending EP3652664A1 (fr) | 2017-07-14 | 2017-07-14 | Procédé de conduite de dialogue homme-ordinateur |
Country Status (3)
Country | Link |
---|---|
US (1) | US11315560B2 (fr) |
EP (1) | EP3652664A1 (fr) |
WO (1) | WO2019011356A1 (fr) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115238060A (zh) * | 2022-09-20 | 2022-10-25 | 支付宝(杭州)信息技术有限公司 | 人机交互方法及装置、介质、计算设备 |
Families Citing this family (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200387670A1 (en) * | 2018-01-05 | 2020-12-10 | Kyushu Institute Of Technology | Labeling device, labeling method, and program |
EP3811245A4 (fr) | 2018-06-19 | 2022-03-09 | Ellipsis Health, Inc. | Systèmes et procédés d'évaluation de santé mentale |
US20190385711A1 (en) | 2018-06-19 | 2019-12-19 | Ellipsis Health, Inc. | Systems and methods for mental health assessment |
US11409961B2 (en) * | 2018-10-10 | 2022-08-09 | Verint Americas Inc. | System for minimizing repetition in intelligent virtual assistant conversations |
US11361163B2 (en) * | 2018-12-17 | 2022-06-14 | MuyVentive, LLC | System and method to represent conversational flows as graph embeddings and to conduct classification and clustering based on such embeddings |
US11797768B2 (en) | 2018-12-17 | 2023-10-24 | MuyVentive, LLC | System and method to represent conversational flows as graph embeddings and to conduct classification and clustering based on such embeddings |
WO2020153028A1 (fr) * | 2019-01-22 | 2020-07-30 | ソニー株式会社 | Dispositif de traitement d'informations, procédé de traitement d'informations et programme |
CN109840276A (zh) * | 2019-02-12 | 2019-06-04 | 北京健康有益科技有限公司 | 基于文本意图识别的智能对话方法、装置和存储介质 |
US11288459B2 (en) * | 2019-08-01 | 2022-03-29 | International Business Machines Corporation | Adapting conversation flow based on cognitive interaction |
US11705114B1 (en) | 2019-08-08 | 2023-07-18 | State Farm Mutual Automobile Insurance Company | Systems and methods for parsing multiple intents in natural language speech |
US11126793B2 (en) * | 2019-10-04 | 2021-09-21 | Omilia Natural Language Solutions Ltd. | Unsupervised induction of user intents from conversational customer service corpora |
CN110795532A (zh) * | 2019-10-18 | 2020-02-14 | 珠海格力电器股份有限公司 | 一种语音信息的处理方法、装置、智能终端以及存储介质 |
CN110689891A (zh) * | 2019-11-20 | 2020-01-14 | 广东奥园奥买家电子商务有限公司 | 一种基于公众显示装置的语音交互方法以及设备 |
JP7405660B2 (ja) * | 2020-03-19 | 2023-12-26 | Lineヤフー株式会社 | 出力装置、出力方法及び出力プログラム |
CN111405128B (zh) * | 2020-03-24 | 2022-02-18 | 中国—东盟信息港股份有限公司 | 一种基于语音转文字的通话质检系统 |
US20210319189A1 (en) * | 2020-04-08 | 2021-10-14 | Rajiv Trehan | Multilingual concierge systems and method thereof |
CN111666372B (zh) * | 2020-04-29 | 2023-08-18 | 百度在线网络技术(北京)有限公司 | 解析查询词query的方法、装置、电子设备和可读存储介质 |
FR3111210B1 (fr) * | 2020-06-04 | 2022-07-08 | Thales Sa | Communication bidirectionnelle homme-machine |
CA3125124A1 (fr) * | 2020-07-24 | 2022-01-24 | Comcast Cable Communications, Llc | Systemes et methodes d'entrainement de modeles de commandes vocales |
CN112102840B (zh) * | 2020-09-09 | 2024-05-03 | 中移(杭州)信息技术有限公司 | 语义识别方法、装置、终端及存储介质 |
CN112216278A (zh) * | 2020-09-25 | 2021-01-12 | 威盛电子股份有限公司 | 语音识别系统、指令产生系统及其语音识别方法 |
CN112115249B (zh) * | 2020-09-27 | 2023-11-14 | 支付宝(杭州)信息技术有限公司 | 用户意图的统计分析及结果展示方法和装置 |
CN112487142B (zh) * | 2020-11-27 | 2022-08-09 | 易联众信息技术股份有限公司 | 一种基于自然语言处理的对话式智能交互方法和系统 |
US20230008868A1 (en) * | 2021-07-08 | 2023-01-12 | Nippon Telegraph And Telephone Corporation | User authentication device, user authentication method, and user authentication computer program |
CN113806484B (zh) * | 2021-09-18 | 2022-08-05 | 橙色云互联网设计有限公司 | 关于用户需求信息的交互方法、装置以及存储介质 |
CN114244795B (zh) * | 2021-12-16 | 2024-02-09 | 北京百度网讯科技有限公司 | 一种信息的推送方法、装置、设备及介质 |
CN114676689A (zh) * | 2022-03-09 | 2022-06-28 | 青岛海尔科技有限公司 | 语句文本的识别方法和装置、存储介质及电子装置 |
TWI832562B (zh) * | 2022-11-16 | 2024-02-11 | 英業達股份有限公司 | 同義詞搜尋系統及方法 |
DE102023201174A1 (de) | 2023-02-13 | 2024-08-14 | Siemens Aktiengesellschaft | Verfahren und System zur interaktiven elektronischen Kommunikation |
CN118013390B (zh) * | 2024-04-10 | 2024-06-21 | 北京铁力山科技股份有限公司 | 一种基于大数据分析的智慧工作台控制方法及系统 |
Family Cites Families (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6766320B1 (en) * | 2000-08-24 | 2004-07-20 | Microsoft Corporation | Search engine with natural language-based robust parsing for user query and relevance feedback learning |
US6964023B2 (en) | 2001-02-05 | 2005-11-08 | International Business Machines Corporation | System and method for multi-modal focus detection, referential ambiguity resolution and mood classification using multi-modal input |
WO2002073331A2 (fr) | 2001-02-20 | 2002-09-19 | Semantic Edge Gmbh | Environnement interactif en langage naturel, dependant du contexte et a base de connaissances pour applications dynamiques et flexibles de recherche et de presentation de produits, services et informations |
US7167832B2 (en) | 2001-10-15 | 2007-01-23 | At&T Corp. | Method for dialog management |
ATE363120T1 (de) | 2003-11-10 | 2007-06-15 | Koninkl Philips Electronics Nv | Audio-dialogsystem und sprachgesteuertes browsing-verfahren |
US7949529B2 (en) | 2005-08-29 | 2011-05-24 | Voicebox Technologies, Inc. | Mobile systems and methods of supporting natural language human-machine interactions |
US9318108B2 (en) * | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant |
CN101669090A (zh) * | 2007-04-26 | 2010-03-10 | 福特全球技术公司 | 情绪提示系统和方法 |
US8578330B2 (en) * | 2007-06-11 | 2013-11-05 | Sap Ag | Enhanced widget composition platform |
US8165886B1 (en) | 2007-10-04 | 2012-04-24 | Great Northern Research LLC | Speech interface system and method for control and interaction with applications on a computing system |
US8219407B1 (en) | 2007-12-27 | 2012-07-10 | Great Northern Research, LLC | Method for processing the output of a speech recognizer |
US20110125734A1 (en) * | 2009-11-23 | 2011-05-26 | International Business Machines Corporation | Questions and answers generation |
US10276170B2 (en) * | 2010-01-18 | 2019-04-30 | Apple Inc. | Intelligent automated assistant |
DE112011100329T5 (de) * | 2010-01-25 | 2012-10-31 | Andrew Peter Nelson Jerram | Vorrichtungen, Verfahren und Systeme für eine Digitalkonversationsmanagementplattform |
US9262612B2 (en) * | 2011-03-21 | 2016-02-16 | Apple Inc. | Device access using voice authentication |
US9105268B2 (en) | 2012-09-19 | 2015-08-11 | 24/7 Customer, Inc. | Method and apparatus for predicting intent in IVR using natural language queries |
US9235978B1 (en) * | 2013-01-16 | 2016-01-12 | Domo, Inc. | Automated suggested alerts based on natural language and user profile analysis |
US9633317B2 (en) * | 2013-06-20 | 2017-04-25 | Viv Labs, Inc. | Dynamically evolving cognitive architecture system based on a natural language intent interpreter |
US9594542B2 (en) * | 2013-06-20 | 2017-03-14 | Viv Labs, Inc. | Dynamically evolving cognitive architecture system based on training by third-party developers |
US10474961B2 (en) * | 2013-06-20 | 2019-11-12 | Viv Labs, Inc. | Dynamically evolving cognitive architecture system based on prompting for additional user input |
US10741182B2 (en) * | 2014-02-18 | 2020-08-11 | Lenovo (Singapore) Pte. Ltd. | Voice input correction using non-audio based input |
US10579396B2 (en) * | 2014-04-09 | 2020-03-03 | Nice-Systems Ltd. | System and automated method for configuring a predictive model and deploying it on a target platform |
US10726831B2 (en) * | 2014-05-20 | 2020-07-28 | Amazon Technologies, Inc. | Context interpretation in natural language processing using previous dialog acts |
US10170123B2 (en) * | 2014-05-30 | 2019-01-01 | Apple Inc. | Intelligent assistant for home automation |
US9338493B2 (en) * | 2014-06-30 | 2016-05-10 | Apple Inc. | Intelligent automated assistant for TV user interactions |
CN107409126B (zh) * | 2015-02-24 | 2021-03-09 | 思科技术公司 | 用于保护企业计算环境安全的系统和方法 |
CN106844400A (zh) | 2015-12-07 | 2017-06-13 | 南京中兴新软件有限责任公司 | 智能应答方法及装置 |
US10733982B2 (en) * | 2018-01-08 | 2020-08-04 | Apple Inc. | Multi-directional dialog |
US10741176B2 (en) * | 2018-01-31 | 2020-08-11 | International Business Machines Corporation | Customizing responses to users in automated dialogue systems |
2017
- 2017-07-14 US US16/630,912 patent/US11315560B2/en active Active
- 2017-07-14 EP EP17780297.2A patent/EP3652664A1/fr active Pending
- 2017-07-14 WO PCT/DE2017/100587 patent/WO2019011356A1/fr unknown
Also Published As
Publication number | Publication date |
---|---|
US11315560B2 (en) | 2022-04-26 |
WO2019011356A1 (fr) | 2019-01-17 |
US20200234700A1 (en) | 2020-07-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP3652664A1 (fr) | Procédé de conduite de dialogue homme-ordinateur | |
DE69814114T2 (de) | Natürliche sprache verstehendes verfahren und verstehende vorrichung zur sprachsteuerung einer anwendung | |
JP7098875B2 (ja) | 会議支援システム、会議支援装置、会議支援方法及びプログラム | |
DE60130880T2 (de) | Web-gestützte spracherkennung durch scripting und semantische objekte | |
DE112016004863T5 (de) | Parametersammlung und automatische Dialogerzeugung in Dialogsystemen | |
DE102017122358A1 (de) | Bedingte Bereitstellung von Zugriff durch interaktive Assistentenmodul | |
CN110175229B (zh) | 一种基于自然语言进行在线培训的方法和系统 | |
DE112016003626T5 (de) | Natürlichsprachliche Schnittstelle zu Datenbanken | |
WO2015113578A1 (fr) | Procédé automatique de reconnaissance sémantique et de mesure de l'univocité de texte | |
DE102013003055A1 (de) | Verfahren und Vorrichtung zum Durchführen von Suchen in natürlicher Sprache | |
EP1950672A1 (fr) | Procédé et système de traitement des données destinés aux demandes commandées d'informations stockées de manière structurée | |
AT6920U1 (de) | Verfahren zur erzeugung natürlicher sprache in computer-dialogsystemen | |
EP1599866B1 (fr) | Systeme et procede de traitement linguistique | |
CN112579757A (zh) | 智能问答方法、装置、计算机可读存储介质及电子设备 | |
DE60214850T2 (de) | Für eine benutzergruppe spezifisches musterverarbeitungssystem | |
DE102013101871A1 (de) | Wortwahlbasierte Sprachanalyse und Sprachanalyseeinrichtung | |
DE102019218918A1 (de) | Dialogsystem, elektronisches gerät und verfahren zur steuerung des dialogsystems | |
DE202017105979U1 (de) | Systeme und Computerprogrammprodukte zur Handhabung von Formalität in Übersetzungen von Text | |
DE19849855C1 (de) | Verfahren zur automatischen Generierung einer textlichen Äußerung aus einer Bedeutungsrepräsentation durch ein Computersystem | |
DE102022000046A1 (de) | System zur erweiterbaren Such-, Content- und Dialogverwaltung mit zwischengeschalteter durch Menschen erfolgender Kuratierung | |
EP3576084B1 (fr) | Conception du dialogue efficace | |
WO2020126217A1 (fr) | Procédé, dispositif et utilisation pour générer une sortie de réponse en réaction à une information d'entrée vocale | |
DE102016125162B4 (de) | Verfahren und Vorrichtung zum maschinellen Verarbeiten von Texten | |
DE19930407A1 (de) | Verfahren zur sprachbasierten Navigation in einem Kommunikationsnetzwerk und zur Implementierung einer Spracheingabemöglichkeit in private Informationseinheiten | |
EP3291090B1 (fr) | Procédé et système de formation d'une interface numérique entre appareil terminal et logique d'application via apprentissage profond et informatique en nuage |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: UNKNOWN |
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
| PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
| 17P | Request for examination filed | Effective date: 20200214 |
| AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
| AX | Request for extension of the european patent | Extension state: BA ME |
| DAV | Request for validation of the european patent (deleted) | |
| DAX | Request for extension of the european patent (deleted) | |
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: EXAMINATION IS IN PROGRESS |
| 17Q | First examination report despatched | Effective date: 20221209 |