CN110059164B - Method and system for presenting a user interface of a dialog system - Google Patents


Info

Publication number
CN110059164B
CN110059164B (application CN201910022976.1A)
Authority
CN
China
Prior art keywords
user
graph
dialog
semantic
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910022976.1A
Other languages
Chinese (zh)
Other versions
CN110059164A (en)
Inventor
R. Anand
A. Arora
R. Bakis
Song Feng
J. Ganhotra
C. Gunasekara
D. Nahamoo
L. Polymenakos
S. D. Sashihara
Li Zhu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US15/868,987 external-priority patent/US10845937B2/en
Priority claimed from US15/869,000 external-priority patent/US20190213284A1/en
Application filed by International Business Machines Corp filed Critical International Business Machines Corp
Publication of CN110059164A publication Critical patent/CN110059164A/en
Application granted granted Critical
Publication of CN110059164B publication Critical patent/CN110059164B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3325Reformulation based on results of preceding query
    • G06F16/3326Reformulation based on results of preceding query using relevance feedback from the user, e.g. relevance feedback on documents, documents sets, document terms or passages
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/338Presentation of query results
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri

Abstract

The invention relates to semantic representation and implementation for dialog systems. A method, apparatus, and computer program product for presenting a user interface of a dialog system are described. A unified semantic representation of dialog content between a user and a dialog system is created as a context graph of concepts and relationships. A set of sub-graph components of the semantic context graph is dynamically identified based on current dialog activity. The identified set of sub-graph components is presented in the user interface as a set of graphical elements representing the respective concepts and relationships.

Description

Method and system for presenting a user interface of a dialog system
Technical Field
The present disclosure relates generally to natural language processing. More particularly, the present disclosure relates to natural language processing for dialog systems.
Background
It is becoming commonplace for users to encounter applications such as virtual agents and chatbots that provide natural language interfaces to Web content, applications, and channels. Typically, these applications employ dialog systems that interact with end users using natural language based dialog prompts to accomplish goal-oriented tasks, such as online transactions. While these applications offer tremendous potential value, they are limited in the types of information and help they can provide, because the applications do not understand natural language adequately and it is difficult to generate interfaces for every potential user need. Thus, these systems typically limit dialog prompts to direct and static responses to user requests, without providing an appropriate context or explanation as to why a system response was generated. Chatbots often lack the ability to process specific items within end user feedback unless those items were anticipated by the system designer.
Existing dialog systems, which rely solely on natural language based, result-oriented dialog prompts, often do not adequately inform end users. Unless carefully designed, such prompts can lead to unexpected ambiguity and undetected misunderstanding on the part of both the system and the end user during a conversation.
Furthermore, the frustration experienced while using such a system can cause the end user to lose the willingness to use the dialog system further, which leaves the system with no opportunity to obtain valuable user input that could be used to improve it.
There is a need for further improvements in computer-aided dialog systems.
Disclosure of Invention
In accordance with the present disclosure, a method, apparatus, and computer program product are provided for presenting a user interface of a dialog system. A unified semantic representation of dialog content between a user and a dialog system is created as a context graph of concepts and relationships. A set of sub-graph components of the semantic context graph is dynamically identified based on current dialog activity. The identified set of sub-graph components is presented in the user interface as a set of graphical elements representing the respective concepts and relationships.
Some of the more relevant features of the disclosed subject matter are summarized above. These characteristics should be interpreted as illustrative only. Many other advantageous results can be obtained by applying the disclosed subject matter in different ways or by modifying the invention to be described.
Drawings
For a more complete understanding of the present invention, and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:
FIG. 1 depicts an exemplary block diagram of a distributed data processing environment in which exemplary aspects of the illustrative embodiments may be implemented;
FIG. 2 is an exemplary block diagram of a data processing system in which exemplary aspects of the illustrative embodiments may be implemented;
FIG. 3 illustrates an architecture of a framework for implementing embodiments of the present invention;
FIG. 4 illustrates how a system converts sequential user utterances made during a dialog with the system into a more compact semantic meaning graph according to one embodiment of the present invention;
FIG. 5 is a diagram of a simplified unified semantic graph using an example domain ontology according to one embodiment of the invention;
FIG. 6 illustrates a system dynamically highlighting a portion of a context graph to elicit user feedback according to an embodiment of the present invention;
FIG. 7 illustrates how a Surface Semantic Representation (SSR) is presented in a conversational interface according to one embodiment of the invention;
FIG. 8 is a diagram illustrating several example user feedback inputs for one embodiment of the invention;
FIG. 9 is a diagram illustrating one embodiment in which a chat agent asks the right question; and
Fig. 10 is a diagram of a neural network framework in which modular connections for dialog management are used in an embodiment of the invention.
Detailed Description
At a high level, the preferred embodiments of the present invention provide a system, method, and computer program product for a dialog system that gives the end user the basic information the system uses to complete a task. By providing this information, the interface can explain why the dialog system responded the way it did, and can strategically elicit end-user feedback that can be used to improve the usability of the dialog system. In embodiments of the present invention, natural language based dialog prompts are enhanced by a framework that dynamically generates more informative dialog prompts for end users based on semantic context, domain knowledge, and dialog activity.
To the inventors' knowledge, the present invention is the first attempt to systematically generate semantic representations of dialog activities based on domain knowledge and to graphically present the generated semantic representations at the user interface level to elicit user feedback. In an embodiment of the invention, the generated semantic representations correspond to respective dialog activities. The interface obtains user input on the implicit dialog and low-level annotations for machine learning purposes. Because the semantic representation is dynamically generated, derived from multiple sources, and optimized from the end user's perspective, embodiments of the present invention represent an important improvement over prior work on semantic content integration. The dynamic, multi-contributor nature of the conversations to be semantically represented in a user interface makes this a difficult semantic integration problem.
To address the limitations of natural language based interactions and to improve the usability of dialog systems, embodiments of the present invention provide a unified framework to generate semantic graph representations of dialogs for goal-oriented tasks. Further, the system dynamically identifies sub-graphs within the representation for presentation in the user interface based on dialog activity and domain logic, as requested, possible, and necessary. Specifically, embodiments of the present invention convey how the system interprets the user's input, how the system processes the backend information, and how the system provides a simple interpretation of domain logic and query results.
In contrast to conventional natural language based interfaces, embodiments of the present invention exploit the expressive power of a graph-based model by: (1) normalizing the text content to generate a semantic meaning representation; (2) integrating the domain-interpretable entities and relationships using semantic matching techniques to generate a semantic context graph; (3) dynamically identifying sub-graphs of the semantic context graph for the identified dialog operations; and (4) presenting a graphical representation of the selected content, for example as a set of graphical elements, as part of a dialog prompt of the dialog system.
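The four-step process above can be sketched in miniature. The following Python sketch is purely illustrative: every name in it (`SemanticGraph`, `normalize_utterance`, and so on) is invented for this example, and the toy token-chain parser stands in for the real semantic analyzers and matching techniques described herein.

```python
# Hypothetical sketch of the four-stage pipeline; all names are illustrative.
from dataclasses import dataclass, field

@dataclass
class SemanticGraph:
    """A graph of concepts (nodes) and relationships (edge triples)."""
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # (head, relation, tail)

    def merge(self, other: "SemanticGraph") -> None:
        self.nodes |= other.nodes
        self.edges |= other.edges

def normalize_utterance(text: str) -> SemanticGraph:
    """Stage 1: normalize text into a meaning representation.
    A toy parser that simply chains adjacent content words."""
    words = [w.lower().strip(".,?") for w in text.split()]
    g = SemanticGraph(nodes=set(words))
    g.edges = {(a, "next", b) for a, b in zip(words, words[1:])}
    return g

def integrate_with_domain(mr: SemanticGraph, domain: SemanticGraph) -> SemanticGraph:
    """Stage 2: merge the meaning representation into the domain seed graph."""
    context = SemanticGraph(set(domain.nodes), set(domain.edges))
    context.merge(mr)
    return context

def select_subgraph(context: SemanticGraph, focus: set) -> SemanticGraph:
    """Stage 3: keep only edges touching concepts active in the current turn."""
    edges = {(h, r, t) for h, r, t in context.edges if h in focus or t in focus}
    nodes = {n for h, _, t in edges for n in (h, t)}
    return SemanticGraph(nodes, edges)

def render(sub: SemanticGraph) -> list:
    """Stage 4: turn the sub-graph into user-interface element descriptors."""
    return [{"type": "edge", "from": h, "label": r, "to": t}
            for h, r, t in sorted(sub.edges)]
```

A real implementation would replace the toy parser with the semantic analyzers of FIG. 3 and the focus set with concepts identified from the current dialog activity.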
The process of enhancing dialog prompts using surface semantic representations (SSRs) aims to effectively aid the transfer of information and knowledge between the system and the end user, and to enable the end user to provide various levels of feedback. The SSR of the current conversation has practical uses in several embodiments of the invention, for example for (a) experienced end users of mobile interfaces on web sites or chatbot services; (b) crowdsourcing workers performing dialog annotation tasks; (c) subject matter experts transferring knowledge to the system; and (d) teaching tools based on domain knowledge. An important aspect of the framework is eliciting feedback input from the end user. In embodiments of the present invention, feedback is used for annotation purposes and is received through the interactive nature of the enhanced dialog prompts. With simple post-processing, the obtained feedback data is applied to further learning, improving future conversations conducted through the dialog system.
With reference now to the figures and in particular with reference to FIGS. 1-2, exemplary diagrams of data processing environments are provided in which illustrative embodiments of the present disclosure may be implemented. FIGS. 1-2 are only exemplary and are not intended to assert or imply any limitation with regard to the environments in which aspects or embodiments of the disclosed subject matter may be implemented. Many modifications to the depicted environments may be made without departing from the spirit and scope of the present invention.
With reference now to the figures, FIG. 1 depicts a pictorial representation of an exemplary distributed data processing environment in which aspects of the illustrative embodiments may be implemented. Distributed data processing system 100 may include a network of computers in which aspects of the illustrative embodiments may be implemented. Distributed data processing system 100 contains at least one network 102, network 102 being the medium used to provide communications links between various devices and computers connected together within distributed data processing system 100. Network 102 may include connections, such as wire, wireless communication links, or fiber optic cables.
In the depicted example, server 104 and server 106 connect to network 102 along with network storage unit 108. In addition, clients 110, 112, and 114 are also connected to network 102. These clients 110, 112, and 114 may be, for example, smart phones, tablet computers, personal computers, network computers, and the like. In the depicted example, server 104 provides data, such as boot files, operating system images, and applications to clients 110, 112, and 114. In the depicted example, clients 110, 112, and 114 are clients to server 104. Distributed data processing system 100 may include additional servers, clients, and other devices not shown. One or more of the server computers may be mainframe computers connected to the network 102. The mainframe computer may be, for example, an IBM System z mainframe running an IBM z/OS operating System. A mass storage unit and workstation (not shown) may be connected to the mainframe. The workstation may be a personal computer directly connected to the mainframe communicating via a bus or a console terminal directly connected to the mainframe via a display port.
In the depicted example, distributed data processing system 100 is the Internet with network 102 representing a worldwide collection of networks and gateways that use the Transmission control protocol/Internet protocol (TCP/IP) suite of protocols to communicate with one another. At the heart of the Internet is a backbone of high-speed data communication lines between major nodes or host computers, consisting of thousands of commercial, government, educational and other computer systems that route data and messages. Of course, distributed data processing system 100 also may be implemented as a number of different types of networks, such as for example, an intranet, a Local Area Network (LAN), a Wide Area Network (WAN), etc. As noted above, FIG. 1 is intended as an example, not as an architectural limitation for different embodiments of the disclosed subject matter, and thus, the particular elements shown in FIG. 1 should not be considered limiting with respect to the environments in which the illustrative embodiments of the present invention may be implemented.
With reference now to FIG. 2, a block diagram of a data processing system is shown in which illustrative embodiments may be implemented. Data processing system 200 is an example of a computer, such as server 104 or client 110 in FIG. 1, in which computer usable program code or instructions implementing the processes for the illustrative embodiments may be located. In this illustrative example, data processing system 200 includes communications fabric 202, which provides communications between processor unit 204, memory 206, persistent storage 208, communications unit 210, input/output (I/O) unit 212, and display 214.
The processor unit 204 is used to execute instructions of software that may be loaded into the memory 206. Processor unit 204 may be a set of one or more processors or may be a multi-processor core, depending on the particular implementation. Further, processor unit 204 may be implemented using one or more heterogeneous processor systems in which a main processor and a secondary processor are present on a single chip. As another illustrative example, processor unit 204 may be a Symmetric Multiprocessor (SMP) system containing multiple processors of the same type.
Memory 206 and persistent storage 208 are examples of storage devices. A storage device is any hardware capable of temporarily and/or permanently storing information. In these examples, memory 206 may be, for example, random access memory or any other suitable volatile or non-volatile storage device. Persistent storage 208 may take various forms, depending on the particular implementation. For example, persistent storage 208 may contain one or more components or devices. For example, persistent storage 208 may be a hard drive, a flash memory, a rewritable optical disk, a rewritable magnetic tape, or some combination of the above. The media used by persistent storage 208 also may be removable. For example, a removable hard drive may be used for persistent storage 208.
In these examples, communication unit 210 provides for communication with other data processing systems or devices. In these examples, communication unit 210 is a network interface card. The communication unit 210 may provide communication using one or both of physical and wireless communication links.
Input/output unit 212 allows for the input and output of data using other devices that may be connected to data processing system 200. For example, input/output unit 212 may provide a connection for user input through a keyboard and mouse. Further, input/output unit 212 may send output to a printer. Furthermore, the input/output unit may provide a connection to a microphone for audio input from a user and to a speaker for audio output from the computer. Display 214 provides a mechanism to display information to a user.
Instructions for the operating system and applications or programs may be located on persistent storage 208. These instructions may be loaded into memory 206 for execution by processor unit 204. Processor unit 204 may perform the processes of the different embodiments using computer-implemented instructions, which may be located in a memory (e.g., memory 206). These instructions are referred to as program code, computer usable program code, or computer readable program code that may be read and executed by a processor in processor unit 204. In different embodiments, the program code may be embodied on different physical or tangible computer readable media (e.g., memory 206 or persistent storage 208).
Program code 216 is located in a functional form on computer readable media 218 that is selectively removable and may be loaded onto or transferred to data processing system 200 for execution by processor unit 204. In these examples, program code 216 and computer readable media 218 form computer program product 220. In one example, computer readable media 218 may be in a tangible form, such as an optical or magnetic disk that is inserted or placed into a drive or other device that is part of persistent storage 208 for transfer onto a storage device, such as a hard drive, that is part of persistent storage 208. In a tangible form, computer readable media 218 may also take the form of a persistent storage device, such as a hard drive, a thumb drive, or a flash memory that is connected to data processing system 200. The tangible form of computer readable media 218 is also referred to as computer recordable storage media. In some instances, computer recordable media 218 may not be removable.
Alternatively, program code 216 may be transferred from computer readable media 218 to data processing system 200 through a communications link to communications unit 210 and/or through a connection to input/output unit 212. In the illustrative examples, the communication links and/or connections may be physical or wireless. The computer readable medium may also take the form of non-tangible media, such as communications links or wireless transmissions containing the program code. The different components illustrated for data processing system 200 are not meant to provide architectural limitations to the manner in which different embodiments may be implemented. The different illustrative embodiments may be implemented in a data processing system that includes other components in addition to or in place of those described with respect to data processing system 200. Other components shown in fig. 2 may differ from the illustrative example shown. As one example, a storage device in data processing system 200 is any hardware apparatus that may store data. Memory 206, persistent storage 208, and computer-readable media 218 are examples of storage devices in a tangible form.
In another example, a bus system may be used to implement communications fabric 202 and may be comprised of one or more buses, such as a system bus or an input/output bus. Of course, the bus system may be implemented using any suitable type of architecture that provides for a transfer of data between different components or devices attached to the bus system. Additionally, a communications unit may include one or more devices used to transmit and receive data, such as a modem or a network adapter. Further, a memory may be, for example, memory 206 or a cache, such as that found in an interface and memory controller hub, which may be present in communications fabric 202.
Computer program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object oriented programming languages such as Java™, Smalltalk, C++, C#, Objective-C, and the like, as well as conventional procedural programming languages such as Python or C. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
Those of ordinary skill in the art will appreciate that the hardware in FIGS. 1-2 may vary depending on the implementation. Other internal hardware or peripheral devices, such as flash memory, equivalent nonvolatile memory, or optical disk drives and the like, may be used in addition to or in place of the hardware depicted in figures 1-2. Furthermore, the processes of the illustrative embodiments may be applied to a multiprocessor data processing system, other than the SMP system mentioned previously, without departing from the spirit and scope of the disclosed subject matter.
The techniques described herein may operate jointly within a standard client-server paradigm, such as that shown in fig. 1, in which a client communicates with an internet-accessible Web-based portal (which executes on a collection containing one or more machines). The end user operates an internet-connectable device (e.g., a desktop computer, a notebook computer, an internet-enabled mobile device, etc.) capable of accessing and interacting with the portal. Typically, each client or server machine is a data processing system (including hardware and software) such as that shown in FIG. 2, and the entities communicate with each other over a network (e.g., the Internet, intranet, extranet, private network) or any other communication medium or link. A data processing system typically includes one or more processors, an operating system, one or more applications, and one or more utilities.
During a conversation between the system and the user, embodiments of the present invention use the framework to generate a unified model of the conversational content and to dynamically select relevant content from the model for presentation in the user interface. Specifically, embodiments determine how the system semantically interprets the user utterance, processes the request at the backend application, and requests user feedback. In various embodiments, user feedback is requested as and when possible, according to user preferences, and when necessary. Because the user interface better informs the end user, enabling various levels of feedback, more data can be annotated by the end user and collected over time to improve the dialog system.
FIG. 3 shows the architecture of a framework for implementing an embodiment of the invention. In general, a framework for implementing the present invention should provide four core functions: (1) semantic interpretation of the user utterance; (2) semantic integration; (3) dynamic content selection; and (4) surface semantic representation. The first two components build a unified model of semantic content by integrating the semantic parsing results of text input (e.g., user utterances) to the dialog system with the domain concepts embodied in seed graphs of the domain ontology. After creating a unified semantic representation of dialog content as a context graph of concepts and relationships, the system dynamically identifies sub-graphs of the semantic context graph that are to be presented to the end user based on current dialog activity. The system then presents the content of the sub-graph at the user interface level using corresponding graphical elements. As will be discussed in more detail below, a "surface semantic representation" (SSR) of the dialog with the user is presented in a preferred embodiment of the invention. User feedback received from the interface is then processed, stored, and used to improve the dialog system.
Turning now to FIG. 3, a user interacts with a user interface 301. The user interface captures both the annotated feedback 302 and the normal dialog between the user and the system (e.g., user utterance 303). In a preferred embodiment, the system's dialog agent responses, i.e., the system's side of the dialog, are also captured. The natural language utterances are evaluated by a set of semantic analyzers 305, which produce formal interpretations of the meanings of the natural language utterances that can be used by the rest of the dialog system. Because the user provides annotated feedback based on graphical constructs in the user interface, the annotated feedback may be input directly to the graph integrator 307, which is responsible for creating context graphs for the latest "turn" in the user/system dialog from inputs from multiple system components and information sources.
The semantic interpretation 308 from the semantic analyzer 305 is supplied to a meaning representation processor 309, which converts the interpretation into a semantic representation suitable for incorporation into a context graph. The context resolver component 311 provides input to the graph integrator 307 regarding previous user input (e.g., previous user utterances) so that a graph can be constructed in the current context of the user/system dialog. As will be discussed in more detail below, certain natural language meanings are clarified by evaluating the current utterance in the context of previous utterances. The graph integrator 307 generates a semantic meaning representation 312 (MR graph) of the sentences in the latest turn of the dialog and integrates it with the unified semantic graph 321. Each user utterance captured in a dialog is in turn converted into its own sentence meaning graph. Although the word "sentence" suggests a single grammatical sentence, those skilled in the art will recognize that not all user utterances will be fully grammatical sentences, and that a user utterance may include more than one sentence. The term is used in this specification to convey that one or more meaning graphs are created per utterance (i.e., for most, if not all, user utterances in a conversation).
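The folding of per-utterance meaning graphs into a single unified graph can be illustrated roughly as follows. The class and function names are assumptions made for this sketch, and a toy adjacency parser stands in for the meaning representation processor 309; a real system would also record much richer provenance than a turn number.

```python
# Illustrative sketch: one sentence meaning graph per utterance, folded
# turn by turn into a unified semantic graph, tracking which turn
# contributed each edge.
from collections import defaultdict

class UnifiedSemanticGraph:
    def __init__(self):
        self.edges = set()                   # (head, relation, tail) triples
        self.provenance = defaultdict(set)   # edge -> set of turn numbers

    def integrate(self, turn: int, sentence_graph: set) -> None:
        for edge in sentence_graph:
            self.edges.add(edge)
            self.provenance[edge].add(turn)

def sentence_meaning_graph(utterance: str) -> set:
    """Toy stand-in for the meaning representation processor: one graph
    per utterance, even when the utterance is not a grammatical sentence."""
    tokens = utterance.lower().split()
    return {(a, "next", b) for a, b in zip(tokens, tokens[1:])}

unified = UnifiedSemanticGraph()
for turn, utt in enumerate(["book a table", "for two people"]):
    unified.integrate(turn, sentence_meaning_graph(utt))
```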
In a preferred embodiment, the sentence semantic meaning representations are converted into corresponding sentence "concept" graphs. Given a sentence in a user utterance, the MR graph is a semantic analysis of the sentence using semantic tags (not just concepts), whereas the concept graph is based on domain concepts.
The context resolver 311 may also access a "back-end" application 319, for which the dialog system is the "front-end". The back-end application includes several databases 313, 315, 317 that contain domain-specific information. In different embodiments of the invention, only certain of these databases will be present. The domain-specific information is used to disambiguate the user utterance when the user is engaged in a task that the backend application is designed to complete. The context resolver 311 may issue queries against these databases to obtain domain-specific information for semantic grounding. "Semantic grounding" refers to a mapping from textual content to related knowledge (e.g., domain concepts/relationships).
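A minimal sketch of this kind of semantic grounding, assuming a relational backend, might look as follows. The table name, schema, and concept names are invented for illustration and are not from the patent.

```python
# Illustrative: map concepts mentioned in an utterance onto rows of a
# domain database, keeping only the mentions that can be grounded.
import sqlite3

def ground_concepts(mentions, db):
    """Return a mapping from each groundable mention to its domain rows."""
    cur = db.cursor()
    grounded = {}
    for m in mentions:
        cur.execute("SELECT name, category FROM concepts WHERE name = ?", (m,))
        rows = cur.fetchall()
        if rows:
            grounded[m] = rows
    return grounded

# Build a tiny in-memory domain database for the example.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE concepts (name TEXT, category TEXT)")
db.executemany("INSERT INTO concepts VALUES (?, ?)",
               [("pasta", "dish"), ("rome", "city")])
grounded = ground_concepts(["pasta", "dragon"], db)  # "dragon" has no grounding
```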
Semantic meaning representation 312 is incorporated into the unified semantic graph 321, which is a context graph of the dialog content. In the referenced embodiment, graphs 312 and 321 are merged as described below in the section entitled "semantic integration for dialog content". The relevant information (given the user intent) is integrated by any of several known types of integration processes, including, for example, cross-sentence, cross-turn, cross-interlocutor, and cross-knowledge-base integration. With the semantic meaning graphs obtained, relevant semantic content is identified based on the domain database so that queries or commands can be formed to complete tasks. Semantic matching is performed at two different levels: the element level and the structural level. For graph elements, the system computes semantic similarity between domain concepts and node names in the MR graph. If the similarity score is above a certain threshold (determined empirically), the graph node is mapped to a domain concept. For the graph structure, the system considers semantic dependencies based on equivalence, partial overlap, supersets, and subsets.
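The element-level matching step can be sketched as below. A Jaccard similarity over character bigrams stands in for whatever similarity measure a real system would use (e.g., embedding-based), and the 0.5 threshold is an illustrative placeholder for the empirically determined one.

```python
# Illustrative element-level matching: map MR-graph node names to domain
# concepts when similarity clears a threshold.
def bigrams(s: str) -> set:
    s = s.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}

def similarity(a: str, b: str) -> float:
    """Jaccard similarity over character bigrams (a toy stand-in)."""
    ba, bb = bigrams(a), bigrams(b)
    return len(ba & bb) / len(ba | bb) if ba | bb else 0.0

def map_nodes_to_concepts(node_names, domain_concepts, threshold=0.5):
    """Map each graph node to its best-matching domain concept, if any."""
    mapping = {}
    for node in node_names:
        best = max(domain_concepts, key=lambda c: similarity(node, c))
        if similarity(node, best) >= threshold:
            mapping[node] = best
    return mapping
```

Nodes that clear the threshold are grounded in the domain; the rest remain candidates for the user-knowledge handling described below.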
A graphical construct 322 (e.g., a sub-graph) is presented for provision as a dialog prompt 323, as part of the user interface 301, for user annotation. As will be discussed below, the presented graphical constructs need not be contiguous concepts and relationships from the unified semantic graph, but may be selected relationships and concepts that are predicted to be most likely to elicit user feedback. User knowledge 322 and query results 324 are used to contribute to the unified semantic graph 321. User knowledge is an important source for improving existing dialog systems and the user experience. For example, if an end user often mentions something that is not in the domain knowledge base, it is useful to identify such user knowledge and add it to the domain knowledge base.
As described above, the preferred embodiment of the present invention presents a surface semantic representation (SSR) interface as part of the system's dialog with the user. This portion of the user interface identifies the intermediate semantic representation being used by the system to direct a portion of its goal-oriented dialog. That is, the system actually tells the user why it is presenting the user with a particular selection. The more structured graphical representation presented as part of the user interface allows the underlying executable semantic interpretation of the user request to be visible and sufficiently understandable to the end user. In this way, the user is able to see how the system processes and interprets the context information. It also allows the user to provide feedback to the dialog manager via the chat interface, e.g., on whether the assumptions made by the system are good assumptions. The semantic representation presented at the user interface is graphical and thus more intuitive than a lengthy dialog interpretation, allowing the end user's feedback to be graphical as well. That is, the user is able to interact with the graphical interface. By displaying semantic interpretations corresponding to the latest dialog states by means of intuitive graphical properties, the SSR interface is easy to understand (especially for experienced users) while being visually intuitive.
To develop intermediate representations that encode various semantics, embodiments of the present invention include a framework for converting and integrating semantic interpretations into a unified model. Ideally, the method for creating the unified model should be generalizable across applications and domains and semantically expressive enough to capture the meaning of various queries. Other desirable attributes include computational convenience to support well-defined standard computing techniques, compatibility with primary back-end storage (e.g., relational and graph databases), and interoperability and reusability across different applications.
In an embodiment of the invention, a graph-based approach is used to generate an intermediate semantic representation of a dialog. One challenge is to handle context semantics based on heterogeneous resources and integrate them into a unified model. For goal-oriented chat, the context semantics are determined by both the "informal" requestor (i.e., the end user) and the "formal" responder (i.e., the dialog system). More specifically, the semantics of the user intent may be embedded in the user utterance, which may include information such as specific goals or intents (e.g., "find courses"), supporting information (e.g., "courses with 3 credits"), and user-centric information (e.g., "prefer theory courses"). Semantics can also be interpreted using a domain corresponding to fact or ontology knowledge bases (KBs) at the back end of the dialog system. This information is typically stored in relational and/or graph databases and is used in the preferred embodiment of the invention to interpret user utterances and also to provide information in response to user queries. Another challenge is that the sub-graph components selected from the intermediate semantic representation need to be intuitive enough to present to the end user at the user interface. In a preferred embodiment, a concise and intuitive set of visual constructs representing the selected sub-graph components is identified by combining the characteristics of both tuple relational calculus (TRC) and domain relational calculus (DRC). Explicit TRC and DRC expressions use a compact set of connectives and specified variables, which can be shown by the nodes and edges of the graph representation.
The important core subtasks are: generating a semantic representation of the user utterance; integrating the context graph with the full semantic representation (with user intent and domain-interpretable semantics); and dynamically selecting sub-graph content in preparation for surface realization of the sub-graph components in the interface.
Interpretation of user utterances
The user utterance includes important context information that generally determines the course of a conversation between the system and the user. For descriptive purposes, a "user utterance" includes spoken utterances interpreted by a speech recognition system as well as written responses and queries to the dialog system. One core task is to convert the user utterance into a more standard, formal, and canonical representation, or semantic representation, which is closely related to the semantic parsing task. In an embodiment of the invention, the user utterance is interpreted based on the semantic parsing result. From the interpretation results, the dialog system generates a concept graph representing the content relevant to completing the task. Various types of semantic parsing mechanisms are used in embodiments of the present invention.
Specifically, in the preferred embodiment, a recently introduced meaning representation language (MRL), Abstract Meaning Representation (AMR), is used. AMR is a parsing mechanism featuring multi-layer semantic interpretation, abstract representation, and a unified, simple data structure. AMR formalizes whole-sentence semantics and is specifically designed to normalize language and represent its meaning. It is equipped with a large-scale general-purpose annotation corpus for English sentence semantics. AMR expresses the meaning of a sentence as a graph, where nodes represent concepts (e.g., events, entities, attributes) and edges represent relationships (e.g., part, agent, location). The semantic relationships encoded in an AMR graph can be interpreted as a combination of logical propositions, or triples. AMR graphs are rooted, directed, acyclic, edge-labeled, leaf-labeled graphs designed to be easily annotated and read by humans and processed by computer programs. AMR interprets predicate-argument structure ("who did what to whom") and identifies concepts, values, and named entities.
Thus, because of these advantages, the preferred embodiment of the present invention employs AMR graphs to express the semantic meaning of a user utterance. Preferably, the AMR graph is adapted using domain knowledge stored at the back end of the dialog system. The parsing process includes mapping tokens of a text query (i.e., the user utterance) to various ontology elements, such as concepts, attributes, and the relationships between the corresponding concepts. Several semantic aspects annotated by AMR are closely related to query constructs, including, for example, entities/values, comparisons, aggregations, modifiers, and connectives, potentially forming queries with complex structures and implicit dialog flows.
A significant feature of AMR annotation is that it abstracts away elements of surface syntactic structure (e.g., word order and morpho-syntactic markers). Thus, the AMR graph can be converted into a concept graph that encodes the primary semantic content. Recent work discusses the systematic conversion from AMR to first-order logic. Converting natural language into a formal representation is important so that dialog systems can use the formal representation for reasoning. First-order logic is computationally convenient for automated inference, and AMR is therefore well suited for this purpose.
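As a concrete illustration of the triple reading of an AMR-style graph, the following sketch (not taken from the patent; the role labels and graph contents are illustrative assumptions) stores a rooted, edge-labeled graph as an adjacency list and flattens it into (source, relation, target) triples:

```python
# Sketch: an AMR-style meaning representation as a rooted, directed,
# edge-labeled graph, flattened into logical triples. The sentence and the
# role names ("need-01", ":ARG0", ...) are illustrative assumptions.

def amr_triples(root, graph):
    """Walk the graph from the root and emit (source, relation, target) triples."""
    triples, stack, seen = [], [root], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for relation, target in graph.get(node, []):
            triples.append((node, relation, target))
            stack.append(target)
    return sorted(triples)

# "I need to register an additional 3 credits."
graph = {
    "need-01": [(":ARG0", "i"), (":ARG1", "register-01")],
    "register-01": [(":ARG1", "credit")],
    "credit": [(":quant", "3"), (":mod", "additional")],
}

print(amr_triples("need-01", graph))
```

Each triple is one logical proposition, which is the form in which the graph can be handed to a first-order reasoner or matched against domain concepts.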
Semantic analysis across several sentences is sometimes required in order to interpret a user request or for the system to interpret the semantic meaning of a request. Where this is required, in the preferred embodiment, the system first runs the semantic parser sequentially over the sentences and obtains an ordered sequence of semantic graphs. Depending on the dialog, there may be semantic overlap or associations between the graphs (detected, e.g., via discourse analysis or coreference techniques). Based on the associations between sentences, the attributes of the same or similar concepts are updated across user utterances. In a preferred embodiment of the invention, the graphs produced during the dialog are integrated. Several graph-based operations facilitate the integration of individual sentence graphs into one graph:
Merge - combine nodes with the same semantic meaning, e.g., co-referent nodes, identified entities, semantic frames, wikification.
Fold - handle syntactic conventions for named entities, i.e., hide nodes that are no longer active or semantically relevant.
Expand - add implicit nodes and edges that the parser does not produce or that are not expressed in the language.
Concatenate - if no relationship is detected, connect the two graphs, in the order in which they were generated, with a virtual ROOT node.
Reconstruct - change the relationships between nodes, including detaching and attaching edges.
Align - align the original text index with concept nodes (e.g., via elastic/fuzzy search) to quickly search sub-graphs.
Suppose i is a concept node and i->r->j is an edge from i to j with relationship r. In this case, the set of nodes connected by the incoming/outgoing edges E(in) and E(out) (the path between i and j) is (i ... j). Thus, if the path between nodes i and j can be folded or merged, the path becomes ij.
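The merge and fold operations above can be sketched on a toy edge set; the function names and the "course"/"credit" example are assumptions for illustration, not the patent's implementation:

```python
# Illustrative sketch of two of the graph operations described above (merge
# and fold) on a toy set of (source, relation, target) edges.

def merge_nodes(edges, a, b):
    """Merge node b into node a (e.g., two co-referent 'course' nodes)."""
    return {(a if s == b else s, r, a if t == b else t)
            for (s, r, t) in edges if not (s == b and t == a)}

def fold_edge(edges, s, r, t):
    """Collapse an attribute edge into its head node, e.g. credit -3-> 'credit:3'."""
    folded = f"{s}:{t}"
    return {(folded if x == s else x, rel, folded if y == s else y)
            for (x, rel, y) in edges if (x, rel, y) != (s, r, t)}

edges = {("course1", "has", "credit"), ("course2", "has", "credit"),
         ("credit", "value", "3")}
edges = merge_nodes(edges, "course", "course1")
edges = merge_nodes(edges, "course", "course2")
edges = fold_edge(edges, "credit", "value", "3")
print(sorted(edges))  # [('course', 'has', 'credit:3')]
```

This mirrors the Fig. 4 example below, where two "course" nodes merge and the "credit"/"3" pair folds into a single "credit: 3" node.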
Fig. 4 shows how the system converts multiple user utterances (i.e., sequential user utterances made during the course of a conversation with the system) into a more compact semantic meaning graph (unified semantic graph 321 in Fig. 3). Semantic meaning graph 400 includes a plurality of nodes that are topics associated with a university course selection website. As is conventional, nodes are represented by circles or ellipses connected by lines representing the relationships, or edges, of the graph. The original graph 400 is compiled from several different user utterances made during a dialog with the dialog system. Embodiments of the present invention will recognize that there is an opportunity to integrate the original graph 400 into the integrated graph 401. For example, as part of sub-graph merge operation 403, the two "course" nodes may be integrated, resulting in a single "course" node 405 in the integrated graph 401. Further, as shown, as part of a sub-graph fold operation, the "credit" and "3" nodes 407 from the original graph 400 may be combined into a single "credit: 3" node 409 in the integrated graph 401. In addition, sub-graph expand operation 411 may incorporate both the "algorithm" and "theoretical" nodes from the original graph 400. The generated subgraph is headed by the "theoretical" node 413 in the integrated graph 401.
In a preferred embodiment of the present invention, the sentence graphs are compressed into an integrated sentence graph before the integrated sentence graph is merged into the unified semantic graph.
Semantic integration of dialog content
The purpose of semantic integration is to collect relevant information for completing a task from various sources.
In particular, the process generally involves integrating the system's interpretation of user requests and queries with one or more databases in the back-end application. In a preferred embodiment, the process further includes consolidating automated commands and intermediate or final query results into a unified format. These embodiments use a unified context graph based on the semantic meaning graphs from user utterances. This general approach may be built on top of different dialog systems.
The relevant information may be collected using prior knowledge (e.g., the core domain ontology, the dialog task, or the primary user intent). Given the user intent, information may be integrated using multiple types of integration techniques (e.g., cross-sentence, cross-turn, cross-speaker, and cross-knowledge-base). The semantic meaning graph is preferably obtained based on the method described above or variants thereof.
Next, the system identifies relevant semantic content based on the information in the domain database so that queries or commands for completing the task can be formed. In the preferred embodiment, the identification is done as semantic matching at two different levels: one at the element level and one at the structural level. For graph elements, the system computes the semantic similarity between domain nodes and node names in the sentence MR graph (or integrated sentence graph). If the similarity score is above a particular threshold (determined empirically), the graph nodes in the MR graph are mapped to domain-specific concepts in the domain knowledge graph. For graph structures, in an embodiment, the system considers semantic relatedness based on equivalence, partial overlap, superset, and subset relations. If the similarity score is above a particular threshold, the subgraph is mapped to a domain proposition, which generally corresponds to the query graph.
The similarity score equation used in embodiments of the present invention is given below:
score(i, j) = a*equal(i, j) + b*overlap(i, j) + c*superset(i, j) + d*subset(i, j)
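A minimal, hedged reading of this element-level score, with equal/overlap/superset/subset computed on the token sets of a graph-node name i and a domain-concept name j (the weights and the threshold here are arbitrary placeholders, not values from the patent):

```python
# Sketch of the element-level similarity score. The component definitions
# (token-set equality, Jaccard overlap, strict superset/subset) and the
# weights a..d are illustrative assumptions.

def score(i, j, a=1.0, b=0.6, c=0.4, d=0.4):
    ti, tj = set(i.lower().split()), set(j.lower().split())
    equal = 1.0 if ti == tj else 0.0
    overlap = len(ti & tj) / len(ti | tj)      # Jaccard overlap
    superset = 1.0 if ti > tj else 0.0         # i strictly contains j
    subset = 1.0 if ti < tj else 0.0           # i strictly contained in j
    return a * equal + b * overlap + c * superset + d * subset

THRESHOLD = 0.5  # "determined empirically" in the text
print(score("theory course", "course") > THRESHOLD)  # node maps to concept
print(score("credits", "student") > THRESHOLD)       # no mapping
```

If the score exceeds the threshold, the MR-graph node is mapped to the domain concept, as described above.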
In an embodiment of the invention, a generic graph-based query is used. Preferably, the query is independent of the type of back-end system coupled to the dialog system. Making the queries independent helps hide unnecessary detail from other modules and improves the robustness of the framework when database schemas change. Unlike a low-level query language such as SQL, the query is designed as a simplified but more intuitive representation of the modeling process, without a specific grammar.
An integration process for generating a context map for an embodiment of the present invention is described in table 1. Let K be the core domain concept and S be the domain proposition (triplet).
(The algorithm pseudocode is rendered as an image in the original document.)
Table 1: algorithm for semantic graph integration
Table 1 describes, given a sentence sequence S and an empty or existing unified semantic graph G, how the sentence graphs in S are integrated with G. First, the system identifies directly overlapping nodes between each sentence graph gi and G, updating G accordingly; it then semantically matches gi with the domain knowledge K and updates the unified semantic graph G accordingly.
Content selection
Content selection aims to dynamically identify a semantic representation, or subgraph, of the context graph for presentation in the user interface. More specifically, the system predicts what information to display to the end user at the interface, and when, to help achieve the goal with which the dialog system is assisting. A second goal is to present the information predicted to be most likely to gather user feedback, e.g., predicted based on learning from past user sessions. In principle, the semantic representation mainly corresponds to the current dialog action. For example, if the current dialog action is the user providing information to the system, the selected subgraph corresponds to how the system interprets the latest user utterance based on domain concepts and logic. If the current dialog action is the system providing a simple interpretation of the query results, the corresponding sub-graph will be a representation of the database query. Alternatively, if the original database query does not generate valid results, the system may present a variant of the original database query.
However, a subgraph explicitly corresponding to the dialog action may not be available. In these cases, in a preferred embodiment of the present invention, the candidate subgraphs are ranked using a scoring scheme based on two main aspects: (1) given a user intent, how semantically related the corresponding subgraph is to that intent; (2) given the corresponding subgraph, how likely the user is to provide feedback. Candidate subgraphs are obtained within a predetermined number of hops from the concept nodes representing the user intent in the unified semantic representation. If no user intent is provided, the default content is based on the semantic meaning graph of the latest user utterance. In a preferred embodiment, the system scores nodes and graph structures using heuristics designed for dialog content. The scoring scheme is given by the following equation:
score(V', E') = Σ_{i∈V'} q(i) + Σ_{(i,j)∈E'} p(i, j)
For a node i of the subgraph, there is a gain (denoted q(i)) if:
- node i has previously appeared;
- node i is domain-interpretable;
- node i is semantically editable/annotatable by the end user; and
- node i is semantically related to a previous domain concept.
For an edge (i, j) of the semantic context graph, there is an information gain (denoted p(i, j)) if:
- the edge (i, j) has not previously appeared;
- the edge (i, j) can be interpreted using the domain;
- the edge (i, j) is editable/annotatable by the end user;
- the edge (i, j) has a semantic dependency on a previous node or edge;
- the edge (i, j) is used for forming queries; and
- the edge (i, j) indicates a previous value of a concept.
The system first selects a candidate sub-graph associated with the current dialog activity. If none is available, the sub-graphs are ranked based on score(V', E'), and the highest-ranked sub-graph is selected. Alternative embodiments of the present invention use similar sets of factors in different scoring equations to quantify the subgraphs, including at least one of the following: concept-level characteristics, relationship-level characteristics, or discourse-level characteristics, and then rank the candidate sub-graphs based on the quantified factors.
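The ranking step can be sketched by treating q(i) and p(i, j) as counts of how many of the listed preconditions hold; the precondition checks are stubbed with boolean flags, which is an assumption for illustration rather than the patent's scoring implementation:

```python
# Sketch of candidate-subgraph ranking. q(i) and p(i, j) count satisfied
# preconditions from the lists above; the flags themselves are stubbed.

def q(node):
    return sum(node.get(k, False) for k in
               ("appeared_before", "domain_interpretable",
                "user_editable", "related_to_domain"))

def p(edge):
    return sum(edge.get(k, False) for k in
               ("new", "domain_interpretable", "user_editable",
                "semantically_dependent", "forms_query", "prior_value"))

def subgraph_score(subgraph):
    return (sum(q(n) for n in subgraph["nodes"]) +
            sum(p(e) for e in subgraph["edges"]))

candidates = [
    {"name": "credits", "nodes": [{"domain_interpretable": True}], "edges": []},
    {"name": "course+theory",
     "nodes": [{"domain_interpretable": True, "user_editable": True}],
     "edges": [{"new": True, "forms_query": True}]},
]
best = max(candidates, key=subgraph_score)
print(best["name"])  # course+theory
```

The highest-ranked candidate is the one surfaced in the SSR portion of the interface.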
Graphical representation at user interface
The preferred embodiment of the present invention uses a set of visual constructs for presenting the underlying semantics and a visual interpretation in an interactive interface. The interface is used to collect feedback from the user and, as described above, is a surface semantic representation (SSR) interface for the dialog system. The task involves the visual presentation of ontology knowledge and the dynamic updating of dialog states, given the time and space allocated to knowledge in the interface.
To present the semantic representation to the end user, conceptual simplicity and maximal information about the current action are emphasized: (1) the end user should understand the presentation; (2) there should be good coverage of dialog activities; (3) the design should clearly indicate changes in dialog state; and (4) the interface should facilitate user input. In an embodiment of the present invention, the type and number of graphical elements may be selected according to the type or expertise level of the user. For example, a subject matter expert who is training a dialog system may be presented with a denser sub-graph than a novice user (who may be presented with only a few sub-graph components). There is a tradeoff between efficiency (more elements make it easier to provide more feedback) and user friendliness (more elements are confusing, especially for novice users).
In practice, the dialog activity may be very complex, and the whole graph, or even just the relevant subgraphs, will reflect this complexity. One option used in some embodiments is to characterize dialog activities and match them with the corresponding graphical characteristics that the user interface designer predicts will be most important to the user. The presentation in the interface encompasses the context generated by both the end user and the system. The purpose of representing some of the user's inputs back to the user is to inform the user how the system understood those inputs, so that the user can agree or disagree with the analysis results. Sometimes it is also important to inform the end user of the task completion progress, so that the user understands the task status, and to propose alternatives that can be used to complete the task. In addition to the traditional dialog interface, a set of visual constructs is added to support the presentation of semantic information and to request various forms of feedback, as shown in Fig. 7.
Another option used in other embodiments is to pre-compute and program the presentation in the graphical user interface based on which sub-graph elements have the greatest semantic expressivity for the current action. This is part of a semantic completeness criterion, i.e., which set of sub-graph elements shows the "best overall picture" of the current state of the dialog between the user and the dialog system.
To further improve performance, system optimization may be used to determine the display area of the sub-graph under temporal and spatial constraints. That is, if all of the relevant subgraphs cannot be presented, the system optimizes the content based on space and time constraints, with f(i) as the space occupied by node i, g(i, j) as the space occupied by edge (i, j), and S as the total available space. In general, real-time visualization is time-sensitive in that the user will speak a new utterance or the system will be expected to reply to a previous one. Thus, when a user provides new utterances quickly, a complex graphical representation would likely be unacceptable from a user-satisfaction perspective. On the other hand, when the system is under test and a subject matter expert is interacting with the interface to correct the system's assumptions, the pace of the conversation may be slower, and more information can therefore be provided. Thus, in embodiments of the present invention, the most recent dialog pace is used to determine the temporal constraints of the presentation.
A further constraint is that the presentation must be understandable to the user; so, while the most important nodes and edges are displayed preferentially, less important nodes and edges may also be displayed if they add meaning. The equations for calculating which elements to display in the interface are given in Table 2.
maximize Σ_i x_i*q(i) + Σ_{i,j} y_{i,j}*p(i, j)
subject to:
x_i, y_{i,j} ∈ {0, 1}
x_i ≥ y_{i,j}; x_j ≥ y_{i,j}
Σ x_i ≤ N
Σ x_i*f(i) + Σ y_{i,j}*g(i, j) ≤ S
Table 2: dynamic integration of contextual information and content selection
FIG. 5 is a diagram of a simplified unified semantic graph for an example domain ontology according to one embodiment of the present invention. The example ontology relates to a university course selection website. The graph has a plurality of nodes 501-545 representing topics in the website, such as course node 509 and student node 519, linked together by a plurality of edges (unnumbered). The edges hold the values of the relationships between the corresponding nodes. In embodiments of the invention, the unified semantic graph may have many more nodes and edges. Prior to operation of the dialog system, the unified semantic graph is a domain ontology that is manually built by the developer of the dialog system or automatically derived from the database of the back-end system. By traversing the knowledge graph, the system can determine which nodes are related to which other nodes. During operation of the system, as the meaning representation for each user utterance is processed by the graph integrator, portions of the meaning representation are incorporated into the unified semantic graph.
Two example sentences are given: the first user utterance, "I need to register for an additional 3 credits", and the second user utterance, "I prefer theory courses". Using the sentences in the first user utterance, an MR graph is generated and matched to domain concepts. A unified semantic graph may then be generated from the MR graph. After receiving the second user utterance, another MR graph is generated and matched to the domain concepts, and the new concept "theory" is integrated into the existing unified semantic graph accordingly.
FIG. 6 illustrates the system dynamically highlighting a portion of the context graph to solicit feedback from the user. In this figure, a user 601 is shown during two states 600, 602 of a dialog with the dialog system. A portion of the context graph (i.e., a subgraph) includes nodes 605, 607, and 609 and is shown in the SSR portion of the interface of the dialog system. In state 600, the user has initiated a session with the system regarding course selection. Based on query 611, "Show Computer Science (CS) courses to me", the system highlights course node 605 with value 30. The user has performed annotation 613 by highlighting the "credit: 3" node 607, or by drawing an edge between course node 605 and credit node 607 to indicate that it should be part of the query. Later in the dialog, state 602 is reached, in which the user has queried for theory courses, so the "theoretical" node 609 is now highlighted. In state 602, the "credit: 3" node 607 is no longer highlighted, either because the user has deselected the credit node in the interface or because the user has indicated indifference to the number of credit hours through the conversation.
FIG. 7 illustrates one embodiment of how a surface semantic representation (SSR) is presented in a conversational interface. In this figure, a first part 701 of the interface is dedicated to the dialog between the user and the dialog system. In the second part of the interface, SSR display 703 shows a portion of the relevant subgraph, which dynamically changes with the conversation. At turn (1), SSR display 703' corresponds to how the system analyzes the user input "I need an additional 3-credit course" and displays two entities, namely course node 709 and credit node 711, and displays the entity/value pair within the credit node (credit: 3). At each subsequent turn, for example turn (2), "Found 9 courses, for example AA, BB, CC", and turn (3), asking which of those courses are taught by Preston, the SSR interface dynamically presents graphical elements 713, 715, 717 that introduce new nodes and remove old nodes (not shown) to show the user how the system interprets the dialog and forms the system response. For example, the system may show how it forms queries to the back end by presenting graphical elements representing sub-graph components of the unified semantic graph.
In an embodiment of the invention, individual graphical elements are emphasized to indicate nodes having a particular semantic meaning. Colors are used to indicate which nodes were contributed by the dialog, by the original ontology, or by a query to the back-end database. The user may interact with a graphical element representing a node or edge of the displayed sub-graph by selecting or deselecting the corresponding element. For example, a user may draw a new line representing an edge to indicate that a given node should be included in a search. In cases where there is insufficient space to display graphical elements for all relevant nodes, lines may be presented differently, e.g., as dashed lines, to indicate that two nodes are not directly linked in the semantic graph. The user may select such a line to change the SSR interface to present elements representing the intervening sub-graph components. Those skilled in the art will recognize that there are many alternative ways to highlight and select the different elements of the graphical interface.
Additional elements of the interface used in some embodiments include simple interpretations of the user state 719 or the system state 721.
In the alternate view 703″, the user may choose to view the overall context information used by the system from previous user utterances in the dialog. In this figure, the context information is arranged along a timeline 725 such that older context information is located toward the left. Furthermore, in embodiments of the present invention, one or more indicators 722, 723, 724 may be used to indicate how the result count became 0 after adding the latest context information as a search criterion (the name Preston).
Surface semantic representation of dialog systems
In a preferred embodiment of the invention, a surface semantic representation (SSR) is used as a user interface for the conversation system. It generates a visual representation of the underlying salient semantic information that is used by the dialog agent to complete the task about which the user and system are conversing. By integrating the chat interface with the SSR, the end user is engaged further than in a standalone conversation: the interface reveals how task-related information is predicted and enables the system to directly request user feedback on predictions (or assumptions) through the conversation. This targeted feedback is valuable for training a statistical dialog system. In a preferred embodiment, goal-oriented dialogs, which involve the exchange of factual information between the user and the system and are based on a domain knowledge base, are the basis of the SSR interface.
SSR-based feedback
The SSR interface can conveniently employ a rich set of graphical elements that present information to, and collect feedback from, the end user. FIG. 8 illustrates several example user feedback inputs ranging from the lexical level to the discourse level. Given a new user utterance 801, "I only need an additional 3-credit theory course", several system outputs are triggered. In this example, the system output includes slot values 803 ("credit" and "3") output by a slot labeller. Another output is a search query 805 about "course", which indicates how the back-end database query was formed. Yet another output is the implicit semantic relationship 807 between "course" and "theory", to be confirmed or specified by the user.
The end user may even be effectively presented with past subtasks 809. The purpose of presenting the past subtasks is to let the user associate the current utterance ("I only need an additional 3-credit theory course") with the context of a previous subtask, i.e., the current utterance is a continuation of the previous task. The SSR-based dynamic feedback used in embodiments of the present invention is more versatile and flexible than prior-art methods for presenting conversation-data annotation (which typically use simple multi-turn slot filling with static intents), thereby enabling more complex dialog strategies to be annotated.
Fig. 9 shows the system logic behind the following user/system dialog in one embodiment of the invention, where U indicates the user, S indicates the system, and the numbers indicate the order of the turns:
u1: in, i are looking for CS courses to register in the next school period. (P1)
S1: what type of lesson you want, is theory programmed? (P1)
U2: how much credit is provided by the theoretical lesson? (P2)
S2 (a): not so much, I are not well aware. What courses are you preferred? (P1)
S2 (b): all three theoretical courses in the lower school stage provide three points. (P2)
U3: how does the programming course? (P2)
S3: two programming courses provide two points and two gates provide three points. (P2)
U4: preferably, I prefer programming courses. (P1)
The dialog strategy is used to determine how the system generates the next natural-language response. In response to the user query U1, the chat agent asks an appropriate question, S1. However, rather than providing the answer that the system expects, the user asks question U2, which helps the user answer system query S1. While it shares slots (i.e., the credit and course slots), the intent of U2 is semantically related to the intent of U1 but different from it, because the specified category is not necessarily the same as in the previous user intent. If the dialog strategy is optimized assuming a static user intent, the dialog system may still attempt to fill the "category" slot by using answer S2(a). However, a better dialog strategy would be able to respond to the new, related user intent by providing information as in S2(b), since it can trace back to historical intents in previous utterances.
Inspired by the priming concept in human memory, the system uses a "contextual priming" process to model more complex dialog policies. Each priming corresponds to a previous user intent that shares the same constraints, or set of slots/values, as the current priming or user intent. The end user may provide feedback on whether the current utterance relates to a previous or a new (recent) priming in the dialog. By using the contextual-priming process, a dialog policy can be generated from new user intents and historical utterances in a conversation, rather than being limited to only the latest utterance.
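The contextual-priming lookup described above can be sketched as follows. This is an illustrative sketch only: the `Priming` class, `match_priming` function, and the slot names are invented for the example and are not part of the patent; the idea is simply that a new utterance's slot/value constraints are matched against the history of primings so the policy can attach the utterance to an earlier intent rather than only the latest one.

```python
class Priming:
    """One contextual priming: a past user intent and its slot/value constraints."""
    def __init__(self, pid, slots):
        self.pid = pid            # priming identifier, e.g. "P1"
        self.slots = dict(slots)  # slot -> value constraints of the intent

def match_priming(history, current_slots):
    """Return the priming sharing the most slot/value pairs with the
    current utterance, or None if nothing overlaps (a new priming)."""
    best, best_overlap = None, 0
    for p in history:
        overlap = sum(1 for k, v in current_slots.items()
                      if p.slots.get(k) == v)
        if overlap > best_overlap:
            best, best_overlap = p, overlap
    return best

history = [Priming("P1", {"dept": "CS", "term": "next"}),
           Priming("P2", {"dept": "CS", "term": "next", "type": "theory"})]
# U3 "what about the programming courses?" shares dept/term with both
# primings but contradicts P2's course type:
hit = match_priming(history, {"dept": "CS", "term": "next",
                              "type": "programming"})
```

With these toy constraints the earliest priming with maximal overlap wins, illustrating how the policy can trace back past the most recent utterance.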
Tasks
Chat interactions with end users for goal-oriented tasks are largely determined by dialog policies that are pre-designed or pre-trained in a given domain. Adapting dialog policies to practical applications presents many challenges, especially when: 1) the underlying domain or task expands or changes frequently, or 2) it is difficult to construct a complex dialog manager a priori. Furthermore, offline manual annotation is expensive and noisy. The SSR-based feedback scheme effectively engages end users and provides various user feedback mechanisms to improve pre-designed or pre-trained dialog policies. In an embodiment of the invention, statistical dialog management is used to incorporate SSR-based user feedback into dialog policies.
Dialog management comprises two subtasks: dialog state tracking and dialog policy learning. When communicating with users, a statistical dialog system typically maintains a distribution over possible dialog states, in a process called dialog state tracking, which is used to interface with the domain knowledge base. It also prepares the input for the dialog-policy-learning component. A policy may be represented by transition probabilities between states, where a state is a representation of the conversation. In embodiments of the present invention, the state includes the latest user/system dialog acts, such as requests, information displays, and social greetings, together with the corresponding slot/value information. The dialog policy directly determines how the system generates the next response.
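The idea of a policy as transition probabilities between states can be sketched as follows. The states, acts, and probabilities here are invented for illustration; in practice they would be learned, and a state would carry the full slot/value information described above rather than a single label.

```python
# Toy policy: each state (latest user dialog act) maps to a distribution
# over candidate next system acts.
policy = {
    "user:request(credits)": {"system:inform(credits)": 0.8,
                              "system:request(category)": 0.2},
    "user:inform(category)": {"system:inform(courses)": 0.9,
                              "system:confirm(category)": 0.1},
}

def next_system_act(state):
    """Greedy choice: pick the most probable next system act."""
    dist = policy[state]
    return max(dist, key=dist.get)

act = next_system_act("user:request(credits)")
```

Here the greedy rule stands in for the learned policy's decision about how the system generates its next response.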
Proposed model
In embodiments of the present invention, neural-network-based methods are used so that the model architecture can be built on top of sequence-labeled datasets without hand-crafted features. By combining multiple inputs (including user utterances, associated dialog acts, and the domain slots/values of each contextual priming), the model predicts the dialog act of the semantically best system response.
The framework of modular connections for dialog management is shown in Fig. 10. The state tracker 1001 is a neural network that takes the user utterance 1003, u_t = (w_1, w_2, ..., w_I), and generates one output of the model, the sequence of slot labels 1005. Other outputs of the model are the set of intent labels generated by the intent labeler 1007 and the priming labels generated by the priming labeler 1009. The policy learner 1013 is another neural network having an output layer for dialog acts 1015 and query slots 1017. The output dialog act 1019 and query slot 1021 are passed from the policy learner back to the state tracker 1001.
Semantic coding
Utterance encoding - the sequence-tagging architecture uses word embeddings to capture similarity, but suffers when processing previously unseen or rare words. Embodiments of the present invention use a bag-of-means over word embeddings together with recurrent neural networks (RNNs). Given the utterance u_t = (w_1, w_2, ..., w_I) at time t, the corresponding vectors are the forward and backward long short-term memory (LSTM) hidden states of the RNN encoding at time t.
Dialog encoding - each slot/value pair is denoted ⟨s = (m, d, g), v⟩, where s is a slot of type m ∈ M; d is the directionality of the information, d ∈ {user→agent, agent→user}; g denotes the type of change (e.g., +, −); and v is the latest resulting value from g. When v is a string-based entity name (e.g., "condo" for a property type, or "New York" for a location), the embedding of v is computed as the embedding of the string text. Embodiments use normalized token representations to replace the embeddings of values; for example, the normalized representation "meeting time" is used instead of "5pm". The slot s is encoded as its index in the slot-type dictionary Dm, concatenated with one-hot bits for the associated change type in the change-type dictionary Dg and for the associated directionality. Each turn typically corresponds to a contextual priming P_i, which is semantically constrained by a set of slots s. Thus, the contextual priming is encoded as the concatenation of all associated slots s together with their latest values v. The system also maintains a lookup table of context histories of s for each P, used specifically for forming queries.
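The slot encoding just described can be sketched as follows. The dictionaries `Dm`, `Dg`, and `DIRS`, and the slot names, are invented for the example; only the scheme (slot-type index concatenated with one-hot change-type and directionality bits) comes from the text.

```python
Dm   = {"category": 0, "credits": 1, "term": 2}   # slot-type dictionary
Dg   = {"+": 0, "-": 1, "=": 2}                   # change-type dictionary
DIRS = {"user->agent": 0, "agent->user": 1}       # directionality

def encode_slot(m, d, g):
    """Slot-type index concatenated with one-hot change type and direction."""
    one_hot_g = [0] * len(Dg)
    one_hot_g[Dg[g]] = 1
    one_hot_d = [0] * len(DIRS)
    one_hot_d[DIRS[d]] = 1
    return [Dm[m]] + one_hot_g + one_hot_d

# "credits" slot, asked by the user, with an added (+) value:
vec = encode_slot("credits", "user->agent", "+")
```

A contextual priming P_i would then be the concatenation of such vectors (plus value embeddings) over all of its associated slots.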
State tracker
Embodiments of the invention implement the state-tracking task as a multi-task sequence-learning problem. Various methods for sequence-tagging tasks are used in alternative embodiments. The neural model updates a probability distribution p(s_m) over candidate values for each slot type, e.g., whether it belongs to a new contextual priming or a previous one. For each user turn t, a bidirectional gated recurrent unit (GRU) computes the encoding of the user utterance, h_t = GRU(x_t, h_{t-1}), as the concatenation of the forward and backward hidden states. Another bidirectional GRU computes the hidden activation for each slot s.
Supervised learning of dialog strategies
Using the state-tracking labels as input features, the goal of the dialog policy is to minimize the joint loss function between the labels y and the predictions p over the shared network parameters θ:

L(θ) = Σ_{d ∈ {a, u, D_s}} H(y_d, p_d)

where a is the dialog act, u is the categorical distribution over intents (the expected entities), and D_s are the binary slot values.
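A toy numeric sketch of this joint loss follows. The distributions are invented values standing in for the three output heads (dialog act a, intent u, and one slot indicator from D_s); in the model they would come from the shared network, and the slot head would use a proper binary cross-entropy over all slots.

```python
import math

def cross_entropy(y, p, eps=1e-12):
    """H(y, p) = -sum_i y_i * log(p_i), with a small epsilon for stability."""
    return -sum(yi * math.log(pi + eps) for yi, pi in zip(y, p))

heads = {
    "a":  ([0, 1, 0], [0.1, 0.8, 0.1]),  # dialog act (categorical)
    "u":  ([1, 0],    [0.7, 0.3]),       # intent (categorical)
    "Ds": ([1, 0],    [0.9, 0.1]),       # one slot's binary indicator
}
# Joint loss: sum of per-head cross-entropies, as in L(theta) above.
loss = sum(cross_entropy(y, p) for y, p in heads.values())
```

Minimizing this sum trains all three heads jointly through the shared parameters θ.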
r_t = σ(W_r x_t + U_r h_{t-1})

z_t = σ(W_z x_t + U_z h_{t-1})

h̃_t = tanh(W x_t + r_t ⊙ (U h_{t-1}))

h_t = (1 − z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t

where x_t is the input at time t, h_t is the hidden state at time t, W and U are the transformation matrices applied to the input and the previous hidden state, σ is the logistic sigmoid, ⊙ denotes the element-wise product, and r_t and z_t are the reset and update gates, respectively.
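The GRU update can be sketched in numpy as follows. The weight shapes and random initialization are illustrative; a trained model would learn these matrices.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, W, U, W_r, U_r, W_z, U_z):
    """One GRU update following the equations above."""
    r_t = sigmoid(W_r @ x_t + U_r @ h_prev)          # reset gate
    z_t = sigmoid(W_z @ x_t + U_z @ h_prev)          # update gate
    h_cand = np.tanh(W @ x_t + r_t * (U @ h_prev))   # candidate state
    return (1 - z_t) * h_prev + z_t * h_cand         # new hidden state

rng = np.random.default_rng(0)
d_in, d_h = 4, 3
W, W_r, W_z = (rng.standard_normal((d_h, d_in)) for _ in range(3))
U, U_r, U_z = (rng.standard_normal((d_h, d_h)) for _ in range(3))
h = gru_step(rng.standard_normal(d_in), np.zeros(d_h),
             W, U, W_r, U_r, W_z, U_z)
```

Because h_t is a convex combination of the previous state and a tanh candidate, every component of the new hidden state stays strictly inside (−1, 1) when starting from a zero state.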
Reinforcement learning of dialog strategies
Reinforcement learning (RL) is used in embodiments of the invention to learn the best dialog policy for a task-oriented dialog system. To incorporate online feedback into dialog policies, an RL-based approach is used to optimize the policy network. The goal is to maximize the expected discounted reward J(θ) over a dialog:

J(θ) = E[ Σ_{t=0}^{T} γ^t R(s_t, a_t) | θ ]

where γ ∈ [0,1] is the discount factor and R(s_t, a_t) is the reward for taking action a_t in state s_t at time t.
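The discounted return being maximized can be computed directly; the per-turn rewards below are invented (a small reward per helpful turn and a larger terminal reward on task success), purely to show the arithmetic of the sum Σ γ^t R(s_t, a_t).

```python
def discounted_return(rewards, gamma=0.9):
    """J = sum over turns t of gamma**t * R(s_t, a_t)."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

# Three helpful turns, then task completion with a terminal reward of 10:
# 1 + 0.9 + 0.81 + 0.729 * 10
J = discounted_return([1.0, 1.0, 1.0, 10.0], gamma=0.9)
```

Smaller γ weights early turns more heavily, pushing the policy toward completing tasks in fewer turns.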
Deep Q-Networks (DQN) use a deep neural network to parameterize the Q-value function Q(a, s, P; θ). The network takes an observation o_t at time t. A recurrent unit updates its hidden state based on both the history and the current turn embedding. The model then outputs the Q-values for all actions. Specifically, rewards are derived from two possible observations, one from the end user and one from the domain knowledge base. The user feedback o^U_t observed via the SSR is based on (1) turn-level success, i.e., whether the current system response is useful for completing the task, and (2) state-level success, i.e., whether the dialog state is correctly labeled. The observed query result o^Q_t is determined by the query q constrained by the most likely slots/values. Thus, the observation o_t can be defined by a_t, o^U_t, and o^Q_t. An LSTM, b_t = LSTM(o_t, b_{t-1}), is used to aggregate contextual information across turns.
An important practical problem in applying RL-based methods is slow convergence due to the large space of possible actions. In the present invention, the system significantly reduces the size of the action search space based on user feedback about dialog states. The model uses user feedback to mask actions of type confirm (e.g., the user has already indicated "yes" or "no") and specify (e.g., the user has already been asked to specify the value). The model then outputs the Q-values of the remaining dialog actions.
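The action masking described above can be sketched as follows. The action names and Q-values are invented; the point is only that actions already resolved by SSR feedback are excluded before the argmax, shrinking the search space.

```python
def best_action(q_values, masked):
    """Argmax over Q-values with masked actions excluded."""
    masked_q = {a: (float("-inf") if a in masked else q)
                for a, q in q_values.items()}
    return max(masked_q, key=masked_q.get)

q = {"confirm(category)": 0.7,
     "request(credits)": 0.5,
     "inform(courses)": 0.6}

# The user already answered the confirmation via SSR feedback,
# so confirm(category) is masked out despite its high Q-value:
act = best_action(q, masked={"confirm(category)"})
```

Masking both prunes redundant system turns and speeds up RL convergence by removing actions the feedback has made pointless.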
While preferred operating environments and use cases have been described, the techniques herein may be used in any other operating environment where a service needs to be deployed.
As already described, the above-described functionality may be implemented as a stand-alone method, such as one or more software-based functions executed by one or more hardware processors, or may be used as a managed service (including as a Web service via SOAP/XML or REST-type interfaces). Specific hardware and software implementation details described herein are for illustrative purposes only and are not meant to limit the scope of the described subject matter.
More generally, computing devices within the context of the disclosed subject matter are data processing systems that include hardware and software, and these entities communicate with each other over a network (e.g., the Internet, intranet, extranet, private network) or any other communication medium or link. Applications on the data processing system provide native support for the Web as well as other known services and protocols, including but not limited to support for HTTP, FTP, SMTP, SOAP, XML, WSDL, UDDI, and WSFL, etc. Information about SOAP, WSDL, UDDI and WSFL is available from the world wide web consortium (W3C), which is responsible for developing and maintaining these standards; further information about HTTP, FTP, SMTP and XML is available from the Internet Engineering Task Force (IETF).
In addition to cloud-based environments, the techniques described herein may be implemented in or in conjunction with a variety of server-side architectures, including simple n-tier architectures, web portals, federated systems, and the like.
More generally, the subject matter described herein may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment containing both hardware and software elements. In a preferred embodiment, the module functions are implemented in software, which includes but is not limited to firmware, resident software, microcode, etc. Furthermore, the interfaces and functions can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can contain or store the program for use by or in connection with the instruction execution system, apparatus, or device. The medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device). Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a Random Access Memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Current examples of optical discs include compact disc-read only memory (CD-ROM), compact disc-read/write (CD-R/W), and DVD. The computer readable medium is a tangible, non-transitory article.
The computer program product may be a product having program instructions (or program code) to implement one or more of the described functions. The instructions or code may be stored in a computer-readable storage medium in a data processing system after being downloaded over a network from a remote data processing system. Alternatively, the instructions or code may be stored in a computer-readable storage medium in a server data processing system and adapted to be downloaded over a network to a remote data processing system, for use in a computer-readable storage medium within the remote system.
In representative embodiments, these techniques are implemented in a special purpose computing platform, preferably in software executed by one or more processors. The software is maintained in one or more data stores or memories associated with the one or more processors and may be implemented as one or more computer programs. In general, such specialized hardware and software includes the functionality described above.
In a preferred embodiment, the functionality provided herein is implemented as an attachment or extension to existing cloud computing deployment management solutions.
While a particular order of operations performed by certain embodiments of the invention has been described above, it should be understood that such order is exemplary, as alternative embodiments may perform the operations in a different order, combine certain operations, overlap certain operations, etc. References in the specification to a given embodiment indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic.
Finally, while given components of the system have been described separately, one of ordinary skill will appreciate that certain functions may be combined or shared in given instructions, program sequences, code portions, and the like.
Having described the invention, what is now claimed is as follows.

Claims (21)

1. A method for presenting a user interface of a dialog system, comprising:
creating a unified semantic representation of dialog content between the user and the dialog system as a semantic context graph of concepts and relationships;
dynamically identifying a set of sub-graph components of the semantic context graph to be presented to the user based on current dialog activity; and
presenting the identified sub-graph component set in a user interface as a set of graphical elements representing respective concepts and relationships, wherein the set of graphical elements representing respective sub-graph components accepts user input to change the unified semantic representation.
2. The method of claim 1, wherein the set of sub-graph components is identified based on which concepts and relationships are currently being used by the dialog system to form a system response in the current dialog activity.
3. The method of claim 2, further comprising:
dynamically identifying a plurality of sub-graph components of the semantic context graph that are related to the current dialog activity;
identifying a set of constraints that prevents presentation of all of the plurality of sub-graph components to the user as a corresponding plurality of graphical elements; and
optimizing the identified set of sub-graph components among the plurality of sub-graph components, based on a semantic-integrity criterion, to assist an end user in understanding the set of graphical elements in the user interface.
4. The method of claim 3, wherein the constraint is a set of temporal and spatial constraints for presenting the identified set of sub-graph components in the user interface, the method further comprising:
identifying sub-graph components according to the likelihood that the user will provide feedback if graphical elements representing the respective sub-graph components are presented;
estimating a space required to present the respective graphical element as compared to a total available space for a set of graphical element components in the user interface; and
the time required to present the corresponding graphical element in the user interface is estimated.
5. The method of claim 1, wherein the set of graphical elements representing the respective concepts and relationships are user-annotated such that user feedback to the dialog system is provided from user interactions with the set of graphical elements.
6. The method of claim 1, further comprising: in response to a sub-graph component corresponding to the current dialog activity being unavailable, scoring a set of candidate sub-graph components according to: the degree to which the respective candidate sub-graph component is semantically related to the current user intent; and the likelihood that the user will provide feedback if the respective sub-graph component is displayed;
wherein the candidate sub-graph component is obtained by a predetermined number of hops away from a concept node representing the current user intent in the unified semantic representation.
7. The method of claim 1, the method further comprising:
highlighting the corresponding graphical element in the user interface;
receiving a first user input for the respective graphical element;
providing the first user input to a graph component of the dialog system, the graph component generating a meaning-representation graph corresponding to the first user input; and
changing the unified semantic representation based on the meaning-representation graph.
8. The method of claim 1, wherein the identified sub-graph component set is identified according to a level of expertise of the user.
9. An apparatus, comprising:
a processor;
a computer memory holding computer program instructions for execution by the processor to present a user interface of a dialog system, the computer program instructions comprising:
program code for performing the steps of any of claims 1 to 8.
10. A non-transitory computer readable medium having stored thereon a computer program for use in a data processing system, the computer program comprising computer program instructions for execution by the data processing system to present a user interface of a dialog system, the computer program instructions comprising:
program code for performing the steps of any of claims 1 to 8.
11. A system for presenting a user interface of a dialog system, the system comprising means for performing the steps of any of claims 1 to 8.
12. A method for presenting a user interface of a dialog system, comprising:
determining a semantic meaning representation for each user utterance in a set of user utterances generated in a conversation with the dialog system;
converting the semantic meaning representation into a corresponding sentence concept graph;
integrating a first sentence concept graph into a unified context graph, wherein the unified context graph is created as a unified semantic representation of dialog content between the user and the dialog system in terms of concepts and relationships;
when the dialog with the dialog system is in progress,
dynamically identifying a set of sub-graph components of the unified context graph to be presented to the user;
presenting the identified sub-graph component set in a user interface as a set of graphical elements representing corresponding concepts and relationships;
wherein the set of graphical elements representing the respective sub-graph components accepts user input to update the unified context graph based on a new sentence concept graph.
13. The method of claim 12, further comprising: the unified context map is updated based on semantic matches to domain knowledge stored in a database of the dialog system.
14. The method of claim 12, further comprising: identifying concepts and relationships semantically related to the latest dialog activity in the dialog with the dialog system.
15. The method of claim 12, further comprising:
identifying concepts and relationships in the first sentence concept graph that are semantically related to concepts and relationships in the unified context graph; and
constructing queries to a database of the dialog system from the identified concepts and relationships.
16. The method of claim 12, further comprising:
identifying a set of changes to the concept, the concept value, and the concept state based on the latest user input;
identifying related components of the concept, the concept value, and the concept state in the unified context graph; and
identifying changes to the related components in the unified context graph based on results from queries to a database of the dialog system.
17. The method of claim 16, further comprising:
quantifying a set of factors associated with a respective sub-graph, the set of factors including at least one of a concept level characteristic, a relationship level characteristic, or a discussion level characteristic; and
ordering the set of sub-graphs in the unified context graph based on the quantified factors.
18. The method of claim 12, further comprising: generating a dialog policy based on a new user intent and historical utterances in the dialog, rather than only the latest utterance in the dialog.
19. An apparatus, comprising:
a processor;
a computer memory holding computer program instructions executable by the processor to present a user interface of a dialog system, the computer program instructions comprising:
Program code for performing the steps of any of claims 12 to 18.
20. A non-transitory computer readable medium having stored thereon a computer program for use in a data processing system, the computer program comprising computer program instructions for execution by the data processing system to present a user interface of a dialog system, the computer program instructions comprising:
program code for performing the steps of any of claims 12 to 18.
21. A system for presenting a user interface of a dialog system, the system comprising means for performing the steps of any of claims 12 to 18.
CN201910022976.1A 2018-01-11 2019-01-10 Method and system for presenting a user interface of a dialog system Active CN110059164B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US15/868987 2018-01-11
US15/869000 2018-01-11
US15/868,987 US10845937B2 (en) 2018-01-11 2018-01-11 Semantic representation and realization for conversational systems
US15/869,000 US20190213284A1 (en) 2018-01-11 2018-01-11 Semantic representation and realization for conversational systems

Publications (2)

Publication Number Publication Date
CN110059164A CN110059164A (en) 2019-07-26
CN110059164B true CN110059164B (en) 2023-06-06

Family

ID=67315866

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910022976.1A Active CN110059164B (en) 2018-01-11 2019-01-10 Method and system for presenting a user interface of a dialog system

Country Status (1)

Country Link
CN (1) CN110059164B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113127608A (en) * 2019-12-31 2021-07-16 微软技术许可有限责任公司 Plan-guided response provision
CN112507065B (en) * 2020-11-18 2022-07-12 电子科技大学 Code searching method based on annotation semantic information

Citations (2)

Publication number Priority date Publication date Assignee Title
CN101566988A (en) * 2008-04-24 2009-10-28 华为技术有限公司 Method, system and device for searching fuzzy semantics
WO2015175313A1 (en) * 2014-05-12 2015-11-19 Google Inc. Disambiguation of queries implicit to multiple entities

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
US7398209B2 (en) * 2002-06-03 2008-07-08 Voicebox Technologies, Inc. Systems and methods for responding to natural language speech utterance
US7685141B2 (en) * 2007-06-06 2010-03-23 Yahoo! Inc. Connection sub-graphs in entity relationship graphs
IN2013CH01237A (en) * 2013-03-21 2015-08-14 Infosys Ltd
US10157350B2 (en) * 2015-03-26 2018-12-18 Tata Consultancy Services Limited Context based conversation system

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
CN101566988A (en) * 2008-04-24 2009-10-28 华为技术有限公司 Method, system and device for searching fuzzy semantics
WO2015175313A1 (en) * 2014-05-12 2015-11-19 Google Inc. Disambiguation of queries implicit to multiple entities

Non-Patent Citations (2)

Title
Intelligent Development Environment and Software Knowledge Graph; Ze-Qi Lin et al.; Journal of Computer Science and Technology; March 2017; Vol. 32, No. 2; see Section 3 *
Research on Key Technologies of Knowledge Search for Natural-Language Queries; Huang Pengcheng et al.; China Masters' Theses Full-text Database; 15 July 2016 (No. 07); full text *

Also Published As

Publication number Publication date
CN110059164A (en) 2019-07-26

Similar Documents

Publication Publication Date Title
US11797609B2 (en) Semantic representation and realization for conversational systems
US10845937B2 (en) Semantic representation and realization for conversational systems
US10824658B2 (en) Implicit dialog approach for creating conversational access to web content
US10915588B2 (en) Implicit dialog approach operating a conversational access interface to web content
US11657215B2 (en) Robust expandable dialogue system
CA3040373C (en) Deep learning techniques based multi-purpose conversational agents for processing natural language queries
Scheutz et al. An overview of the distributed integrated cognition affect and reflection diarc architecture
JP7412060B2 (en) Augmenting training data for natural language classification
US10713317B2 (en) Conversational agent for search
US20210342549A1 (en) Method for training semantic analysis model, electronic device and storage medium
US20190130912A1 (en) Generic virtual personal assistant platform
US20210073474A1 (en) Dynamic and unscripted virtual agent systems and methods
EP4046057A1 (en) Semantic representations using structural ontology for assistant systems
US10204097B2 (en) Efficient dialogue policy learning
US20170337261A1 (en) Decision Making and Planning/Prediction System for Human Intention Resolution
US11227581B2 (en) Systems and methods for generating a response based on task-independent conversational responses or task-specific responses
JP6939384B2 (en) Data processing equipment, methods and programs
US10657296B2 (en) Techniques for using controlled natural language to capture design intent for computer-aided design
CN110297890B (en) Image acquisition using interactive natural language dialogue
US11321534B2 (en) Conversation space artifact generation using natural language processing, machine learning, and ontology-based techniques
US20190370696A1 (en) Active learning for concept disambiguation
Keim et al. Towards consistency checking between software architecture and informal documentation
CN110059164B (en) Method and system for presenting a user interface of a dialog system
KR20210088463A (en) Method and apparatus for retrieving multi-round conversationsmulti-turn dialogue, storage medium, and electronic device
Burgan Dialogue systems and dialogue management

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant