WO2019205705A1 - Semantic-framework-based human-machine conversation method and system - Google Patents

Semantic-framework-based human-machine conversation method and system

Info

Publication number
WO2019205705A1
Authority
WO
WIPO (PCT)
Prior art keywords
semantic
topic
question
visitor
forest
Prior art date
Application number
PCT/CN2018/124937
Other languages
French (fr)
Chinese (zh)
Inventor
蔡振华
肖龙源
谭玉坤
李稀敏
刘晓葳
Original Assignee
厦门快商通信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 厦门快商通信息技术有限公司
Publication of WO2019205705A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/30: Semantic analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295: Named entity recognition

Definitions

  • The invention relates to the field of artificial intelligence, and in particular to a human-machine dialogue method based on a semantic framework and a system that applies the method.
  • Intelligent customer service is an industry-oriented application built on large-scale knowledge processing. It draws on large-scale knowledge processing, natural language understanding, knowledge management, automatic question answering, and reasoning technologies, and it is applicable across industries. It provides enterprises with fine-grained knowledge management, establishes a fast and effective natural-language channel for communication between enterprises and large numbers of users, and supplies the statistical analysis information required for refined management. It can therefore greatly reduce an enterprise's labor costs in customer service.
  • The working principle of intelligent customer service is mainly an application of big-data knowledge processing: keywords are extracted from the visitor's input to determine the visitor's question, and the corresponding answer is then matched from the knowledge base and returned to the visitor.
  • The premise of giving an accurate answer is being able to extract an accurate and complete question.
  • However, in Chinese the same question can often be expressed in many ways and with many wording habits. As a result, the answer may fail to address the question, or the user's question may not be recognized at all, and the user experience declines.
  • To solve the above problems, the present invention provides a human-machine dialogue method and system based on a semantic framework. It parses the human-machine dialogue content through the mapping relationship between the topic forest structure tree and the semantic framework model, so that accurate and complete visitor questions can be obtained, which in turn ensures the accuracy of answers and improves communication efficiency.
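  • A human-machine dialogue method based on a semantic framework comprises the following steps:
  • a. creating a topic forest structure tree according to the original corpus, and extracting the entity attributes corresponding to each topic type in the topic forest structure tree;
  • b. generating a semantic framework model by using the topic forest structure tree, and mapping the entity attributes of the topic forest structure tree to the corresponding semantic slots in the semantic framework model;
  • c. during the human-machine dialogue, performing topic-type matching on the visitor question, and filling the visitor question into the semantic slots of the semantic framework model corresponding to the topic type;
  • d. mapping the visitor question held in the semantic slots of the filled semantic framework model to the entity attributes of the topic forest structure tree;
  • e. the topic forest structure tree performing question matching from the knowledge base according to the visitor question, and feeding the answer corresponding to the matched question back to the visitor.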
  • Preferably, in step d, whether the visitor question satisfies a preset condition is further determined according to the mapped topic forest structure tree. When the visitor question satisfies the preset condition, the topic forest structure tree performs question matching from the knowledge base according to the visitor question; when it does not, the topic forest structure tree feeds the judgment result back to the front-end dialogue robot.
  • Further, the entity attributes include necessary attributes and optional attributes, the semantic slots include necessary semantic slots and optional semantic slots, and the preset condition is whether the necessary attributes are complete.
  • During the human-machine dialogue, topic-type matching is performed on the visitor question, and the visitor question is filled into the necessary semantic slots and/or optional semantic slots of the semantic framework model corresponding to the topic type. The contents of the filled necessary and optional semantic slots are then mapped to the necessary and optional attributes of the topic forest structure tree, and whether the necessary attributes are complete is determined according to the mapped tree.
  • When the necessary attributes of the visitor question are complete, the topic forest structure tree performs question matching from the knowledge base according to the visitor question. When they are incomplete, the topic forest structure tree feeds the missing necessary attributes back to the front-end dialogue robot, which asks the visitor follow-up questions about them until all necessary attributes of the topic type are obtained.
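  • The relationship between entity attributes and semantic slots can be pictured with a small data model. The following is a minimal sketch, not the applicant's implementation: the class names `TopicType` and `SemanticFrame`, the helper `frame_from_topic`, and the attribute names for the train inquiry topic are all hypothetical. It assumes that each entity attribute maps to one slot of the same name and that an unfilled slot holds `None`.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TopicType:
    """One topic in the topic forest, with its entity attributes (hypothetical model)."""
    name: str
    necessary: list[str]
    optional: list[str] = field(default_factory=list)


@dataclass
class SemanticFrame:
    """Frame generated from a topic: one slot per entity attribute."""
    topic: str
    necessary_slots: dict[str, Optional[str]]
    optional_slots: dict[str, Optional[str]]

    def missing_necessary(self) -> list[str]:
        # A necessary slot counts as missing while it still holds None.
        return [k for k, v in self.necessary_slots.items() if v is None]


def frame_from_topic(topic: TopicType) -> SemanticFrame:
    # Map every entity attribute to an empty slot of the same name.
    return SemanticFrame(
        topic=topic.name,
        necessary_slots={a: None for a in topic.necessary},
        optional_slots={a: None for a in topic.optional},
    )


# Attribute names below are assumptions made for illustration only.
train = TopicType("train inquiry", ["origin", "destination"], ["seat_class"])
frame = frame_from_topic(train)
frame.necessary_slots["origin"] = "Xiamen"
print(frame.missing_necessary())  # ['destination']
```

  • In this sketch the list returned by `missing_necessary` is what would be handed to the dialogue robot as the basis for its follow-up question.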
  • Preferably, step a further includes:
  • a1. collecting the original corpus and performing topic clustering on it to obtain different types of topics;
  • a2. identifying and extracting entity relationships for each topic type, and determining the entity attributes of each topic type according to the entity relationships;
  • a3. creating a topic structure tree for each type of topic according to the entity attributes, and creating a topic forest knowledge base for all topic types.
  • Further, in step a1, topic clustering of the original corpus uses an LDA topic model tool for topic extraction and topic classification.
  • Further, in step a2, the identification and extraction of entity relationships for each topic type is performed by syntactic parsing and semantic parsing of the original corpus, with entity information extracted and the relationships between entities labeled according to the parsing result.
  • Further, in step a3, the topic structure tree includes current topic information and inter-topic association information, and all types of topics are indexed by association according to the inter-topic association information to obtain the topic forest knowledge base.
  • Preferably, in step c, word segmentation and keyword extraction are performed on the visitor question, the topic type to which the extracted keywords belong is matched, and the visitor question is filled into the necessary semantic slots and/or optional semantic slots of the semantic framework model corresponding to that topic type.
  • Further, in step d, the contents of the filled necessary and optional semantic slots are mapped to the necessary and optional attributes of the topic forest structure tree, the extracted keywords are matched against the necessary and optional attributes, and whether any necessary attribute is missing is judged according to the matching result.
  • Correspondingly, the present invention also provides a human-machine dialogue system based on a semantic framework, which includes:
  • a topic structure tree creation module, which creates a topic forest structure tree according to the original corpus and extracts the entity attributes corresponding to each topic type in the topic forest structure tree;
  • a semantic framework model generation module, which generates a semantic framework model by using the topic forest structure tree and maps the entity attributes of the topic forest structure tree to the corresponding semantic slots in the semantic framework model;
  • a human-machine dialogue module, which performs topic-type matching on the visitor question and fills the visitor question into the semantic slots of the semantic framework model corresponding to the topic type;
  • a question matching module, which maps the visitor question held in the semantic slots of the filled semantic framework model to the entity attributes of the topic forest structure tree, the topic forest structure tree then performing question matching from the knowledge base according to the visitor question;
  • an answer feedback module, which feeds the answer corresponding to the matched question back to the visitor.
  • The beneficial effects of the invention are:
  • (1) The present invention parses the human-machine dialogue content through the mapping relationship between the topic forest structure tree and the semantic framework model, so that accurate and complete visitor questions can be obtained, which in turn ensures the accuracy of answers and improves communication efficiency.
  • (2) The present invention sets necessary and optional attributes for each topic when creating the topic forest knowledge base and maps them to the necessary and optional semantic slots in the semantic framework model. During the human-machine dialogue, the visitor question is matched to a topic, checked against the necessary attributes, and followed up where necessary attributes are missing, so that the system can actively interact with the visitor and improve the user experience.
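  • FIG. 1 is a schematic flow chart of a human-machine dialogue method based on a semantic framework according to the present invention;
  • FIG. 2 is a schematic structural diagram of a human-machine dialogue system based on a semantic framework according to the present invention.
  • A semantic-framework-based human-machine dialogue method of the present invention includes the following steps:
  • a. creating a topic forest structure tree according to the original corpus, and extracting the entity attributes corresponding to each topic type in the topic forest structure tree;
  • b. generating a semantic framework model by using the topic forest structure tree, and mapping the entity attributes of the topic forest structure tree to the corresponding semantic slots in the semantic framework model;
  • c. during the human-machine dialogue, performing topic-type matching on the visitor question, and filling the visitor question into the semantic slots of the semantic framework model corresponding to the topic type;
  • d. mapping the visitor question held in the semantic slots of the filled semantic framework model to the entity attributes of the topic forest structure tree;
  • e. the topic forest structure tree performing question matching from the knowledge base according to the visitor question, and feeding the answer corresponding to the matched question back to the visitor.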
  • The semantic framework is a form of knowledge representation.
  • Frame semantics is a cognitive linguistic theory proposed by the American linguist Fillmore.
  • A semantic slot is a "slot" in the frame, that is, a position to be filled with content.
  • Frame semantics is first and foremost a way of understanding and describing the meanings of words and grammatical structures. It starts from the assumption that, to understand the meaning of a word in a language, one must first have the relevant conceptual structure, that is, knowledge of the semantic frame.
  • The semantic frame provides the background and motivation for the existence of words in the language and for their use in discourse. Frame semantics assumes that a word, through the linguistic structure in which it occurs, selects and highlights certain aspects or instances of the underlying semantic frame, and that it does so in particular ways (according to certain principles). The meaning and function of a word can therefore be explained by first describing the underlying semantic frame and then detailing the ways in which the word selects and highlights parts of it.
  • (Example frame omitted: the original illustration did not survive text extraction.)
  • In the example, the brackets denote a semantic slot, and the content after the colon is the content filled into that slot.
  • Semantic slots can be divided into necessary semantic slots and optional semantic slots as needed.
  • The entity attributes include necessary attributes and optional attributes, and the semantic slots include necessary semantic slots and optional semantic slots.
  • During the human-machine dialogue, topic-type matching is performed on the visitor question, and the visitor question is filled into the necessary semantic slots and/or optional semantic slots of the semantic framework model corresponding to the topic type.
  • The contents of the filled necessary and optional semantic slots are mapped to the necessary and optional attributes of the topic forest structure tree.
  • In step d, whether the visitor question satisfies a preset condition is further determined according to the mapped topic forest structure tree. When the visitor question satisfies the preset condition, the topic forest structure tree performs question matching from the knowledge base according to the visitor question; when it does not, the topic forest structure tree feeds the judgment result back to the front-end dialogue robot.
  • The preset condition is whether the necessary attributes are complete; that is, completeness of the necessary attributes is determined according to the mapped topic forest structure tree. When the necessary attributes of the visitor question are complete, the topic forest structure tree performs question matching from the knowledge base according to the visitor question. When they are incomplete, the topic forest structure tree feeds the missing necessary attributes back to the front-end dialogue robot, and the dialogue robot asks the visitor follow-up questions about them to obtain all necessary attributes of the topic type.
  • The process then returns to step c and repeats the extraction of the visitor question, the filling of the semantic slots, the mapping to entity attributes, and the check for missing attributes, until all necessary attributes of the topic forest structure tree are satisfied.
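  • The repeat-until-complete behaviour can be sketched as a simple loop. This is an illustrative sketch under stated assumptions, not the patented implementation: `extract_slots` is a toy substring matcher standing in for real slot filling, and the function names, the `city` and `date` attributes, the lexicon, and the prompt wording are all hypothetical.

```python
def extract_slots(utterance, lexicon):
    """Toy extractor: a slot is filled when one of its known values occurs in the text."""
    found = {}
    for slot, values in lexicon.items():
        for value in values:
            if value in utterance:
                found[slot] = value
                break
    return found


def run_dialogue(turns, necessary, lexicon, max_rounds=5):
    """Repeat extract -> fill -> check until every necessary attribute is present."""
    turns = iter(turns)
    filled = {}
    prompts = []
    for _ in range(max_rounds):
        utterance = next(turns, None)
        if utterance is None:
            break  # the visitor stopped replying
        filled.update(extract_slots(utterance, lexicon))
        missing = [a for a in necessary if a not in filled]
        if not missing:
            return filled, prompts  # complete: ready for knowledge-base matching
        # Ask about the first missing necessary attribute, then loop again.
        prompts.append(f"Please tell me the {missing[0]}.")
    return None, prompts


# Hypothetical attributes, values, and visitor turns.
lexicon = {"city": ["Xiamen", "Beijing"], "date": ["today", "tomorrow"]}
turns = ["What will the weather be like?", "In Xiamen", "tomorrow"]
filled, prompts = run_dialogue(turns, ["city", "date"], lexicon)
print(filled)   # {'city': 'Xiamen', 'date': 'tomorrow'}
print(prompts)  # ['Please tell me the city.', 'Please tell me the date.']
```

  • The `max_rounds` bound is a design choice of this sketch so that the loop terminates even if the visitor never supplies a necessary attribute; the source does not specify such a limit.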
  • Step a further includes:
  • a1. collecting the original corpus and performing topic clustering on it to obtain different types of topics;
  • a2. identifying and extracting entity relationships for each topic type, and determining the entity attributes of each topic type according to the entity relationships;
  • a3. creating a topic structure tree for each type of topic according to the entity attributes, and creating a topic forest knowledge base for all topic types.
  • In step a1, topic clustering of the original corpus uses an LDA topic model tool for topic extraction and topic classification.
  • The original corpus refers to historical conversation records between visitors and customer service, and it is updated periodically or in real time as new conversation records arrive.
  • The LDA (Latent Dirichlet Allocation) topic model is a document-topic generation model, also called a three-layer Bayesian probability model, with a three-layer structure of words, topics, and documents.
  • The distribution from document to topic is multinomial, and the distribution from topic to word is also multinomial.
  • For each word, a topic is drawn from the document's topic distribution, and a word is then drawn from the word distribution of that topic; this is repeated until every word in the document has been traversed, which yields the topics of the document.
  • In the present invention, a document is a dialogue record between a visitor and customer service.
  • For example, the original corpus is divided into topics such as weather queries, train inquiries, and flight inquiries.
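  • The generative story behind LDA can be made concrete with a few lines of standard-library Python. This sketch only runs the model forwards (topic distribution, then topic, then word); an LDA tool does the reverse and infers the topics from observed documents. The vocabulary, the two topic-word distributions, and the Dirichlet parameter values are hypothetical.

```python
import random


def sample_dirichlet(alpha, rng):
    # Normalised Gamma draws yield a sample from a Dirichlet distribution.
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(n_words, topic_word, alpha, vocab, rng):
    """Generate one document by following the LDA generative process."""
    theta = sample_dirichlet(alpha, rng)  # document -> topic distribution
    words, topics = [], []
    for _ in range(n_words):
        z = rng.choices(range(len(theta)), weights=theta)[0]  # draw a topic
        w = rng.choices(vocab, weights=topic_word[z])[0]      # draw a word from it
        topics.append(z)
        words.append(w)
    return words, topics


rng = random.Random(0)
vocab = ["rain", "snow", "train", "ticket"]
topic_word = [
    [0.5, 0.5, 0.0, 0.0],  # topic 0 emits weather words only
    [0.0, 0.0, 0.5, 0.5],  # topic 1 emits train words only
]
words, topics = generate_document(8, topic_word, [0.5, 0.5], vocab, rng)
print(list(zip(words, topics)))
```

  • Each printed pair shows a generated word and the topic it was drawn from; with these distributions a weather word can only come from topic 0 and a train word only from topic 1.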
  • In step a2, the identification and extraction of entity relationships for each topic type is performed by syntactic parsing and semantic parsing of the original corpus, with entity information extracted and the relationships between entities labeled according to the parsing result; the result may be represented by an entity-relationship diagram.
  • An entity-relationship diagram (E-R diagram) describes the basic structure of data using three basic concepts: entity, relationship, and attribute.
  • The entity is a named entity, that is, a text entity with explicit semantic information, such as a name (organization name, person name, place name, trade name) or an expression (date, time). In the E-R diagram it is drawn as a rectangle with the entity name written inside; for example, the visitor is an entity.
  • An attribute is a property of an entity, and an entity can be characterized by several attributes. In the E-R diagram it is drawn as an ellipse connected to the corresponding entity by an undirected edge; for example, the visitor's name, account number, and gender are attributes.
  • A relationship is the way in which data objects are connected to each other, and may be one-to-one, one-to-many, or many-to-many.
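  • The three E-R concepts translate directly into a small data structure. The sketch below is illustrative only: the class names are hypothetical, and the `question` entity and the `asks` relationship are assumptions added to show a one-to-many relationship; only the visitor entity and its attributes come from the text.

```python
from dataclasses import dataclass
from enum import Enum


class Cardinality(Enum):
    ONE_TO_ONE = "1:1"
    ONE_TO_MANY = "1:N"
    MANY_TO_MANY = "M:N"


@dataclass(frozen=True)
class Entity:
    """Rectangle in the E-R diagram; attributes are its ellipses."""
    name: str
    attributes: tuple[str, ...] = ()


@dataclass(frozen=True)
class Relationship:
    """Connection between two entities with a cardinality."""
    name: str
    left: Entity
    right: Entity
    cardinality: Cardinality


visitor = Entity("visitor", ("name", "account number", "gender"))
question = Entity("question", ("text",))               # hypothetical entity
asks = Relationship("asks", visitor, question, Cardinality.ONE_TO_MANY)
print(asks.left.name, asks.cardinality.value, asks.right.name)  # visitor 1:N question
```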
  • In step a3, the topic structure tree includes current topic information and inter-topic association information, and all types of topics are indexed by association according to the inter-topic association information to obtain the topic forest knowledge base.
  • A conversation may be limited to a single topic within one domain, or it may involve multiple topics across multiple domains.
  • The topic type is looked up in the topic forest knowledge base, and the necessary and optional attributes of the topic type are retrieved to confirm that the question is complete.
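  • One way to picture a topic forest knowledge base is as a dictionary of topic trees, each carrying its attributes and a set of links to associated topics. This is a minimal sketch under assumptions: the class names `TopicNode` and `TopicForest`, the attribute names, and the chosen associations are hypothetical, and each tree is reduced to its root for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class TopicNode:
    """Root of one topic structure tree (simplified to a single node)."""
    name: str
    necessary: list[str]
    optional: list[str] = field(default_factory=list)
    related: set[str] = field(default_factory=set)  # inter-topic association info


class TopicForest:
    def __init__(self):
        self._topics = {}

    def add(self, node):
        self._topics[node.name] = node

    def associate(self, a, b):
        # Store the association on both trees so either topic leads to the other.
        self._topics[a].related.add(b)
        self._topics[b].related.add(a)

    def attributes(self, name):
        """Return (necessary, optional) attributes for a topic type."""
        node = self._topics[name]
        return node.necessary, node.optional

    def related_topics(self, name):
        return sorted(self._topics[name].related)


# Attribute names and associations are assumptions made for illustration.
forest = TopicForest()
forest.add(TopicNode("weather query", ["location", "date"], ["weather type"]))
forest.add(TopicNode("train inquiry", ["origin", "destination"]))
forest.add(TopicNode("flight inquiry", ["origin", "destination"]))
forest.associate("train inquiry", "flight inquiry")
forest.associate("train inquiry", "weather query")
print(forest.related_topics("train inquiry"))  # ['flight inquiry', 'weather query']
```

  • The association index lets a conversation that moves from one topic to another, for example from a train inquiry to the weather at the destination, find the related tree without searching the whole forest.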
  • In step c, topic-type matching is performed on the visitor question by word segmentation and keyword extraction: the topic type to which the extracted keywords belong is matched, and the visitor question is filled into the necessary semantic slots and/or optional semantic slots of the semantic framework model corresponding to that topic type.
  • Filling the visitor question into the semantic slots uses a natural language frame parser to place the corresponding content of the visitor question into each semantic slot of the semantic framework model.
  • In step d, the contents of the filled necessary and optional semantic slots are mapped to the necessary and optional attributes of the topic forest structure tree, the extracted keywords are matched against the necessary and optional attributes, and whether any necessary attribute is missing is judged according to the matching result.
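  • Keyword-based topic matching can be sketched as an overlap score between the question's tokens and a keyword set per topic. This is an assumption-laden illustration: the keyword sets are hypothetical, and `segment` is a plain English tokenizer standing in for the Chinese word segmentation the text describes, which would need a dedicated segmenter.

```python
import re

# Hypothetical keyword sets for three topic types named in the text.
TOPIC_KEYWORDS = {
    "weather query": {"weather", "rain", "snow", "haze"},
    "train inquiry": {"train", "ticket", "station"},
    "flight inquiry": {"flight", "airport", "ticket"},
}


def segment(text):
    # Stand-in for word segmentation: lowercase alphabetic tokens.
    return re.findall(r"[a-z]+", text.lower())


def match_topic(question):
    """Return the topic whose keywords overlap most with the question, or None."""
    tokens = set(segment(question))
    scores = {topic: len(tokens & kws) for topic, kws in TOPIC_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(match_topic("Will it rain in Xiamen tomorrow?"))    # weather query
print(match_topic("Is there a train ticket to Beijing?"))  # train inquiry
print(match_topic("Hello there"))                          # None
```

  • Returning `None` when no keyword matches leaves room for a fallback, such as asking the visitor to rephrase; the source does not describe what happens in that case.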
  • The weather query topic is taken as an example:
  • Optional attribute 1: weather type, such as rain, snow, or haze;
  • The semantic framework model generated according to the topic forest structure tree is as follows:
  • The results obtained after processing by the weather query semantic framework model are as follows:
  • Semantic framework: weather query
  • The semantic framework model maps the content obtained above to the topic forest structure tree. The topic forest structure tree determines that necessary attribute 1 and necessary attribute 2 are both satisfied, so the question is queried in the knowledge base and the query result (the answer) is fed back to the visitor.
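  • The weather query flow can be traced end to end in a short sketch. The extracted text does not name necessary attributes 1 and 2, so `location` and `date` below are assumptions, as are the knowledge-base entry and the reply strings; only the optional weather type attribute comes from the text.

```python
NECESSARY = ["location", "date"]  # assumed: the source does not name these two
OPTIONAL = ["weather_type"]       # from the text: rain, snow, haze, ...

# Hypothetical knowledge base keyed by the necessary attributes.
KNOWLEDGE_BASE = {
    ("Xiamen", "tomorrow"): "Light rain is expected in Xiamen tomorrow.",
}


def answer(filled_slots):
    # Map slot contents onto the topic's attributes (same names in this sketch).
    attributes = {k: filled_slots.get(k) for k in NECESSARY + OPTIONAL}
    missing = [k for k in NECESSARY if attributes[k] is None]
    if missing:
        # Incomplete: report what the dialogue robot should ask about.
        return "Follow-up needed for: " + ", ".join(missing)
    # Complete: query the knowledge base with the necessary attributes.
    key = tuple(attributes[k] for k in NECESSARY)
    return KNOWLEDGE_BASE.get(key, "No matching question in the knowledge base.")


print(answer({"location": "Xiamen", "date": "tomorrow", "weather_type": "rain"}))
print(answer({"location": "Xiamen"}))  # Follow-up needed for: date
```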
  • The present invention also provides a human-machine dialogue system based on a semantic framework, which includes:
  • a topic structure tree creation module, which creates a topic forest structure tree according to the original corpus and extracts the entity attributes corresponding to each topic type in the topic forest structure tree;
  • a semantic framework model generation module, which generates a semantic framework model by using the topic forest structure tree and maps the entity attributes of the topic forest structure tree to the corresponding semantic slots in the semantic framework model;
  • a human-machine dialogue module, which performs topic-type matching on the visitor question and fills the visitor question into the semantic slots of the semantic framework model corresponding to the topic type;
  • a question matching module, which maps the visitor question held in the semantic slots of the filled semantic framework model to the entity attributes of the topic forest structure tree, the topic forest structure tree then performing question matching from the knowledge base according to the visitor question;
  • an answer feedback module, which feeds the answer corresponding to the matched question back to the visitor.
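  • The runtime modules can be viewed as stages of one pipeline. The sketch below wires the dialogue, question matching, and answer feedback stages together as plain functions; it is an architectural illustration under assumptions, not the claimed system. `build_pipeline`, the stage signatures, the lambda stubs, and the knowledge-base entry are all hypothetical, and the two offline modules (tree creation and frame generation) are taken as already done.

```python
def build_pipeline(match_topic, fill_slots, map_to_attributes, match_question, feedback):
    """Compose the runtime stages into a single question handler."""
    def handle(question):
        topic = match_topic(question)             # dialogue module: topic type
        slots = fill_slots(topic, question)       # dialogue module: slot filling
        attrs = map_to_attributes(topic, slots)   # matching module: slot -> attribute
        matched = match_question(topic, attrs)    # matching module: knowledge base
        return feedback(matched)                  # answer feedback module
    return handle


# Toy stand-ins for each stage; names and data are illustrative only.
KB = {("weather query", "Xiamen"): "Sunny"}
handle = build_pipeline(
    match_topic=lambda q: "weather query",
    fill_slots=lambda topic, q: {"location": "Xiamen"} if "Xiamen" in q else {},
    map_to_attributes=lambda topic, slots: dict(slots),
    match_question=lambda topic, attrs: KB.get((topic, attrs.get("location"))),
    feedback=lambda ans: ans or "Sorry, no answer found.",
)
print(handle("Weather in Xiamen?"))  # Sunny
print(handle("Weather?"))            # Sorry, no answer found.
```

  • Passing each stage in as a function keeps the modules independently replaceable, which mirrors the modular decomposition in the text.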
  • The terms "comprises", "comprising", or any other variant thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or device comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device.
  • An element defined by the phrase "comprising a ..." does not exclude the presence of additional equivalent elements in the process, method, article, or device that comprises the element.
  • All or part of the steps of the foregoing embodiments may be implemented by hardware, or by a program instructing the relevant hardware.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A semantic-framework-based human-machine conversation method and system. The method comprises: creating a topic forest structure tree according to a raw corpus and generating a semantic framework model using the topic forest structure tree, and mapping an entity attribute of the topic forest structure tree to a corresponding semantic slot in a semantic framework model; during a human-machine conversation, matching a visitor question in terms of a topic type, and filling the visitor question into a semantic slot in a semantic framework model corresponding to the topic type; then mapping the visitor question filled into the semantic slot in the semantic framework model to the entity attribute of the topic forest structure tree; and finally, the topic forest structure tree matching a question in a knowledge base according to the visitor question, and feeding back an answer corresponding to the matching question to a visitor. Therefore, an accurate and complete visitor question can be acquired, so that the accuracy of an answer is ensured and the communication efficiency is improved based on same, and active interaction with a visitor can be achieved, thereby improving the user experience.

Description

Human-machine dialogue method and system based on a semantic framework
Technical field
The invention relates to the field of artificial intelligence, and in particular to a human-machine dialogue method based on a semantic framework and a system that applies the method.
Background
With the spread of the Internet and e-commerce and the development of artificial intelligence technology, intelligent customer service is becoming more and more common. Intelligent customer service is an industry-oriented application built on large-scale knowledge processing. It draws on large-scale knowledge processing, natural language understanding, knowledge management, automatic question answering, and reasoning technologies, and it is applicable across industries. It provides enterprises with fine-grained knowledge management, establishes a fast and effective natural-language channel for communication between enterprises and large numbers of users, and supplies the statistical analysis information required for refined management, which can greatly reduce an enterprise's labor costs in customer service.
The working principle of intelligent customer service is mainly an application of big-data knowledge processing: keywords are extracted from the visitor's input to determine the visitor's question, and the corresponding answer is then matched from the knowledge base and returned to the visitor. The premise of giving an accurate answer is being able to extract an accurate and complete question. However, in Chinese the same question can often be expressed in many ways and with many wording habits. As a result, the answer may fail to address the question, or the user's question may not be recognized at all, and the user experience declines.
Summary of the invention
To solve the above problems, the present invention provides a human-machine dialogue method and system based on a semantic framework. It parses the human-machine dialogue content through the mapping relationship between the topic forest structure tree and the semantic framework model, so that accurate and complete visitor questions can be obtained, which in turn ensures the accuracy of answers and improves communication efficiency.
To achieve the above object, the technical solution adopted by the present invention is:
A human-machine dialogue method based on a semantic framework, which comprises the following steps:
a. creating a topic forest structure tree according to the original corpus, and extracting the entity attributes corresponding to each topic type in the topic forest structure tree;
b. generating a semantic framework model by using the topic forest structure tree, and mapping the entity attributes of the topic forest structure tree to the corresponding semantic slots in the semantic framework model;
c. during the human-machine dialogue, performing topic-type matching on the visitor question, and filling the visitor question into the semantic slots of the semantic framework model corresponding to the topic type;
d. mapping the visitor question held in the semantic slots of the filled semantic framework model to the entity attributes of the topic forest structure tree;
e. the topic forest structure tree performing question matching from the knowledge base according to the visitor question, and feeding the answer corresponding to the matched question back to the visitor.
Preferably, in step d, whether the visitor question satisfies a preset condition is further determined according to the mapped topic forest structure tree. When the visitor question satisfies the preset condition, the topic forest structure tree performs question matching from the knowledge base according to the visitor question; when it does not, the topic forest structure tree feeds the judgment result back to the front-end dialogue robot.
Further, the entity attributes include necessary attributes and optional attributes, the semantic slots include necessary semantic slots and optional semantic slots, and the preset condition is whether the necessary attributes are complete. During the human-machine dialogue, topic-type matching is performed on the visitor question, and the visitor question is filled into the necessary semantic slots and/or optional semantic slots of the semantic framework model corresponding to the topic type. The contents of the filled necessary and optional semantic slots are mapped to the necessary and optional attributes of the topic forest structure tree, and whether the necessary attributes are complete is then determined according to the mapped tree. When the necessary attributes of the visitor question are complete, the topic forest structure tree performs question matching from the knowledge base according to the visitor question. When they are incomplete, the topic forest structure tree feeds the missing necessary attributes back to the front-end dialogue robot, and the dialogue robot asks the visitor follow-up questions about them to obtain all necessary attributes of the topic type.
Preferably, step a further includes:
a1. collecting the original corpus and performing topic clustering on it to obtain different types of topics;
a2. identifying and extracting entity relationships for each topic type, and determining the entity attributes of each topic type according to the entity relationships;
a3. creating a topic structure tree for each type of topic according to the entity attributes, and creating a topic forest knowledge base for all topic types.
Further, in step a1, topic clustering of the original corpus uses an LDA topic model tool for topic extraction and topic classification.
Further, in step a2, the identification and extraction of entity relationships for each topic type is performed by syntactic parsing and semantic parsing of the original corpus, with entity information extracted and the relationships between entities labeled according to the parsing result.
Further, in step a3, the topic structure tree includes current topic information and inter-topic association information, and all types of topics are indexed by association according to the inter-topic association information to obtain the topic forest knowledge base.
Preferably, in step c, word segmentation and keyword extraction are performed on the visitor question, the topic type to which the extracted keywords belong is matched, and the visitor question is filled into the necessary semantic slots and/or optional semantic slots of the semantic framework model corresponding to that topic type.
Further, in step d, the contents of the filled necessary and optional semantic slots are mapped to the necessary and optional attributes of the topic forest structure tree, the extracted keywords are matched against the necessary and optional attributes, and whether any necessary attribute is missing is judged according to the matching result.
Correspondingly, the present invention also provides a semantic-framework-based human-machine dialogue system, comprising:
a topic structure tree creation module, which creates a topic forest structure tree from the original corpus and extracts from the topic forest structure tree the entity attributes corresponding to each topic type;
a semantic framework model generation module, which generates a semantic framework model from the topic forest structure tree and maps the entity attributes of the topic forest structure tree to the corresponding semantic slots in the semantic framework model;
a human-machine dialogue module, configured to match a visitor question to a topic type and to fill the visitor question into the semantic slots of the semantic framework model corresponding to that topic type;
a question matching module, configured to map the visitor question held in the semantic slots of the filled semantic framework model to the entity attributes of the topic forest structure tree, whereupon the topic forest structure tree matches the visitor question against the questions in the knowledge base;
an answer feedback module, configured to return to the visitor the answer corresponding to the matched question.
The beneficial effects of the invention are:
(1) The invention parses the content of the human-machine dialogue through the mapping between the topic forest structure tree and the semantic framework model, so that an accurate and complete visitor question can be obtained; on that basis, the accuracy of the answer is safeguarded and communication efficiency is improved.
(2) When the topic-forest knowledge base is created, the invention sets required and optional attributes for each topic and maps them to the required and optional semantic slots of the semantic framework model. During a dialogue, the visitor question is matched to a topic, checked against the required attributes, and followed up with questions about any required attribute that is missing, so that the system can actively interact with the visitor and improve the user experience.
Description of the drawings
The drawings described here provide a further understanding of the invention and form a part of it. The illustrative embodiments of the invention and their descriptions are used to explain the invention and do not unduly limit it. In the drawings:
Figure 1 is a simplified flowchart of a semantic-framework-based human-machine dialogue method according to the invention;
Figure 2 is a schematic structural diagram of a semantic-framework-based human-machine dialogue system according to the invention.
Detailed description
To make the technical problem addressed by the invention, its technical solution, and its beneficial effects clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.
As shown in Figure 1, a semantic-framework-based human-machine dialogue method according to the invention includes the following steps:
a. creating a topic forest structure tree from the original corpus, and extracting from the topic forest structure tree the entity attributes corresponding to each topic type;
b. generating a semantic framework model from the topic forest structure tree, and mapping the entity attributes of the topic forest structure tree to the corresponding semantic slots in the semantic framework model;
c. during a human-machine dialogue, matching the visitor question to a topic type, and filling the visitor question into the semantic slots of the semantic framework model corresponding to that topic type;
d. mapping the visitor question held in the semantic slots of the filled semantic framework model to the entity attributes of the topic forest structure tree;
e. having the topic forest structure tree match the visitor question against the questions in the knowledge base, and returning to the visitor the answer corresponding to the matched question.
Here, a semantic framework (semantic frame) is a form of knowledge representation. Frame Semantics is a theory in cognitive linguistics proposed by the American linguist Fillmore, and the slots referred to below are the slots of such a frame. Frame semantics is first of all an approach to understanding and describing the meaning of words and grammatical constructions. It starts from the assumption that, in order to understand the meaning of a word in a language, one must first possess a conceptual structure, that is, knowledge of the semantic frame. The semantic frame supplies the background and motivation for a word's meaning to exist in the language and to be used in discourse. Frame semantics further assumes that a word, through the linguistic structure in which it occurs, selects and highlights certain aspects or instances of the underlying semantic frame, and that it does so in a principled way. The meaning and function of a word can therefore be explained by starting from a description of the underlying semantic frame and then characterizing in detail the ways in which the word selects from it.
An example taken from the semantic frame of a Chinese dictionary entry is as follows (Chinese forms are kept where they are the object of description):
[Word form]: 走
[Pinyin]: zou3
[Author]: Chen Qunxiu (陈群秀)
[Worksheet number]: 6
[Verb type]: transitive verb
[Number of arguments]: 2
[Number of senses]: 10**
[Sense number]: 10*
[Definition]: * 走 in its figurative sense.
[Basic pattern 1]: agent + 走 + theme (客事)
[Example sentences for basic pattern 1]: 中国坚定不移地走改革开放的道路。(China unswervingly follows the path of reform and opening up.) 教师队伍的建设走上规范化、法制化的轨道。(The building of the teaching workforce has moved onto a standardized, law-based track.)
[Basic pattern 2]: agent + 走 + direction
[Example sentences for basic pattern 2]: 中国足球队走向世界。(The Chinese football team is going out to the world.) 国产电梯正走向世界。(Domestically made elevators are going out to the world.) 三大民族史诗走上书架。(The three great ethnic epics have reached the bookshelves.) 帆船时代走向终结。(The age of sail is coming to an end.)
[Thematic role]: agent (施事)
[Syntactic function]: subject
[Semantic class]: {superclass | event | thing | space-time}
[Role examples]: universe, national economy, industry, Chinese football team, domestically made elevators, age of sail, works
[Thematic role]: theme (客事)
[Syntactic function]: object
[Semantic class]: {space | abstract entity}
[Role examples]: road, track, the right track
[Thematic role]: direction (方向)
[Syntactic function]: adverbial
[Semantic class]: {space | abstract entity | concrete entity}
[Role markers]: 向; 上; 朝
[Role examples]: world, bookshelf, end, brightness, implementation, rule of law, failure, stability, national betrayal
[Negative forms]: 不V; V不了
[Tense and aspect]: 要1V; 将要V; 即将V; 就要V; 快要V; V着; V着呢; 正在V; 正在V呢; V下去; V下来; V起来2
[Note 1]: extended pattern: location
Here, each bracketed label denotes a semantic slot and the content after the colon is the content filled into that slot. The slots can be divided into required semantic slots and optional semantic slots as needed.
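The "[label]: filler" layout above lends itself to a simple data structure. The following is a minimal sketch, not the patent's implementation: the names `Slot`, `Frame`, and `parse_frame` are hypothetical, and which labels count as required is an assumption passed in by the caller.

```python
import re
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional

# Matches "[label]: filler"; accepts both the ASCII and the full-width colon.
_SLOT_LINE = re.compile(r"^\[(?P<label>[^\]]+)\]\s*[:：]\s*(?P<filler>.*)$")


@dataclass
class Slot:
    """One semantic slot: the bracketed label plus the content after the colon."""
    name: str
    required: bool = False
    value: Optional[str] = None  # None means the slot is still unfilled


@dataclass
class Frame:
    """A semantic frame as an ordered list of slots (labels may repeat)."""
    name: str
    slots: List[Slot] = field(default_factory=list)

    def missing_required(self) -> List[str]:
        """Names of required slots that have no filler yet."""
        return [s.name for s in self.slots if s.required and s.value is None]


def parse_frame(name: str, lines: List[str],
                required: FrozenSet[str] = frozenset()) -> Frame:
    """Build a Frame from '[label]: filler' lines; other lines are skipped."""
    frame = Frame(name)
    for line in lines:
        match = _SLOT_LINE.match(line.strip())
        if match is None:
            continue
        label = match.group("label")
        filler = match.group("filler").strip()
        frame.slots.append(
            Slot(name=label, required=label in required, value=filler or None)
        )
    return frame
```

A list is used instead of a dictionary because the dictionary entry above repeats labels such as the thematic role for each argument.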
In this embodiment:
In step a, the entity attributes include required attributes and optional attributes.
In step b, the semantic slots include required semantic slots and optional semantic slots.
In step c, during a human-machine dialogue, the visitor question is matched to a topic type and filled into the required and/or optional semantic slots of the semantic framework model corresponding to that topic type; the visitor question held in the required and optional semantic slots of the filled semantic framework model is then mapped to the required and optional attributes of the topic forest structure tree.
In step d, the mapped topic forest structure tree is further used to determine whether the visitor question satisfies a preset condition. When the visitor question satisfies the preset condition, the topic forest structure tree matches the visitor question against the questions in the knowledge base; when it does not, the topic forest structure tree returns the result of the check to the front-end dialogue robot. In this embodiment, the preset condition is whether the required attributes are complete. That is, the mapped topic forest structure tree is used to determine whether the required attributes are complete. When the required attributes of the visitor question are complete, the topic forest structure tree matches the visitor question against the questions in the knowledge base. When they are incomplete, the topic forest structure tree returns the missing required attributes to the front-end dialogue robot, which asks the visitor follow-up questions about them in order to obtain all required attributes of the topic type. In this embodiment, after each follow-up question the method returns to step c to extract the visitor question, fill the semantic slots, map the entity attributes, check for missing attributes, and so on. This process repeats until all required attributes of the topic forest structure tree are satisfied and an answer is retrieved from the knowledge base and returned to the visitor.
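The follow-up loop described above (fill, check, ask, return to step c) can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: `collect_required`, `extract`, and `ask` are hypothetical names, the extractor and the question-asking channel are supplied by the caller, and the `max_turns` cap is an added safeguard that the source does not mention.

```python
from typing import Callable, Dict, List


def collect_required(
    first_utterance: str,
    required: List[str],
    extract: Callable[[str], Dict[str, str]],
    ask: Callable[[str], str],
    max_turns: int = 5,  # assumption: the source loops until complete, with no cap
) -> Dict[str, str]:
    """Keep asking follow-up questions until every required attribute is filled."""
    filled: Dict[str, str] = {}
    utterance = first_utterance
    for _ in range(max_turns):
        # Step c: extract content from the utterance and fill the slots.
        filled.update(extract(utterance))
        # Step d: check which required attributes are still missing.
        missing = [attr for attr in required if attr not in filled]
        if not missing:
            return filled
        # The dialogue robot asks about one missing attribute, then we loop to step c.
        utterance = ask(missing[0])
    raise RuntimeError("required attributes still missing after max_turns")
```

Asking about one missing attribute per turn is a design choice of this sketch; a robot could equally ask about all missing attributes in a single question.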
Step a further includes:
a1. collecting the original corpus and performing topic clustering on it to obtain different types of topics;
a2. identifying and extracting entity relationships for each topic type, and determining the entity attributes of each topic type from those entity relationships;
a3. creating, based on the entity attributes, a topic structure tree for each type of topic, and creating a topic-forest knowledge base covering all topic types.
The details are as follows:
In step a1, topic clustering of the original corpus is performed with an LDA topic model tool, which carries out topic extraction and topic classification. The original corpus consists of historical dialogue records between visitors and customer service agents, and it is updated periodically or in real time as new dialogue records arrive. The LDA (Latent Dirichlet Allocation) topic model is a generative model of document topics, also described as a three-level Bayesian probability model with word, topic, and document levels. The mapping from a document to its topics follows a multinomial distribution, and the mapping from a topic to its words also follows a multinomial distribution. For each document, a topic is drawn from the document's topic distribution and a word is drawn from the word distribution of the drawn topic; this is repeated for every word in the document, which yields the document's topics. In the present invention, a document is a dialogue record between a visitor and customer service. For example, an original corpus may be divided into topics such as weather queries, train queries, and flight queries.
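The generative story described above (draw a topic from the document's topic distribution, then draw a word from that topic's word distribution, once per word) can be sketched with the standard library alone. This shows only the generative direction; actually fitting an LDA model to dialogue records requires an inference procedure, which is not shown here. The function name, the example distributions, and the fixed seed are all assumptions made for illustration.

```python
import random
from typing import Dict, List


def generate_document(
    doc_topic: Dict[str, float],
    topic_word: Dict[str, Dict[str, float]],
    length: int,
    seed: int = 0,
) -> List[str]:
    """Sample `length` words following the two-stage multinomial draws of LDA."""
    rng = random.Random(seed)
    topics = list(doc_topic)
    topic_weights = [doc_topic[t] for t in topics]
    words: List[str] = []
    for _ in range(length):
        # Stage 1: draw a topic from the document's topic distribution.
        topic = rng.choices(topics, weights=topic_weights, k=1)[0]
        # Stage 2: draw a word from the word distribution of that topic.
        vocab = list(topic_word[topic])
        word_weights = [topic_word[topic][w] for w in vocab]
        words.append(rng.choices(vocab, weights=word_weights, k=1)[0])
    return words


# Hypothetical toy distributions for two of the topics named in the text.
TOPIC_WORD = {
    "weather query": {"rain": 0.5, "snow": 0.3, "tomorrow": 0.2},
    "train query": {"ticket": 0.6, "station": 0.4},
}
```

For example, `generate_document({"weather query": 0.7, "train query": 0.3}, TOPIC_WORD, 10)` produces a ten-word bag drawn mostly from the weather vocabulary.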
In step a2, entity relationships are identified and extracted for each topic type by performing syntactic parsing and semantic parsing on the original corpus and, based on the parsing results, extracting entity information and annotating the relationships between the entities; the result can be represented as an entity-relationship diagram. An entity-relationship diagram, abbreviated E-R diagram, summarizes the basic structure of data using three basic concepts: entity, relationship, and attribute. An entity here is a named entity, that is, a text entity with clear semantic content, including names (organization names, personal names, place names, product names) and expressions (dates, times). In an E-R diagram an entity is drawn as a rectangle with the entity name written inside; a visitor, for example, is an entity. An attribute is a property of an entity, and one entity can be characterized by several attributes. In an E-R diagram an attribute is drawn as an ellipse and connected to its entity by an undirected edge; a visitor's name, account, and gender, for example, are all attributes. A relationship is the way in which data objects are connected to one another, and may be one-to-one, one-to-many, or many-to-many.
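The three E-R concepts described above can be held in a small in-memory model. The sketch below is illustrative only: the class names are hypothetical, and the `Question` entity and the `asks` relationship are invented for the example, whereas the visitor entity and its name, account, and gender attributes come from the text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Cardinality(Enum):
    """The three relationship kinds named in the description."""
    ONE_TO_ONE = "1:1"
    ONE_TO_MANY = "1:N"
    MANY_TO_MANY = "M:N"


@dataclass
class Entity:
    """A rectangle in the E-R diagram, with its attribute ellipses."""
    name: str
    attributes: List[str] = field(default_factory=list)


@dataclass
class Relationship:
    """A connection between two entities with a cardinality."""
    name: str
    source: Entity
    target: Entity
    cardinality: Cardinality


def describe(rel: Relationship) -> str:
    """Render a relationship as a one-line text summary."""
    return f"{rel.source.name} -[{rel.name} {rel.cardinality.value}]-> {rel.target.name}"


visitor = Entity("Visitor", ["name", "account", "gender"])
question = Entity("Question", ["text"])  # hypothetical second entity
asks = Relationship("asks", visitor, question, Cardinality.ONE_TO_MANY)
```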
In step a3, each topic structure tree includes current-topic information and inter-topic association information; all types of topics are cross-indexed according to the inter-topic association information to obtain a topic-forest knowledge base. A dialogue may be confined to a single topic within one domain, or it may involve several topics across several domains, so the topics involved in the visitor question are matched first. When a single topic is involved, the matching topic type is looked up in the topic-forest knowledge base, and the required and optional attributes of that topic type are retrieved to confirm that the question is complete. When several topics are involved, several topic types are matched through the topic-forest knowledge base, and the required and optional attributes of each are retrieved to confirm, one by one, that the visitor question is complete with respect to each topic type.
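One way to picture the topic forest and the per-topic completeness check is the sketch below. It is an assumption-laden illustration: `TopicTree` and `TopicForest` are hypothetical names, the `related` list stands in for the inter-topic association information, and the attributes of the train query topic are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class TopicTree:
    """One topic with its required and optional attributes and its associations."""
    name: str
    required: List[str]
    optional: List[str] = field(default_factory=list)
    related: List[str] = field(default_factory=list)  # inter-topic association info


class TopicForest:
    """All topic trees, indexed by topic name."""

    def __init__(self, trees: Iterable[TopicTree]) -> None:
        self.trees: Dict[str, TopicTree] = {t.name: t for t in trees}

    def neighbours(self, name: str) -> List[str]:
        """Associated topics that actually exist in the forest."""
        return [n for n in self.trees[name].related if n in self.trees]

    def missing_by_topic(
        self, topics: Iterable[str], filled: Dict[str, Dict[str, str]]
    ) -> Dict[str, List[str]]:
        """For each matched topic, list the required attributes not yet filled."""
        return {
            topic: [a for a in self.trees[topic].required
                    if a not in filled.get(topic, {})]
            for topic in topics
        }


forest = TopicForest([
    TopicTree("weather query", ["time", "location"], ["weather type"], ["train query"]),
    TopicTree("train query", ["date", "origin", "destination"], [], ["weather query"]),
])
```

With a single matched topic the returned dictionary has one entry; with several matched topics each is checked in turn, mirroring the one-by-one confirmation described above.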
In step c, the visitor question is matched to a topic type by segmenting the question into words and extracting keywords; the extracted keywords are used to match the topic type to which the question belongs, and the visitor question is filled into the required and/or optional semantic slots of the semantic framework model corresponding to that topic type. The filling of the visitor question into the semantic slots is carried out by a natural-language frame parser, which places the corresponding content of the visitor question into each semantic slot of the semantic framework model.
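Step c can be sketched with a keyword lexicon. Everything here is a simplifying assumption: the tokenizer handles only lowercase English words (Chinese input would need a proper word segmenter), the keyword and slot lexicons are hypothetical, and overlap counting stands in for whatever matching the frame parser actually performs.

```python
import re
from typing import Dict, List, Optional, Set

# Hypothetical lexicons; a real system would derive these from the corpus.
TOPIC_KEYWORDS: Dict[str, Set[str]] = {
    "weather query": {"rain", "snow", "haze", "weather"},
    "train query": {"train", "ticket", "station"},
}
SLOT_LEXICON: Dict[str, Dict[str, Set[str]]] = {
    "weather query": {
        "time": {"today", "tomorrow"},
        "location": {"beijing", "xiamen"},
        "weather type": {"rain", "snow", "haze"},
    },
}


def tokenize(text: str) -> List[str]:
    """Very rough word segmentation: lowercase alphabetic runs only."""
    return re.findall(r"[a-z]+", text.lower())


def match_topic(tokens: List[str]) -> Optional[str]:
    """Pick the topic whose keyword set overlaps the tokens most, if any overlap."""
    token_set = set(tokens)
    scores = {topic: len(kw & token_set) for topic, kw in TOPIC_KEYWORDS.items()}
    best = max(scores, key=lambda topic: scores[topic])
    return best if scores[best] > 0 else None


def fill_slots(topic: str, tokens: List[str]) -> Dict[str, str]:
    """Fill each slot of the topic's frame with the first token found in its lexicon."""
    filled: Dict[str, str] = {}
    for slot, values in SLOT_LEXICON.get(topic, {}).items():
        for token in tokens:
            if token in values:
                filled[slot] = token
                break
    return filled
```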
In step d, the visitor question held in the required and optional semantic slots of the filled semantic framework model is mapped to the required and optional attributes of the topic forest structure tree, the extracted keywords are matched against those required and optional attributes, and the matching result is used to determine whether any required attribute is missing.
In this embodiment, the weather query topic is used as an example:
First, the topic forest structure tree is built:
Root: weather query;
Required attribute 1: time;
Required attribute 2: location;
Optional attribute 1: weather type, such as rain, snow, or haze;
Optional attribute 2: other.
Then, the semantic framework model is generated from the topic forest structure tree as follows:
Weather query framework
Slot 1, time:
Slot 2, location:
Slot 3, weather type:
Slot 4, other:
Next, the visitor question is obtained: "Will it rain in Beijing tomorrow?"
After it passes through the weather query semantic framework model, the result is as follows:
Semantic framework: weather query
Slot 1, time: tomorrow (the system can resolve this from the current system date);
Slot 2, location: Beijing
Slot 3, weather type: rain
Slot 4, other: none
The semantic framework model maps the content obtained above to the topic forest structure tree. The topic forest structure tree checks it and finds that required attribute 1 and required attribute 2 are both satisfied, so the question is looked up in the knowledge base and the query result (the answer) is returned to the visitor.
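The weather example can be traced in code as a sketch of steps b, d, and e. The dictionary layout, the function names, and the knowledge base are assumptions made for illustration; in particular the knowledge-base entry is a placeholder string, not real weather data, and keying it by (time, location) is a choice of this sketch.

```python
from typing import Dict, List, Optional, Tuple

WEATHER_TREE = {
    "root": "weather query",
    "required": ["time", "location"],
    "optional": ["weather type", "other"],
}

# Hypothetical knowledge base keyed by the two required attributes.
KNOWLEDGE_BASE: Dict[Tuple[str, str], str] = {
    ("tomorrow", "Beijing"): "placeholder answer for (tomorrow, Beijing)",
}


def frame_from_tree(tree: dict) -> dict:
    """Step b: create one slot per attribute and remember whether it is required."""
    slots = {name: {"required": True, "value": None} for name in tree["required"]}
    slots.update({name: {"required": False, "value": None} for name in tree["optional"]})
    return {"frame": tree["root"], "slots": slots}


def map_to_tree(frame: dict, tree: dict) -> Tuple[Dict[str, Optional[str]], List[str]]:
    """Step d: copy slot fillers onto tree attributes and list missing required ones."""
    attributes = {name: slot["value"] for name, slot in frame["slots"].items()}
    missing = [a for a in tree["required"] if attributes.get(a) is None]
    return attributes, missing


def answer(frame: dict, tree: dict) -> Optional[str]:
    """Step e: query the knowledge base only when the required attributes are complete."""
    attributes, missing = map_to_tree(frame, tree)
    if missing:
        return None  # the caller would ask a follow-up question instead
    return KNOWLEDGE_BASE.get((attributes["time"], attributes["location"]))
```

Filling the time, location, and weather type slots with "tomorrow", "Beijing", and "rain" reproduces the case above in which both required attributes are satisfied and the lookup proceeds.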
As shown in Figure 2, the invention also provides a semantic-framework-based human-machine dialogue system, comprising:
a topic structure tree creation module, which creates a topic forest structure tree from the original corpus and extracts from the topic forest structure tree the entity attributes corresponding to each topic type;
a semantic framework model generation module, which generates a semantic framework model from the topic forest structure tree and maps the entity attributes of the topic forest structure tree to the corresponding semantic slots in the semantic framework model;
a human-machine dialogue module, configured to match a visitor question to a topic type and to fill the visitor question into the semantic slots of the semantic framework model corresponding to that topic type;
a question matching module, configured to map the visitor question held in the semantic slots of the filled semantic framework model to the entity attributes of the topic forest structure tree, whereupon the topic forest structure tree matches the visitor question against the questions in the knowledge base;
an answer feedback module, configured to return to the visitor the answer corresponding to the matched question.
It should be noted that the embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and the parts that are the same or similar can be cross-referenced between embodiments. Because the system embodiment is essentially similar to the method embodiment, it is described more briefly; for related details, refer to the corresponding parts of the method embodiment.
In addition, the terms "include", "comprise", and any variants of them are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes it. Furthermore, a person of ordinary skill in the art will understand that all or part of the steps of the above embodiments can be implemented in hardware, or by a program that instructs the relevant hardware.
The above description shows and describes preferred embodiments of the invention. It should be understood that the invention is not limited to the forms disclosed here, and these should not be regarded as excluding other embodiments; the invention can be used in various other combinations, modifications, and environments, and can be altered within the scope of the inventive concept described here through the above teachings or through the skill or knowledge of the relevant field. Changes and variations made by those skilled in the art that do not depart from the spirit and scope of the invention shall fall within the protection scope of the appended claims.

Claims (10)

  1. A semantic-framework-based human-machine dialogue method, characterized in that it comprises the following steps:
    a. creating a topic forest structure tree from the original corpus, and extracting from the topic forest structure tree the entity attributes corresponding to each topic type;
    b. generating a semantic framework model from the topic forest structure tree, and mapping the entity attributes of the topic forest structure tree to the corresponding semantic slots in the semantic framework model;
    c. during a human-machine dialogue, matching the visitor question to a topic type, and filling the visitor question into the semantic slots of the semantic framework model corresponding to that topic type;
    d. mapping the visitor question held in the semantic slots of the filled semantic framework model to the entity attributes of the topic forest structure tree;
    e. having the topic forest structure tree match the visitor question against the questions in the knowledge base, and returning to the visitor the answer corresponding to the matched question.
  2. The semantic-framework-based human-machine dialogue method according to claim 1, characterized in that: in step d, the mapped topic forest structure tree is further used to determine whether the visitor question satisfies a preset condition; when the visitor question satisfies the preset condition, the topic forest structure tree matches the visitor question against the questions in the knowledge base; when the visitor question does not satisfy the preset condition, the topic forest structure tree returns the result of the determination to the front-end dialogue robot.
  3. The semantic-framework-based human-machine dialogue method according to claim 2, characterized in that: the entity attributes include required attributes and optional attributes, and the semantic slots include required semantic slots and optional semantic slots; the preset condition is whether the required attributes are complete; during a human-machine dialogue, the visitor question is matched to a topic type and filled into the required and/or optional semantic slots of the semantic framework model corresponding to that topic type; the visitor question held in the required and optional semantic slots of the filled semantic framework model is mapped to the required and optional attributes of the topic forest structure tree; the mapped topic forest structure tree is then used to determine whether the required attributes are complete; when the required attributes of the visitor question are complete, the topic forest structure tree matches the visitor question against the questions in the knowledge base; when the required attributes of the visitor question are incomplete, the topic forest structure tree returns the missing required attributes to the front-end dialogue robot, and the dialogue robot asks the visitor follow-up questions about the missing required attributes to obtain all required attributes of the topic type.
  4. The semantic-framework-based human-machine dialogue method according to any one of claims 1 to 3, characterized in that step a further comprises:
    a1. collecting the original corpus and performing topic clustering on it to obtain different types of topics;
    a2. identifying and extracting entity relationships for each topic type, and determining the entity attributes of each topic type from those entity relationships;
    a3. creating, based on the entity attributes, a topic structure tree for each type of topic, and creating a topic-forest knowledge base covering all topic types.
  5. The semantic-framework-based human-machine dialogue method according to claim 4, characterized in that: in step a1, topic clustering of the original corpus is performed with an LDA topic model tool, which carries out topic extraction and topic classification.
  6. The semantic-framework-based human-machine dialogue method according to claim 4, characterized in that: in step a2, entity relationships are identified and extracted for each topic type by performing syntactic parsing and semantic parsing on the original corpus and, based on the parsing results, extracting entity information and annotating the relationships between the entities.
  7. The semantic-framework-based human-machine dialogue method according to claim 4, characterized in that: in step a3, each topic structure tree includes current-topic information and inter-topic association information, and all types of topics are cross-indexed according to the inter-topic association information to obtain a topic-forest knowledge base.
  8. The semantic-framework-based human-machine dialogue method according to claim 3, characterized in that: in step c, the visitor question is segmented into words and keywords are extracted, the extracted keywords are used to match the topic type to which the question belongs, and the visitor question is filled into the required and/or optional semantic slots of the semantic framework model corresponding to that topic type.
  9. The semantic-framework-based human-machine dialogue method according to claim 8, characterized in that: in step d, the visitor question held in the required and optional semantic slots of the filled semantic framework model is mapped to the required and optional attributes of the topic forest structure tree, the extracted keywords are matched against those required and optional attributes, and the matching result is used to determine whether any required attribute is missing.
  10. A semantic-framework-based human-machine dialogue system, characterized in that it comprises:
    a topic structure tree creation module, which creates a topic forest structure tree from the original corpus and extracts from the topic forest structure tree the entity attributes corresponding to each topic type;
    a semantic framework model generation module, which generates a semantic framework model from the topic forest structure tree and maps the entity attributes of the topic forest structure tree to the corresponding semantic slots in the semantic framework model;
    a human-machine dialogue module, configured to match a visitor question to a topic type and to fill the visitor question into the semantic slots of the semantic framework model corresponding to that topic type;
    a question matching module, configured to map the visitor question held in the semantic slots of the filled semantic framework model to the entity attributes of the topic forest structure tree, whereupon the topic forest structure tree matches the visitor question against the questions in the knowledge base;
    an answer feedback module, configured to return to the visitor the answer corresponding to the matched question.
PCT/CN2018/124937 2018-04-28 2018-12-28 Semantic-framework-based human-machine conversation method and system WO2019205705A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810399238.4 2018-04-28
CN201810399238.4A CN108932278B (en) 2018-04-28 2018-04-28 Man-machine conversation method and system based on semantic framework

Publications (1)

Publication Number Publication Date
WO2019205705A1 true WO2019205705A1 (en) 2019-10-31

Family

ID=64448446

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/124937 WO2019205705A1 (en) 2018-04-28 2018-12-28 Semantic-framework-based human-machine conversation method and system

Country Status (2)

Country Link
CN (1) CN108932278B (en)
WO (1) WO2019205705A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108932278B (en) * 2018-04-28 2021-05-18 厦门快商通信息技术有限公司 Man-machine conversation method and system based on semantic framework
CN109885835B (en) * 2019-02-19 2023-06-27 广东小天才科技有限公司 Method and system for acquiring association relation between words in user corpus
CN112911073B (en) * 2019-04-30 2023-04-25 五竹科技(北京)有限公司 Intelligent knowledge graph construction method and device for outbound flow dialogue content
CN110427470B (en) * 2019-07-25 2024-05-28 腾讯科技(深圳)有限公司 Question and answer processing method and device and electronic equipment
CN111414764A (en) * 2020-03-18 2020-07-14 苏州思必驰信息科技有限公司 Method and system for determining skill field of dialog text
CN113326702B (en) * 2021-06-11 2024-02-20 北京猎户星空科技有限公司 Semantic recognition method, semantic recognition device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101414310A (en) * 2008-10-17 2009-04-22 山西大学 Method and apparatus for searching natural language
CN105701253A (en) * 2016-03-04 2016-06-22 南京大学 Chinese natural language interrogative sentence semantization knowledge base automatic question-answering method
CN105868313A (en) * 2016-03-25 2016-08-17 浙江大学 Mapping knowledge domain questioning and answering system and method based on template matching technique
CN107729493A (en) * 2017-09-29 2018-02-23 北京创鑫旅程网络技术有限公司 Travel the construction method of knowledge mapping, device and travelling answering method, device
CN108932278A (en) * 2018-04-28 2018-12-04 厦门快商通信息技术有限公司 Interactive method and system based on semantic frame

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103000052A (en) * 2011-09-16 2013-03-27 上海先先信息科技有限公司 Man-machine interactive spoken dialogue system and realizing method thereof
CN104050256B (en) * 2014-06-13 2017-05-24 西安蒜泥电子科技有限责任公司 Initiative study-based questioning and answering method and questioning and answering system adopting initiative study-based questioning and answering method
US10783159B2 (en) * 2014-12-18 2020-09-22 Nuance Communications, Inc. Question answering with entailment analysis
CN104573028B (en) * 2015-01-14 2019-01-25 百度在线网络技术(北京)有限公司 Realize the method and system of intelligent answer
CN105513593B (en) * 2015-11-24 2019-09-17 南京师范大学 A kind of intelligent human-machine interaction method of voice driven
CN105788593B (en) * 2016-02-29 2019-12-10 中国科学院声学研究所 Method and system for generating conversation strategy
CN105931638B (en) * 2016-04-26 2019-12-24 北京光年无限科技有限公司 Intelligent robot-oriented dialogue system data processing method and device
CN106653019B (en) * 2016-12-07 2019-11-15 华南理工大学 A kind of human-machine conversation control method and system based on user's registration information

Also Published As

Publication number Publication date
CN108932278B (en) 2021-05-18
CN108932278A (en) 2018-12-04

Similar Documents

Publication Publication Date Title
WO2019205705A1 (en) Semantic-framework-based human-machine conversation method and system
CN107291687B (en) Chinese unsupervised open type entity relation extraction method based on dependency semantics
US20220004714A1 (en) Event extraction method and apparatus, and storage medium
CN109960786A (en) Chinese Measurement of word similarity based on convergence strategy
CN106776797A (en) A kind of knowledge Q-A system and its method of work based on ontology inference
JP2017511922A (en) Method, system, and storage medium for realizing smart question answer
CN104933027A (en) Open Chinese entity relation extraction method using dependency analysis
Shah et al. Ontology-based information extraction: An overview and a study of different approaches
Specia et al. A hybrid approach for extracting semantic relations from texts
Nguyen et al. Ripple down rules for question answering
JP2013190985A (en) Knowledge response system, method and computer program
Li et al. Neural factoid geospatial question answering
Deshpande et al. Natural language query processing using probabilistic context free grammar
Nguyen et al. A vietnamese question answering system
CN107291700A (en) Entity word recognition method and device
CN114091464B (en) High-universality many-to-many relation triple extraction method fusing five-dimensional features
Kilgarriff Foreground and background lexicons and word sense disambiguation for information extraction
Clifton et al. Bangor at TREC 2004: Question Answering Track.
Mersch et al. An Information-Theoretic Sentence Similarity Metric
Shen et al. Sliqa-i: Towards cold-start development of end-to-end spoken language interface for question answering
Chen et al. An ontology learning method enhanced by frame semantics
Reshadat et al. Confidence measure estimation for open information extraction
Dai et al. Qam: question answering system based on knowledge graph in the military
Meštrovic Semantic matching using concept lattice
CN114661856A (en) Fusion map construction method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18916522

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18916522

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 22.04.2021)
