WO2017041372A1 - Man-machine interaction method and system based on artificial intelligence

Publication number
WO2017041372A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
user
chat
module
candidate
Prior art date
Application number
PCT/CN2015/096599
Other languages
French (fr)
Chinese (zh)
Inventor
王海峰
吴华
田浩
赵世奇
孙雯玉
吴甜
忻舟
马艳军
吕雅娟
Original Assignee
百度在线网络技术(北京)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to CN201510563338.2A (CN105068661B)
Priority to CN201510563338.2
Application filed by 百度在线网络技术(北京)有限公司
Publication of WO2017041372A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor

Abstract

A man-machine interaction method and system based on artificial intelligence. The method comprises the following steps: receiving input information entered by a user through an application terminal (S1); acquiring the user's intent information from the input information, and distributing the input information to at least one interaction service subsystem according to the intent information (S2); receiving a return result from the at least one interaction service subsystem (S3); and generating a user return result from the return result according to a preset decision strategy, and providing the user return result to the user (S4). With the method and system, the man-machine interaction system becomes anthropomorphic rather than a mere tool, and the user obtains a relaxed and pleasant experience during intelligent interaction through chat, search and other services. Keyword-based search is improved into natural-language-based search: the user can express needs in flexible, free natural language, and the multi-round interaction process is closer to interaction between people.

Description

Human-computer interaction method and system based on artificial intelligence

Cross-reference to related applications

This application claims priority to Chinese Patent Application No. 201510563338.2, entitled "Human-Computer Interaction Method and System Based on Artificial Intelligence" and filed on September 7, 2015 by Baidu Online Network Technology (Beijing) Co., Ltd.

Technical field

The present invention relates to the field of artificial intelligence technology, and in particular to a human-computer interaction method and system based on artificial intelligence.

Background

Artificial intelligence (AI) is a branch of computer science: a new technical science that studies and develops theories, methods, techniques and application systems for simulating, extending and expanding human intelligence.

With the continuous advancement of technology, search engines have become an indispensable part of daily life and are growing increasingly intelligent. In the traditional interaction mode, the user enters search keywords, and the search engine returns results related to the user's needs, sorted in descending order of relevance. The user browses and clicks the results and selects the information and content of interest. Some search engines additionally use box computing and knowledge graph technology. With box computing, the search engine directly provides a result or service for the query keywords entered by the user. For example, when a user searches for keywords such as "Beijing weather", "RMB to USD exchange rate" or "May Day holiday", the result is displayed at the very top of the search results page. Knowledge graph technology aims to organize and present knowledge related to the user's needs as a "knowledge graph", satisfying the user's needs for background knowledge and extended information. For example, for a search on "Andy Lau", the search engine can use the knowledge graph to display background knowledge such as Andy Lau's height, birthday, and film and television works, together with related people such as "Jacky Cheung" and "Zhu Liqian".

In addition, some search systems provide the resources a user needs through natural-language question-and-answer interaction. On smartphones, for example, users can obtain resources through mobile applications such as Apple Siri, Google Now and Baidu Voice Assistant. These applications mainly use speech as the carrier: the user issues instructions such as local services or web searches in natural language, and the results are fed back to the user by voice broadcast.

Furthermore, users can put questions to a deep question answering system and obtain answers, for example "Which provinces does the Yellow River flow through?" or "Which city is the capital of the United Kingdom?".

However, in the course of implementing the present invention, the inventors found at least the following problems in the prior art. Current systems can only answer simple questions already present in an existing knowledge base; they struggle to give effective answers to deep questions that are highly complex, time-sensitive, or tied to the user's subjective views. In a natural-language-based search system, once the current topic ends, the system must wait for the user to raise the next topic before answering. Lacking information about the associations between topics, the system cannot actively continue a topic or lead into a new one, cannot sustain interaction the way people do with each other, and lacks initiative and associative ability.

Summary of the invention

The present invention aims to solve, at least to some extent, one of the technical problems in the related art. To this end, a first object of the present invention is to provide a human-computer interaction method based on artificial intelligence that supports multi-round interaction and search in natural language, transforming the human-computer interaction system from a tool into an anthropomorphic intelligent system.

A second object of the present invention is to provide a human-computer interaction system based on artificial intelligence.

A third object of the present invention is to provide a device.

A fourth object of the present invention is to provide a non-volatile computer storage medium.

To achieve the above objects, an embodiment of the first aspect of the present invention provides a human-computer interaction method based on artificial intelligence, including: receiving input information entered by a user through an application terminal; acquiring the user's intent information from the user's input information, and distributing the input information to at least one interactive service subsystem according to the intent information; receiving a return result from the at least one interactive service subsystem; and generating a user return result from the return result according to a preset decision strategy, and providing the user return result to the user.

The human-computer interaction method based on artificial intelligence of the embodiments of the present invention has the following advantages: (1) The human-computer interaction system is transformed from a tool into an anthropomorphic system; through chat, search and other services, the user obtains a relaxed and pleasant experience during intelligent interaction, rather than mere search and question answering. (2) Keyword-based search is improved into natural-language-based search; the user can express needs in flexible, free natural language, and the multi-round interaction process is closer to interaction between people. (3) User-initiated search evolves into an around-the-clock companion service; based on a personalized model of the user, services such as recommendation can be provided anytime and anywhere.

An embodiment of the second aspect of the present invention provides a human-computer interaction system based on artificial intelligence, including: a first receiving subsystem, configured to receive input information entered by a user through an application terminal; a distribution subsystem, configured to acquire the user's intent information from the user's input information and to distribute the input information to at least one interactive service subsystem according to the intent information; a second receiving subsystem, configured to receive a return result from the at least one interactive service subsystem; a generation subsystem, configured to generate a user return result from the return result according to a preset decision strategy; and a providing subsystem, configured to provide the user return result to the user.

The human-computer interaction system based on artificial intelligence of the embodiments of the present invention has the following advantages: (1) The human-computer interaction system is transformed from a tool into an anthropomorphic system; through chat, search and other services, the user obtains a relaxed and pleasant experience during intelligent interaction, rather than mere search and question answering. (2) Keyword-based search is improved into natural-language-based search; the user can express needs in flexible, free natural language, and the multi-round interaction process is closer to interaction between people. (3) User-initiated search evolves into an around-the-clock companion service; based on a personalized model of the user, services such as recommendation can be provided anytime and anywhere.

An embodiment of the third aspect of the present invention provides a device, including: one or more processors; a memory; and one or more programs stored in the memory which, when executed by the one or more processors, perform the human-computer interaction method based on artificial intelligence of the embodiment of the first aspect of the present invention.

An embodiment of the fourth aspect of the present invention provides a non-volatile computer storage medium storing one or more programs which, when executed by a device, cause the device to perform the human-computer interaction method based on artificial intelligence of the embodiment of the first aspect of the present invention.

Brief description of the drawings

FIG. 1 is a flowchart of a human-computer interaction method based on artificial intelligence according to an embodiment of the present invention;

FIG. 2 is a flowchart of generating a user return result from the return results according to a preset decision strategy, according to an embodiment of the present invention;

FIG. 3 is a flowchart of the steps performed by the demand satisfaction service subsystem according to an embodiment of the present invention;

FIG. 4 is a flowchart of the steps performed by the vertical service module according to an embodiment of the present invention;

FIG. 5 is a flowchart of the specific process of performing at least one round of interaction with the user to obtain the query result the user needs, according to an embodiment of the present invention;

FIG. 6 is a flowchart of the process of acquiring information related to a query term according to an embodiment of the present invention;

FIG. 7 is a first schematic diagram of a user interface containing information related to a query term;

FIG. 8 is a second schematic diagram of a user interface containing information related to a query term;

FIG. 9 is a third schematic diagram of a user interface containing information related to a query term;

FIG. 10 is a fourth schematic diagram of a user interface containing information related to a query term;

FIG. 11 is a flowchart of the steps performed by the chat service subsystem according to an embodiment of the present invention;

FIG. 12 is a flowchart of the steps performed by the guidance and recommendation service subsystem according to an embodiment of the present invention;

FIG. 13 is a schematic diagram of a topic graph according to an embodiment of the present invention;

FIG. 14 is a schematic diagram of the case where the network text data is semi-structured data, according to an embodiment of the present invention;

FIG. 15 is a schematic diagram of the case where the network text data is structured data, according to an embodiment of the present invention;

FIG. 16 is a schematic diagram of acquiring user browsing behavior data according to an embodiment of the present invention;

FIG. 17 is a schematic diagram of building a topic graph according to an embodiment of the present invention;

FIG. 18 is a first schematic structural diagram of a human-computer interaction system based on artificial intelligence according to an embodiment of the present invention;

FIG. 19 is a first schematic structural diagram of the generation subsystem according to an embodiment of the present invention;

FIG. 20 is a second schematic structural diagram of the generation subsystem according to an embodiment of the present invention;

FIG. 21 is a second schematic structural diagram of a human-computer interaction system based on artificial intelligence according to an embodiment of the present invention;

FIG. 22 is a third schematic structural diagram of a human-computer interaction system based on artificial intelligence according to an embodiment of the present invention;

FIG. 23 is a fourth schematic structural diagram of a human-computer interaction system based on artificial intelligence according to an embodiment of the present invention;

FIG. 24 is a schematic structural diagram of the demand satisfaction service subsystem according to an embodiment of the present invention;

FIG. 25 is a schematic structural diagram of the vertical service module according to an embodiment of the present invention;

FIG. 26 is a first schematic structural diagram of the interaction submodule according to an embodiment of the present invention;

FIG. 27 is a second schematic structural diagram of the interaction submodule according to an embodiment of the present invention;

FIG. 28 is a schematic structural diagram of the fourth acquiring unit according to an embodiment of the present invention;

FIG. 29 is a schematic structural diagram of the deep question answering service module according to an embodiment of the present invention;

FIG. 30 is a first schematic structural diagram of the generation submodule according to an embodiment of the present invention;

FIG. 31 is a second schematic structural diagram of the generation submodule according to an embodiment of the present invention;

FIG. 32 is a third schematic structural diagram of the generation submodule according to an embodiment of the present invention;

FIG. 33 is a first schematic structural diagram of the information search service module according to an embodiment of the present invention;

FIG. 34 is a second schematic structural diagram of the information search service module according to an embodiment of the present invention;

FIG. 35 is a first schematic structural diagram of the chat service subsystem according to an embodiment of the present invention;

FIG. 36 is a second schematic structural diagram of the chat service subsystem according to an embodiment of the present invention;

FIG. 37 is a schematic structural diagram of the search-based chat module according to an embodiment of the present invention;

FIG. 38 is a schematic structural diagram of the knowledge-rich chat module according to an embodiment of the present invention;

FIG. 39 is a first schematic structural diagram of the profile-based chat module according to an embodiment of the present invention;

FIG. 40 is a second schematic structural diagram of the profile-based chat module according to an embodiment of the present invention;

FIG. 41 is a third schematic structural diagram of the chat service subsystem according to an embodiment of the present invention;

FIG. 42 is a fourth schematic structural diagram of the chat service subsystem according to an embodiment of the present invention;

FIG. 43 is a fifth schematic structural diagram of the chat service subsystem according to an embodiment of the present invention;

FIG. 44 is a sixth schematic structural diagram of the chat service subsystem according to an embodiment of the present invention;

FIG. 45 is a first schematic structural diagram of the guidance and recommendation service subsystem according to an embodiment of the present invention;

FIG. 46 is a second schematic structural diagram of the guidance and recommendation service subsystem according to an embodiment of the present invention;

FIG. 47 is a third schematic structural diagram of the guidance and recommendation service subsystem according to an embodiment of the present invention.

Detailed description

Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which identical or similar reference numerals denote identical or similar modules, or modules with identical or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the present invention and are not to be construed as limiting it.

The human-computer interaction method and system based on artificial intelligence of the embodiments of the present invention are described below with reference to the accompanying drawings.

FIG. 1 is a flowchart of a human-computer interaction method based on artificial intelligence according to an embodiment of the present invention.

As shown in FIG. 1, the human-computer interaction method based on artificial intelligence may include:

S1. Receive input information entered by a user through an application terminal.

The application terminal may be a PC, a mobile terminal or an intelligent robot. The input information may be text, image or voice information.

S2. Acquire the user's intent information from the user's input information, and distribute the input information to at least one interactive service subsystem according to the intent information.

The interactive service subsystems may include a demand satisfaction service subsystem, a guidance and recommendation service subsystem, a chat service subsystem, and the like. In an embodiment of the present invention, the user's intent information may be acquired from the user's input information, and the input information may then be distributed to the above interactive service subsystems according to the intent information.

In addition, customized task information may be received from the user, and the input information may be distributed to at least one interactive service subsystem according to that customized task information. For example, a user whose task involves only searching may customize just the demand satisfaction service subsystem; a user whose task involves both searching and chatting may customize the demand satisfaction service subsystem and the chat service subsystem. All of this can be customized to the user's actual needs.
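Step S2 can be illustrated with a minimal Python sketch. It is not taken from the patent: the subsystem identifiers, the keyword-based `classify_intent` stand-in for real intent analysis, and the use of a `customized` set to model the user's customized task information are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Set

# Hypothetical identifiers for the three subsystems named in the text.
DEMAND = "demand_satisfaction"
GUIDANCE = "guidance_recommendation"
CHAT = "chat"


def classify_intent(text: str) -> List[str]:
    """Toy keyword rules standing in for real intent analysis (assumption)."""
    lowered = text.lower()
    targets: List[str] = []
    if "?" in text or lowered.startswith(("what", "which", "where", "how")):
        targets.append(DEMAND)
    if any(word in lowered for word in ("recommend", "suggest")):
        targets.append(GUIDANCE)
    if not targets:
        targets.append(CHAT)  # fall back to chat when no demand is detected
    return targets


def dispatch(
    text: str,
    subsystems: Dict[str, Callable[[str], str]],
    customized: Optional[Set[str]] = None,
) -> Dict[str, str]:
    """Send the input to every subsystem selected by intent.

    If the user customized a set of subsystems, only those are called.
    """
    targets = classify_intent(text)
    if customized is not None:
        targets = [t for t in targets if t in customized]
    return {name: subsystems[name](text) for name in targets}
```

In this sketch each subsystem is simply a callable that takes the input text and returns its result; a user who customized only the demand satisfaction subsystem would pass `customized={DEMAND}`.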

S3. Receive the return result returned by the at least one interactive service subsystem.

S4. Generate a user return result from the return result according to a preset decision strategy, and provide the user return result to the user.

Specifically, as shown in FIG. 2, generating the user return result from the return result according to the preset decision strategy may include the following steps:

S41. Acquire the demand analysis feature of the input information.

S42. Acquire the confidence feature of the return result returned by the interactive service subsystem, the context feature of the user's dialogue interaction information, and the user's personalized model feature.

S43. Make a decision on the return results according to the demand analysis feature, the confidence feature of the return result, the context feature of the user's dialogue interaction information, and the user's personalized model feature, so as to determine the user return result.

Specifically, the decision that determines the user return result is based mainly on the following features: (1) The demand analysis feature: by analyzing the demand expressed in the user's question information, the question-and-answer result from the question-and-answer service module that best matches the user's needs can be selected. (2) The confidence feature of the question-and-answer result: each question-and-answer service module provides its result with a confidence, and a result with high confidence can be selected. (3) The context feature of the user's dialogue interaction information: a result that better fits the context can be selected. (4) The user's personalized model feature: a result that better fits the user's personal needs can be selected. The demand analysis feature, the confidence feature of the question-and-answer result, the context feature of the user's dialogue interaction information, and the user's personalized model feature each have their own decision weight. The question-and-answer results are decided on the basis of these features to determine the final question-and-answer result, which is then fed back to the user to satisfy the user's needs. The result may be fed back by voice broadcast or by on-screen display. Voice broadcast makes the human-computer interaction process simpler and more natural.

In addition, the decision weights of the demand analysis feature, the confidence feature of the question-and-answer result, the context feature of the user's dialogue interaction information, and the user's personalized model feature may be trained from the user's logs using a reinforcement learning model, so as to provide question-and-answer results that better match the user's needs.

The demand analysis feature, the confidence feature of the return result, the context feature of the user's dialogue interaction information, and the user's personalized model feature each have their own decision weight.
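The weighted decision of steps S41 to S43 can be sketched as follows. The patent states only that each of the four features has its own decision weight; the linear weighted sum, the feature keys, the normalization of feature values to [0, 1], and the `Candidate` type below are assumptions chosen for illustration. The weights are passed in as fixed numbers here, whereas the text describes learning them from user logs.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical keys for the four features named in the text.
FEATURES = ("demand", "confidence", "context", "personalization")


@dataclass
class Candidate:
    source: str                 # which subsystem or module produced the result
    text: str                   # the candidate return result
    features: Dict[str, float]  # feature values, assumed normalized to [0, 1]


def score(candidate: Candidate, weights: Dict[str, float]) -> float:
    """Weighted sum of the four features (a linear model is an assumption)."""
    return sum(weights[f] * candidate.features.get(f, 0.0) for f in FEATURES)


def decide(candidates: List[Candidate], weights: Dict[str, float]) -> Candidate:
    """Return the candidate with the highest weighted score."""
    return max(candidates, key=lambda c: score(c, weights))
```

With weights of 0.4, 0.3, 0.2 and 0.1 for the four features, a candidate that matches the demand well can outrank a candidate with higher raw confidence, which is the trade-off the per-feature weights are meant to control.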

After the user return result is generated, it may be converted into natural language and broadcast to the user. Alternatively, the text corresponding to the user return result may be displayed to the user directly.

In an embodiment of the present invention, if the user return result includes an execution instruction, the execution instruction may be sent to the corresponding execution subsystem and executed by that subsystem. Execution instructions may include, but are not limited to, hardware action instructions, music playback instructions, and story reading instructions. Hardware action instructions are aimed mainly at intelligent robots: an intelligent robot may have hardware components such as a head, a torso and limbs, and can therefore execute instructions that control those components, such as "nod", "smile" or "raise your hand". Music playback instructions typically include start playback, stop playback, previous track, next track, louder, quieter, and so on. Note that a user's search for music of a particular type or style (such as "music suitable for listening to before bed" or "good songs by Jay Chou") is not a music playback instruction. Story reading instructions are aimed mainly at child-oriented applications, for example an intelligent robot telling a child stories in place of the parents. As with music playback instructions, a search for stories on a particular theme, character or plot is not a story reading instruction.

In addition, after the input information entered by the user through the application terminal is received, the preceding context of the interaction with the user may be acquired, and the input information may then be completed according to that context. Specifically, during multi-round interaction the user usually omits part of the input information on the strength of the preceding dialogue, so the input information needs to be completed in order to clarify the user's demand. For example, if the preceding dialogue is "What snacks are there in Beijing?" and the input information is "What about specialties?", the input information needs to be completed to generate the new question information "What specialties does Beijing have?".
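The completion of an elliptical follow-up can be sketched with slot frames. This is only an illustration under strong assumptions: it presumes an upstream parser (not shown, and not described in the patent) has already turned each utterance into a dictionary of slots, and the slot names `place` and `item` and the `render` template are hypothetical.

```python
from typing import Dict, Optional


def complete_frame(
    previous: Dict[str, str],
    current: Dict[str, Optional[str]],
) -> Dict[str, str]:
    """Fill slots the current turn left out with values from the previous turn."""
    completed = dict(previous)
    # Slots the user actually stated in the current turn override the context.
    completed.update({k: v for k, v in current.items() if v is not None})
    return completed


def render(frame: Dict[str, str]) -> str:
    """Hypothetical template that turns a completed frame back into a question."""
    return f"What {frame['item']} does {frame['place']} have?"
```

For the example in the text, the previous turn would carry `{"place": "Beijing", "item": "snacks"}` and the follow-up would carry `{"place": None, "item": "specialties"}`, so the place is inherited from the context while the item is replaced.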

In an embodiment of the present invention, if no user return result satisfying the user's demand exists in the network resources, the user's input information may be recorded, and the network resources may then be monitored at a preset period for a user return result that satisfies the demand; once such a result exists, it is provided to the user. For example, if a user searches for a film that has just been released and no corresponding resource is yet available online, the user's demand can be recorded and the network resources searched at regular intervals for the resource. Once the resource is found, it can be pushed to the user, which achieves asynchronous demand satisfaction.
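Asynchronous demand satisfaction can be sketched as a queue of unmet queries that is re-checked on each polling cycle. The class name, the `search` callable (which returns `None` when no resource exists) and the tuple layout are assumptions for illustration; the scheduling of `poll` at the preset period is left to the caller.

```python
from typing import Callable, List, Optional, Tuple


class PendingRequests:
    """Records queries with no result yet and retries them when polled."""

    def __init__(self, search: Callable[[str], Optional[str]]) -> None:
        self._search = search
        self._pending: List[Tuple[str, str]] = []  # (user_id, query)

    def ask(self, user_id: str, query: str) -> Optional[str]:
        """Answer immediately if possible; otherwise record the unmet demand."""
        result = self._search(query)
        if result is None:
            self._pending.append((user_id, query))
        return result

    def poll(self) -> List[Tuple[str, str, str]]:
        """Retry pending queries; return (user_id, query, result) items to push."""
        delivered: List[Tuple[str, str, str]] = []
        still_pending: List[Tuple[str, str]] = []
        for user_id, query in self._pending:
            result = self._search(query)
            if result is None:
                still_pending.append((user_id, query))
            else:
                delivered.append((user_id, query, result))
        self._pending = still_pending
        return delivered
```

A caller would invoke `poll()` from a timer or scheduler at the preset period and push each delivered item to the corresponding user; satisfied queries are dropped from the queue so they are pushed only once.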

As shown in FIG. 3, the demand satisfaction service subsystem may perform the following steps:

S31. Acquire the question information input by the user.

The question information may be text information or voice information. For example, the user inputs the question information "What snacks are there in Beijing?".

S32. Acquire the user demand information of the user according to the question information.

Specifically, demand analysis may be performed on the question information to acquire the user demand information. For example, the user demand information may be a vertical-category demand, an Aladdin demand, a deep question-answering demand, an information search demand, or the like.

S33. Distribute the question information to at least one corresponding question-answering service module according to the user demand information.

The question-answering service modules may include an Aladdin service module, a vertical-category service module, a deep question-answering service module, and an information search service module.

In one embodiment of the present invention, when the user demand information is an Aladdin demand, the question information may be distributed to the Aladdin service module; when the user demand information is a vertical-category demand, the question information may be distributed to the vertical-category service module; when the user demand information is a deep question-answering demand, the question information may be distributed to the deep question-answering service module; and when the user demand information is an information search demand, the question information may be distributed to the information search service module.
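The distribution in step S33 amounts to a lookup from demand type to service module. The sketch below assumes each module is a plain function; the demand-type keys and the stub modules are hypothetical placeholders for the four service modules named above.

```python
from typing import Callable, Dict, List


def aladdin_service(question: str) -> str:
    return f"aladdin:{question}"        # stub standing in for the Aladdin service module


def vertical_service(question: str) -> str:
    return f"vertical:{question}"       # stub for the vertical-category service module


def deep_qa_service(question: str) -> str:
    return f"deep_qa:{question}"        # stub for the deep question-answering service module


def search_service(question: str) -> str:
    return f"search:{question}"         # stub for the information search service module


SERVICE_MODULES: Dict[str, Callable[[str], str]] = {
    "aladdin": aladdin_service,
    "vertical": vertical_service,
    "deep_qa": deep_qa_service,
    "search": search_service,
}


def distribute(question: str, demand_types: List[str]) -> Dict[str, str]:
    """Send the question to every module matching a detected demand type."""
    results = {}
    for demand in demand_types:
        module = SERVICE_MODULES.get(demand)
        if module is not None:          # unknown demand types are skipped
            results[demand] = module(question)
    return results


answers = distribute("What snacks are there in Beijing?", ["aladdin", "search"])
```

Because a question may carry more than one demand type, the function returns one result per module, which is the input that the decision in step S34 would then consume.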

The Aladdin service is a collective term for a class of services that can satisfy a user's demand precisely, for example the US dollar to RMB exchange rate or the 2015 Spring Festival holiday schedule. For example, if the user's question information is "Who is Andy Lau's wife?", the question information can be analyzed to determine that the demand type is "person", the query subject is "Andy Lau", and the query attribute is "老婆" (a colloquial word for wife). The query attribute can then be normalized to the standard term "妻子" (wife). A search then returns the result field "Zhu Liqian", and natural language generation (NLG) is used to produce the question-answering result "Andy Lau's wife is Zhu Liqian". As another example, if the user's question information is "Will it be hot in Beijing tomorrow?", a search returns the result field "35 degrees Celsius", and the question-answering result "It will be very hot tomorrow, with a high of 35 degrees Celsius; take precautions against the heat." can be generated based on a common-sense knowledge base and preset rules. The common-sense knowledge base may include common-sense knowledge, for example that a temperature above 30 degrees Celsius counts as hot weather.
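The two examples can be sketched with a lookup table and templates. This is only an illustration under stated assumptions: the alias table, the toy knowledge base, the answer templates, and the wording of the "not hot" branch are all hypothetical; only the 30 degree rule and the example values come from the text.

```python
from typing import Optional

# Hypothetical alias table: colloquial attribute names map to one canonical name.
ATTRIBUTE_ALIASES = {"老婆": "wife", "妻子": "wife", "wife": "wife"}
# Toy knowledge base keyed by (query subject, canonical attribute).
KNOWLEDGE_BASE = {("Andy Lau", "wife"): "Zhu Liqian"}
# Common-sense rule from the text: above 30 degrees Celsius counts as hot.
HOT_THRESHOLD_C = 30


def answer_attribute_query(subject: str, attribute: str) -> Optional[str]:
    """Normalize the attribute, look up the result field, and fill a template."""
    canonical = ATTRIBUTE_ALIASES.get(attribute, attribute)
    value = KNOWLEDGE_BASE.get((subject, canonical))
    if value is None:
        return None
    return f"{subject}'s {canonical} is {value}."


def answer_hot_query(max_temp_c: int) -> str:
    """Turn a temperature result field into a sentence using the common-sense rule."""
    if max_temp_c > HOT_THRESHOLD_C:
        return (f"It will be very hot tomorrow, with a high of {max_temp_c} "
                "degrees Celsius; take precautions against the heat.")
    # Wording of this branch is an assumption; the text only gives the hot case.
    return f"It will not be especially hot tomorrow; the high is {max_temp_c} degrees Celsius."


spouse_answer = answer_attribute_query("Andy Lau", "老婆")
weather_answer = answer_hot_query(35)
```

Template filling is the simplest form of natural language generation; the embodiment does not say which generation technique is used.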

The vertical-category service is a service that carries out multi-round interaction for vertical-category demands such as "booking a flight". It mainly uses dialogue management and dialogue policy techniques to clarify the user's demand and thereby provide a question-answering result that satisfies it. For example, if the user's question information is "flights from Beijing to Shanghai", the question information can be analyzed and the user asked in return "What is your departure date?". The user answers "tomorrow", and the service continues with "Do you have a preferred airline?", and so on, progressively clarifying the user's demand and finally returning a question-answering result that satisfies it.
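The clarification loop in the flight example resembles slot filling. The following sketch assumes a fixed, ordered list of required slots; the slot names and the function `next_clarifying_question` are hypothetical, and a real dialogue policy would be considerably richer.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical slots still needed for a flight booking, each with its clarifying question.
REQUIRED_SLOTS: List[Tuple[str, str]] = [
    ("departure_date", "What is your departure date?"),
    ("airline", "Do you have a preferred airline?"),
]


def next_clarifying_question(filled: Dict[str, str]) -> Optional[str]:
    """Return the question for the first unfilled slot, or None when nothing is missing."""
    for slot, prompt in REQUIRED_SLOTS:
        if slot not in filled:
            return prompt
    return None


state = {"origin": "Beijing", "destination": "Shanghai"}   # from "flights from Beijing to Shanghai"
first_question = next_clarifying_question(state)
state["departure_date"] = "tomorrow"                        # the user's reply
second_question = next_clarifying_question(state)
state["airline"] = "no preference"
final_question = next_clarifying_question(state)            # no further clarification needed
```

When the function returns `None`, the demand is fully specified and the service can return the question-answering result.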

S34. Receive the question-answering result returned by the at least one question-answering service module, and make a decision on the question-answering results to determine the final question-answering result.

The method of making a decision on the question-answering results to determine the final question-answering result uses the same technical means as the method, in step S43, of making a decision on the return results to determine the user return result, and is not described again here.

Specifically, as shown in FIG. 4, the vertical-category service module may perform the following steps:

S331. Acquire the query word input by the user.

In one embodiment of the present invention, the user may input the query word in various ways, for example as text, voice, or an image.

When the user inputs by voice or image, the input voice or image may be converted into a natural-language query word that the user can readily understand, and the corresponding text is displayed on the interaction interface.

For example, after the user inputs a query word by voice, the voice may be converted into the corresponding text based on a language model, and the query word is displayed on the interaction interface in natural-language form.

S332. Determine the vertical category to which the query word belongs.

Specifically, after the query word input by the user is obtained, the vertical category to which it belongs needs to be determined, so that the subsequent interaction with the user, or the retrieval of information related to the query word, can take place within that vertical category. The vertical category can be determined in several ways, and a method may be selected according to actual needs. Examples follow:

(1) Determining the vertical category to which the query word belongs by machine learning.

Specifically, queries related to vertical categories are first mined from search engine logs (including voice search logs) and labeled to build a vertical-category training data set. Features are then extracted from the training data, and a machine learning classifier (for example, a maximum entropy model or a support vector machine) is trained to classify vertical-category demand queries according to the extracted features, so as to determine the correspondence between query words and vertical categories, which is then stored.

It should be noted that, where there are multiple vertical categories, classification may use either a single multi-class model covering all categories or a separate binary model for each vertical category, followed by a unified final decision.

Specifically, once a query word is obtained, its vertical category can be determined from the correspondence between query words and vertical categories. For example, when the query word "novels by Tiancan Tudou (天蚕土豆)" is received, the query word contains an author name and the word "novel", so the machine learning approach can determine that the corresponding vertical category is the novel category.

(2) Determining the vertical category to which the query word belongs by pattern parsing.

To make pattern parsing possible, a keyword list may be built for each vertical category (for example the novel, food, place, and restaurant categories) before the vertical category of a query word is determined, and the correspondence between vertical categories and keywords is stored.

After the query word input by the user is received, the entities and keywords in the query may be parsed using techniques such as word segmentation and named entity recognition, and the parsing result is matched against the pattern set of each vertical category. If a match succeeds, the query is distributed to the corresponding vertical category.

Take the restaurant-finding category as an example. Suppose the query word currently input by the user is "quiet restaurants near Sanlitun". Basic lexical analysis such as word segmentation and named entity recognition is first performed on the query, from which the pattern corresponding to the query is determined to be [Location]_[Style]_[Restaurant]. A pattern set is mined separately for each category. In other words, a query to be distributed is first analyzed by basic lexical analysis such as word segmentation and named entity recognition, the analysis result is then matched against the pattern sets of the vertical categories, and if a match succeeds the query is distributed to the corresponding vertical category.
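The pattern-parsing route can be sketched as follows. The sketch assumes that word segmentation has already produced the token list and replaces named entity recognition with a small hypothetical lexicon; the lexicon entries, the pattern sets, and the rule that untagged tokens are ignored are all illustrative assumptions.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical lexicon standing in for named entity recognition and keyword lists.
ENTITY_LEXICON: Dict[str, str] = {
    "Sanlitun": "Location",
    "quiet": "Style",
    "restaurant": "Restaurant",
    "Tiancan Tudou": "Author",
    "novel": "Novel",
}

# Hypothetical pattern sets, one per vertical category.
VERTICAL_PATTERNS: Dict[str, Set[Tuple[str, ...]]] = {
    "restaurant": {("Location", "Style", "Restaurant"), ("Style", "Restaurant")},
    "novel": {("Author", "Novel")},
}


def tag_sequence(tokens: List[str]) -> Tuple[str, ...]:
    """Map tokens to entity or keyword tags; tokens missing from the lexicon are ignored."""
    return tuple(ENTITY_LEXICON[token] for token in tokens if token in ENTITY_LEXICON)


def match_vertical(tokens: List[str]) -> Optional[str]:
    """Return the vertical category whose pattern set contains the query's tag sequence."""
    tags = tag_sequence(tokens)
    for vertical, patterns in VERTICAL_PATTERNS.items():
        if tags in patterns:
            return vertical
    return None


# Tokens follow the word order of the original query (Sanlitun / nearby / quiet / restaurant).
matched = match_vertical(["Sanlitun", "nearby", "quiet", "restaurant"])
```

With these assumptions the example query yields the tag sequence (Location, Style, Restaurant) and is routed to the restaurant category; a query that matches no pattern returns `None` and would be left to the other determination method.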

S333. Within the vertical category to which the query word belongs, carry out at least one round of interaction with the user to obtain the query result the user needs, wherein in each round of interaction the information presented to the user includes the query result corresponding to the query word and guidance information.

In one embodiment of the present invention, the specific process of carrying out at least one round of interaction with the user within the vertical category to which the query word belongs to obtain the query result the user needs, as shown in FIG. 5, may include:

S3331. Parse the query word into structured information that can be represented by the vertical-category knowledge system of the vertical category to which the query word belongs.

The vertical-category knowledge system of each vertical category is established in advance, based on the information provided by structured web pages of the vertical category and on a user demand representation system.

The user demand representation system is a semantic representation system for user demands; specifically, semantic and structural knowledge can be mined from it.

It should be noted that user demand is determined from query words. That is, the user demand representation system contains a large number of query words corresponding to user demands, and the semantic and structural knowledge of those query words can be obtained by analyzing them.

The structure of the vertical-category knowledge system differs from one vertical category to another. An example is given below.

For example, the structure of the vertical-category knowledge system for the restaurant category is shown in Table 1.

Table 1. Structure of the vertical-category knowledge system for the restaurant category

[Table 1 is reproduced as an image in the original publication: Figure PCTCN2015096599-appb-000001]

As Table 1 shows, the vertical-category knowledge system for the restaurant category contains multiple dimensions of restaurant-related information, such as location, cuisine, taste, and environment, together with the possible values of each dimension.

S3332. Acquire related information according to the structured information, the vertical-category knowledge system, and the vertical-category resource library of the vertical category to which the query word belongs.

The related information may include, but is not limited to, the query result corresponding to the query word and guidance information.

In one embodiment of the present invention, in order to obtain the vertical-category resources of the vertical category to which the query word belongs, before step S3332 the structured resources and unstructured resources of that vertical category may also be acquired and combined to form the vertical-category resource library.

The structured resources are the full data resources obtained by crawling and integrating data from multiple corresponding vertical-category websites. The unstructured resources are supplementary or extended information for the structured resources, mined from user query words or Internet text.

The following takes novels as an example to describe how the vertical-category resources of the novel category are assembled from its structured resources and unstructured resources.

Structured resources of a vertical category usually have a complex architecture. When assembling the resources of the novel category, the structured resources may be acquired first. Specifically, the full data resources may be built by crawling information about novels from mainstream Chinese novel websites such as Qidian (起点中文网), Zongheng (纵横中文网), Jinjiang (晋江), Hongxiu (红袖), 17K (17K小说网), and Xiaoshuo Yuedu Wang (小说阅读网).

Then, for the unstructured resources of the novel category, resources such as novel titles, authors, categories, tag words, links to fulfilling resources, synopses, peripheral content related to the novel, and encyclopedia information may be acquired and integrated. Finally, the integrated resources and the full data resources described above are saved to the vertical-category resource library, completing the storage of the novel category's resources.

It should be understood that the process of acquiring the resources of other vertical categories is the same as for the novel category and is not described again here.

In one embodiment of the present invention, the process of acquiring the related information corresponding to the query word, as shown in FIG. 6, may include:

S33321. Update the user's current state information according to the structured information and the user's previous state information.

The state space of the dialogue system is constructed and the interaction policy initialized according to the common dialogue flows in the vertical-category scenario. Specifically, after the user inputs a query word for the first time, the user's initial state information may be acquired from the user's preferences or interaction history.

S33322. Generate candidate actions corresponding to the current state information according to the vertical-category knowledge system and the vertical-category resource library.

The candidate actions may include actions that satisfy the user's demand, actions that further clarify the user's demand, or horizontal or vertical guidance information for the user's demand. The user's demand is determined from the query word.

S33323. Using a preset model, select from the candidate actions a preset number of candidate actions that match the current state information relatively closely, and take the selected candidate actions as the related information.

Specifically, after the candidate actions corresponding to the current state information are generated, a preset number of candidate actions that match the current state information relatively closely may be selected from them based on a preset model, for example a POMDP (partially observable Markov decision process) model. The selected candidate actions are returned to the user as the query result and guidance information for the query word, and these are displayed in the current interface of the dialogue-capable application that the user is using.

An action that satisfies the user's demand, or an action that further clarifies it, serves as the query result once selected; horizontal or vertical guidance information for the user's demand serves as the guidance information once selected.

The preset number is set in advance. For example, suppose the preset number is 5 and 10 candidate actions are generated for the current state information according to the vertical-category knowledge system and the vertical-category resource library. The POMDP model can then select the 5 candidate actions that match the current state information most closely, and the selected candidate actions are returned to the user as the related information.
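The selection of a preset number of candidate actions can be sketched as a top-k ranking. Note that the sketch does not implement a POMDP: the function `overlap_score` is a deliberately simple stand-in for the preset model, and the state, slot names, and generated candidates are hypothetical.

```python
import heapq
from typing import Callable, Dict, List


def select_candidate_actions(state: Dict[str, str],
                             candidates: List[dict],
                             score: Callable[[Dict[str, str], dict], float],
                             preset_number: int = 5) -> List[dict]:
    """Keep the preset number of candidate actions that score highest against the state."""
    return heapq.nlargest(preset_number, candidates, key=lambda action: score(state, action))


def overlap_score(state: Dict[str, str], action: dict) -> float:
    """Toy stand-in for the preset model: count the action slots that agree with the state."""
    return float(sum(1 for slot, value in action["slots"].items() if state.get(slot) == value))


state = {"cuisine": "western", "area": "Sanlitun", "style": "quiet"}
# Ten hypothetical candidate actions; action i shares the first (i % 4) slots with the state.
candidates = [
    {"name": f"action_{i}", "slots": dict(list(state.items())[: i % 4])}
    for i in range(10)
]
chosen = select_candidate_actions(state, candidates, overlap_score, preset_number=5)
chosen_scores = [overlap_score(state, action) for action in chosen]
```

Passing the scoring function as a parameter keeps the ranking step independent of the model, so a learned policy could replace `overlap_score` without changing the selection code.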

S3333. Present the query result and the guidance information to the user.

S3334. After the user inputs a query word again according to the guidance information, repeat the above process of acquiring related information according to the query word until the query result the user needs is obtained.

In one embodiment of the present invention, the parameters of the preset model may also be updated according to the user's feedback, so that different candidate actions are selected under different parameters. That is, after the user inputs a query word again, the parameters of the preset model may be adjusted according to that query word, so that the preset model selects different candidate actions for the user according to the adjusted parameters. In other words, guidance information and satisfaction information are provided according to the current state information, and different state information corresponds to different guidance information and satisfaction information. The system provides the optimal satisfaction information and guidance information for the current state information and the user's demand, so as to guide the user in querying vertical-category information.

For example, if the query word currently input by the user is "Western restaurant", the corresponding vertical category can be determined to be the food category, and the query word indicates that the user's demand is to find a Western restaurant for a meal. Because the query word alone does not reveal what type of Western restaurant the user needs, multiple candidate actions may be generated according to the vertical-category knowledge system and the vertical-category resource library, 13 candidate actions that match the current state information relatively closely are selected by the POMDP model, and the 13 selected candidate actions are returned to the user as the related information for the query. The query result displayed in the user's current interface is shown in FIG. 7. Since the type of Western restaurant cannot be determined from the query word, first guidance information prompts the user to provide more detail, and possible answers corresponding to the first guidance information, namely second guidance information, are provided for the user to select or input. The user can also view other answers corresponding to the first guidance information by clicking the "next" button. After the user clicks "treating a client to a meal", a restaurant that meets the user's demand can be determined according to the query word currently input, and the query result and guidance information for the current query word are obtained. The interface containing the related information for the current query word is shown in FIG. 8. At this point the user can follow the guidance information to ask further questions about the restaurant, such as whether it has Wi-Fi or whether parking is convenient.

As another example, if the query word currently input by the user is "novels by Tiancan Tudou (天蚕土豆)", semantic analysis of the received query word determines that it contains the name of a novelist. The vertical category corresponding to the query word can be determined to be the novel category, and the query word indicates that the user wants to look up books by author name. The corresponding candidate actions can be obtained from the author name, and the related information for the query word is displayed in the application the user is using. The interface containing the related information is shown in FIG. 9. The user can then click the desired book title. In addition, the user can click the first button to log in to an account or to clear the message history.

As a further example, if the query word currently input by the user is "delicious Korean barbecue", the vertical category corresponding to the query word can be determined to be the restaurant and food category. Specifically, the query word is parsed into structured information that the vertical-category knowledge system can represent; the query result and guidance information for the query word are acquired according to the structured information, the vertical-category knowledge system, and the vertical-category resource library of the vertical category to which the query word belongs; and they are returned to the user. The user interface containing the related information is shown in FIG. 10. The user can then follow the guidance information to choose another restaurant, or simply settle on this one. The user can also view other guidance information by clicking the "next" prompt button.

In summary, the artificial-intelligence-based information query method of this embodiment has the following beneficial effects. (1) Compared with searching through a search engine, the method does not require the user to have in-depth knowledge of the vertical category; through multiple rounds of interaction it guides the user to describe the demand accurately and provides corresponding query results and guidance information. (2) Compared with browsing vertical-category websites, the method does not require the user to browse a large number of web pages or to filter out useless information manually; it filters out useless information intelligently and provides the user only with information related to the query word.

(3) Compared with related dialogue systems, the method handles the complexity of vertical-category resource structures specifically, generating a state space based on the entity structure of the vertical category. It can therefore satisfy in-depth questions within the vertical category, and it uses guidance information to prompt the user to input a query word again for the next round of querying. In other words, by displaying guidance information, the information query method of this embodiment effectively guides the user to ask the right questions.

The deep question-answering service provides the user with precise question-answering results for the question information input by the user, based on in-depth semantic analysis and knowledge mining techniques. When the user demand information is a deep question-answering demand, the deep question-answering service module may receive the question information, acquire the corresponding question type according to the question information, select the corresponding answer generation mode according to the question type, and generate the corresponding question-answering result according to the selected answer generation mode and the question information. The question type may include an entity type, an opinion type, and a snippet type.

More specifically, when the question type is the entity type, entity-type question information may be generated from the question information and expanded, based on the summaries crawled by the search engine and the historical display logs, to generate a cluster of same-family entity question information. Each item in the cluster corresponds to candidate answers. Candidate entities are then extracted from those candidate answers, the confidence of each candidate entity is calculated, and the candidate entities whose confidence is greater than a preset confidence threshold are fed back as the question-answering result. For example, the question information is "Who is Andy Lau's wife?" and a candidate answer is "In fact, it was reported as early as 1992 that Andy Lau and Zhu Liqian had secretly registered their marriage in Canada...", from which the candidate entities are "Andy Lau", "Zhu Liqian", and "Canada". The confidence of each candidate entity is then calculated based on an entity knowledge base and a question-answer semantic matching model. If the confidence of the candidate entity "Zhu Liqian" is found to be greater than the preset confidence threshold, "Zhu Liqian" is determined to be the question-answering result. In addition, the clause in which "Zhu Liqian" first appears in the candidate answer may be used as the answer summary.
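The final filtering step for entity-type questions can be sketched as a threshold over confidence values. The confidence numbers and the threshold below are invented purely for illustration; in the embodiment they would come from the entity knowledge base and the question-answer semantic matching model, neither of which is modeled here.

```python
from typing import Dict, List


def select_entities(confidences: Dict[str, float], threshold: float) -> List[str]:
    """Return candidate entities whose confidence exceeds the threshold, best first."""
    ranked = sorted(confidences.items(), key=lambda item: item[1], reverse=True)
    return [entity for entity, confidence in ranked if confidence > threshold]


# Illustrative values only; real confidences would be computed by the matching model.
candidate_confidences = {"Andy Lau": 0.20, "Zhu Liqian": 0.90, "Canada": 0.10}
answer_entities = select_entities(candidate_confidences, threshold=0.5)
```

With these illustrative values only "Zhu Liqian" passes the threshold, which mirrors the outcome described in the example.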

When the question type is the opinion type, the candidate answers corresponding to the question information may be acquired and segmented into multiple candidate answer phrases, which are then aggregated to generate opinion clusters. Specifically, keywords may be extracted from the candidate answer phrases according to the IDF (inverse document frequency) scores of the words in each phrase; keywords containing negation words are generalized and given a negation label; the keywords are then represented as vectors based on the negation labels; and the vector angle and/or semantic similarity between every two keywords is calculated. Candidate answers whose vector angle is smaller than a preset angle, or whose semantic similarity is greater than a preset threshold, are then aggregated to generate opinion clusters.
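The angle-based aggregation can be sketched with a greedy grouping. The sketch assumes each candidate answer phrase already has a non-zero vector; the two-dimensional vectors, the 30 degree limit, and the greedy strategy of comparing against the first member of each cluster are illustrative assumptions, since the text does not specify the clustering procedure.

```python
import math
from typing import List, Sequence


def angle_degrees(u: Sequence[float], v: Sequence[float]) -> float:
    """Angle between two non-zero vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cosine))


def aggregate(phrases: List[str], vectors: List[Sequence[float]],
              max_angle: float) -> List[List[str]]:
    """Greedily group phrases whose vector is within max_angle of a cluster's first member."""
    clusters: List[List[int]] = []
    for index, vector in enumerate(vectors):
        for cluster in clusters:
            if angle_degrees(vectors[cluster[0]], vector) < max_angle:
                cluster.append(index)
                break
        else:
            clusters.append([index])  # no close cluster found, start a new one
    return [[phrases[i] for i in cluster] for cluster in clusters]


phrases = ["get more bed rest", "rest in bed often", "see a doctor regularly"]
vectors = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0)]   # hypothetical keyword vectors
opinion_clusters = aggregate(phrases, vectors, max_angle=30.0)
```

The two bed-rest phrases are about 6 degrees apart and fall into one cluster, while the third phrase is 90 degrees away and forms its own.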

After that, the opinion type of each opinion cluster may be determined. Opinion types may include the yes/no type, the evaluation type, the suggestion type, and so on. Specifically, the opinion type of an opinion cluster may be determined by preset rules or by a statistical model. An answer opinion is then selected from the corresponding opinion cluster according to the opinion type. The rules for selecting the answer opinion may include, but are not limited to, selecting the answer opinion with the most comprehensive information coverage, selecting the answer opinion with the lowest IDF*log(IDF) value, and selecting the answer opinion that occurs most often in the articles corresponding to the candidate answers, where IDF is the inverse document frequency. After that, a summary corresponding to the answer opinion may be generated, the answer opinion may be scored, and the answer opinions whose score exceeds a preset score threshold are returned as the question-answering result.

For example, suppose the question information is "precautions during pregnancy" and one of the candidate answers is "During pregnancy, follow the 'doctor, more, overcome' principle, that is, see a doctor regularly, rest in bed more, and overcome your bad habits." This candidate answer can be segmented into four candidate answer phrases: "during pregnancy, follow the 'doctor, more, overcome' principle", "that is, see a doctor regularly", "rest in bed more", and "overcome your bad habits". Repeated or near-duplicate content among the candidate answer phrases can then be aggregated to generate opinion clusters, and the answer opinion is selected. The answer opinions can then be scored according to information richness, sufficiency of argument, information redundancy, and the like, and the answer opinions whose score exceeds the preset score threshold are returned as the question-answering result. In addition, once an answer opinion has been selected, the sentence in which it appears in the source article can be obtained and truncated to a predetermined length to generate the summary corresponding to that answer opinion. The summaries can then be ranked according to content richness and answer authority.

When the question type is the fragment type, candidate answers corresponding to the question information may be obtained and segmented into a plurality of candidate answer phrases. The phrases are then scored for importance to generate phrase importance features, and an answer summary is generated according to those features. The answer quality may then be scored according to the phrase importance features of the answer summary, the answer authority, the relevance to the question information, and the richness of the answer. The phrase importance features may include aggregation features, relevance features, type features, and question-answer matching features. The aggregation features measure how often a phrase is repeated, for example a word-vector centroid feature, an NGram (occurrence probability) feature, or a LexRank (multi-document automatic summarization) feature. The type feature is the type of the question, such as the WHAT type, the WHY type, or the HOW type. The answer authority is the authority of the website from which the answer originates. After that, the user's behavior data may be obtained, the candidate answers are ranked according to the behavior data and the scoring results, and the ranking result is returned as the question-answering result. The user's behavior data may include historical behavior information such as the user's clicks on question-answering results, the time spent on a question-answering result, and jumps from the current question-answering result to other question-answering results.
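The scoring and ranking described for fragment-type questions can be sketched roughly as below. The source does not say how the four quality signals or the behavior data are combined, so the equal-weight average, the `click_rate` signal, and the `behavior_weight` parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    text: str
    importance: float  # phrase importance of the answer summary
    authority: float   # authority of the source website
    relevance: float   # relevance to the question information
    richness: float    # richness of the answer
    click_rate: float  # hypothetical user-behavior signal in [0, 1]


def quality(candidate: Candidate) -> float:
    """Equal-weight average of the four quality signals (an assumed combination)."""
    return 0.25 * (
        candidate.importance + candidate.authority + candidate.relevance + candidate.richness
    )


def rank(candidates: List[Candidate], behavior_weight: float = 0.2) -> List[Candidate]:
    """Order candidates by quality plus a weighted behavior signal, best first."""
    return sorted(
        candidates,
        key=lambda c: quality(c) + behavior_weight * c.click_rate,
        reverse=True,
    )


ranked = rank([
    Candidate("answer B", 0.5, 0.5, 0.5, 0.5, click_rate=0.9),
    Candidate("answer A", 0.9, 0.8, 0.9, 0.7, click_rate=0.1),
])
```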

When the user requirement information indicates an information search requirement, the information search service module may receive the question information, search according to it to generate a plurality of candidate web pages, and then perform discourse analysis on the candidate web pages to generate corresponding candidate documents. Specifically, content extraction, topic segmentation, and discourse relation analysis may be performed on the candidate web pages to generate the candidate documents. Content extraction mainly identifies the body text of a candidate web page and removes content unrelated to the user requirement information. Topic segmentation analyzes the topic structure of a document and divides it into a plurality of subtopics. Discourse relation analysis analyzes the relations among the subtopics in a document, such as a parallel relation. After the candidate documents are generated, the sentences in them may be scored and ranked, mainly according to the importance of each sentence within its candidate document and its relevance to the user requirement information. After that, the user's requirement scenario information may be obtained, a summary is generated according to the scenario information and the ranking result, and the summary is returned as the question-answering result. The scenario information may include a mobile terminal scenario and a computer scenario. In the mobile terminal scenario, sentences can be compressed and abbreviated so that the summary is as concise as possible; in the computer scenario, sentences can be spliced and merged so that the summary is detailed and clear. Furthermore, because the contents of the candidate documents are all related to the user requirement information, they may repeat or complement one another, so the information from multiple candidate documents needs to be aggregated.
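The scenario-dependent summary step might look like the sketch below. It is only a rough approximation: character truncation stands in for sentence compression, simple joining stands in for splicing and merging, and the scene labels, `max_chars`, and `top_k` are assumed parameters.

```python
from typing import List, Tuple


def build_summary(
    scored_sentences: List[Tuple[str, float]],
    scene: str,
    max_chars: int = 40,
    top_k: int = 2,
) -> str:
    """Build a summary from (sentence, score) pairs according to the scene."""
    ranked = [s for s, _ in sorted(scored_sentences, key=lambda p: p[1], reverse=True)]
    if scene == "mobile":
        # Concise: keep only the best sentence, truncated to a short length.
        return ranked[0][:max_chars]
    # Computer scenario: join the top sentences into a more detailed summary.
    return " ".join(ranked[:top_k])


scored = [
    ("Sentence B adds detail.", 0.7),
    ("Sentence A about topic.", 0.9),
    ("Sentence C is minor.", 0.2),
]
mobile_summary = build_summary(scored, scene="mobile")
desktop_summary = build_summary(scored, scene="computer")
```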

As shown in Figure 11, the chat service subsystem may perform the following steps:

S51. Receive input information entered by the user.

The input information may be voice information or text information.

After the input information is received, it may be error-corrected and/or rewritten, for example to fix typos and to rewrite irregular colloquial expressions.

In addition, the preceding context of the chat with the user may be obtained, and it is then determined whether the dependency between the input information and the preceding context is greater than a preset relationship threshold. If so, the input information may be completed according to the preceding context, which keeps the human-machine chat fluent. Specifically, completing the input information may include coreference resolution. For example, if the input information is "Is he married?", the "he" in the input information can be replaced with "Andy Lau" according to the preceding context "Andy Lau". Completing the input information may also include ellipsis completion. For example, if the preceding context is "Andy Lau's wife is Zhu Liqian." and the input information is "I don't know her.", with the object omitted in the original Chinese, the input information can be completed as "I don't know Zhu Liqian."
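The coreference part of this completion step can be illustrated with a toy sketch. It assumes English input, a fixed pronoun list, and a dependency score that has already been computed elsewhere; the function name `complete_input` and the threshold value are hypothetical.

```python
import re

PRONOUNS = ("he", "she", "it")  # illustrative list only


def complete_input(
    input_text: str, context_entity: str, dependency: float, threshold: float = 0.5
) -> str:
    """Replace the first pronoun with the context entity when the input
    depends strongly enough on the preceding context."""
    if dependency <= threshold:
        return input_text
    pattern = r"\b(" + "|".join(PRONOUNS) + r")\b"
    return re.sub(pattern, context_entity, input_text, count=1, flags=re.IGNORECASE)


completed = complete_input("Is he married?", "Andy Lau", dependency=0.9)
unchanged = complete_input("Is he married?", "Andy Lau", dependency=0.2)
```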

Furthermore, the user's current topic information may be obtained from the preceding context so that the chat service modules can later guide the chat topic.

S52. Distribute the input information to the chat service modules.

Specifically, domain analysis may be performed on the input information to obtain its domain. The input information can then be distributed to the chat service modules that serve the same or a similar domain.

The chat service modules may include one or more of a search-based chat module, a knowledge-rich chat module, a portrait-based chat module, and a crowdsourcing-based chat module.

Specifically, the search-based chat module may segment the input information into a plurality of chat phrases and then query a chat corpus with those phrases to retrieve a plurality of corpus preceding sentences (prompts) and the corpus following sentences (responses) that correspond to them. The chat corpus is built in advance, and its sources may include, but are not limited to, "post-reply" pairs from forum data such as Tieba, "post-reply" pairs from microblogs, and "question-answer" pairs from question-answering communities.

After that, the corpus preceding sentences may be filtered. Specifically, the similarity between the input information and each corpus preceding sentence is computed. If the similarity is less than a first preset similarity threshold, the corresponding preceding sentence is filtered out; if the similarity is greater than or equal to the first preset similarity threshold, it is retained.
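As an illustration of this filtering rule, the sketch below uses Jaccard word overlap as a stand-in for the similarity measure, which the source does not specify at this step. The corpus pairs and the threshold value are invented for the example.

```python
from typing import List, Tuple


def jaccard(a: str, b: str) -> float:
    """Literal similarity: overlap of the two word sets."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def filter_pairs(
    user_input: str, corpus: List[Tuple[str, str]], threshold: float
) -> List[Tuple[str, str]]:
    """Keep (preceding, following) pairs whose preceding sentence is
    at least `threshold` similar to the user input."""
    return [(up, down) for up, down in corpus if jaccard(user_input, up) >= threshold]


corpus = [
    ("what movie do you like", "I like Inception"),
    ("how is the weather today", "sunny"),
]
kept = filter_pairs("what movie do you like most", corpus, threshold=0.5)
```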

After the corpus preceding sentences are filtered, the corpus following sentences that correspond to the retained preceding sentences may be classified. Specifically, the similarity between the input information and each corpus following sentence is computed, and the following sentences are classified according to that similarity using a machine learning model such as a GBDT (gradient boosted decision tree) or an SVM (support vector machine). The similarity between the input information and a corpus following sentence may be a literal similarity, a similarity obtained from a trained deep neural network, or a similarity obtained from a trained machine translation model. It should be understood that these similarity measures and machine learning models such as GBDT and SVM are well-known techniques and are not described further here.

The search-based chat module may then re-rank the classified corpus following sentences and generate candidate replies according to the ranking result. Specifically, the user's chat attributes may be obtained from the preceding context of the chat, and the classified following sentences are re-ranked according to those attributes using a learning-to-rank model. The chat attributes may include the occasion of the chat (such as time and place), how entertaining the chat is, the style of the chat, and so on. The chat attributes need not come only from the preceding context of the current chat; they may also be obtained from the user's long-term chat history. It should be understood that the learning-to-rank model is a well-known technique and is not described further here.

The knowledge-rich chat module may generate search terms from the input information, search with those terms to obtain a plurality of search results, and then extract sentences from the search results to obtain a candidate sentence set consisting of sentences whose similarity to the search terms is greater than a second preset similarity threshold. After that, the sentences in the candidate sentence set may be rewritten to generate candidate replies. The sentences in the candidate sentence set may also be re-ranked according to the user's chat attributes. For example, if the input information is "I hope I get the chance to visit Mount Fuji", the input information can be parsed to generate the search terms "Mount Fuji, travel", a plurality of search results are obtained with those terms, and sentences highly similar to the search terms are extracted. Some of these sentences may contain phrases such as "the reporter learned that", which are obviously excerpted from web page text, so they need to be rewritten to read more fluently and more like natural chat. The candidate reply finally generated is "Because of the weather, Mount Fuji can only be climbed during a designated period in summer each year". Compared with a conventional reply such as "I want to go to Mount Fuji too, let's go together.", this reply is informative and timely, and lets the user pick up useful knowledge while chatting.

To be more human-like and to provide users with personalized service, the human-machine chat system may define its own attributes, states, interests, and so on, which form the system portrait model. It may likewise define the user's attributes, states, interests, and so on, which form the user portrait model. The same system portrait model may be used for all users, or a separate system portrait model may be set up for each user. Both the system portrait model and the user portrait model are based on a portrait knowledge graph, which is a hierarchical knowledge system. For example, the "family members" node may include the two child nodes "siblings" and "parents", and the "parents" node may include the two child nodes "father" and "mother". Each node corresponds to a plurality of input information template clusters; for example, "Who is your father", "Your father is who", and "What is your father's name" belong to the same input information template cluster. Each input information template cluster corresponds to one or more candidate replies. The template clusters and candidate replies may contain variables. For example, "interest", "hobby", and "pastime" correspond to the same attribute "INTEREST", whose values may include mountain climbing, music, reading, sports, and so on.
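One way to picture a template cluster with a variable is the toy lookup below. The cluster contents, the exact-match normalization, and the names `TEMPLATE_CLUSTERS` and `portrait_reply` are assumptions for illustration; only the "INTEREST" attribute name comes from the description above.

```python
from typing import Dict, Optional

# Hypothetical system portrait: attribute name -> value.
SYSTEM_PORTRAIT = {"INTEREST": "mountain climbing"}

# Each cluster groups equivalent input templates with its candidate replies.
TEMPLATE_CLUSTERS = {
    "hobby": {
        "inputs": {"what are your hobbies", "what do you like to do"},
        "replies": ["I enjoy {INTEREST}."],
    },
}


def portrait_reply(user_input: str, portrait: Dict[str, str]) -> Optional[str]:
    """Match the input against the template clusters and fill reply variables."""
    normalized = user_input.lower().strip(" ?!.")
    for cluster in TEMPLATE_CLUSTERS.values():
        if normalized in cluster["inputs"]:
            return cluster["replies"][0].format(**portrait)
    return None


reply = portrait_reply("What are your hobbies?", SYSTEM_PORTRAIT)
```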

Specifically, the portrait-based chat module may obtain the user's chat context and determine from it whether a collection condition is met. If the collection condition is met, a question may be sent to the user. The user's answer to the question is then received, and the user portrait model is updated according to the answer. For example, while chatting with the user about movies, the system can ask "What movies do you like?"; or, when the user asks the human-machine chat system "What do you like to eat?", the system can ask in return "What do you like to eat?". Once the user answers, the user portrait model can be updated based on the answer so that it better matches the user's personal needs.

In addition, the portrait-based chat module may obtain the user's chat content, extract user portrait data from it, and update the user portrait model with the extracted data. For example, if the user says during a chat "When I have free time I like to go hiking and fishing.", the user portrait data "likes hiking, likes fishing" can be extracted and used to update the user portrait model. At the same time, a suitable answer can be selected based on the user portrait data and returned to the user.

Crowdsourcing is a method of outsourcing specific tasks to unspecified users on the Internet. In human-machine chat, questions that are difficult for the machine to answer can be distributed to human workers, who reply manually online and in real time, thereby meeting the user's actual needs.

Specifically, the crowdsourcing-based chat module may determine whether the input information is suitable for crowdsourcing. For example, a user who is feeling down and needs comforting is a suitable case. If the user's input information contains private information such as personal identity information, passwords, or phone numbers, it is not suitable for crowdsourcing.

If the input information is judged suitable for crowdsourcing, it may be distributed to a corresponding worker. The preceding context may be sent to the worker as well, so that the worker can reply according to both the preceding context and the input information. The crowdsourcing-based chat module may then receive the worker's reply and assess its quality. If the reply meets the quality requirements, it is used as a candidate reply. For example, a reply that contains vulgar, reactionary, or pornographic content fails the quality check. If a worker takes longer than a predetermined duration to reply, that reply is not used, but it may still be saved to the chat corpus.
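The two gates in the crowdsourcing flow (is the input safe to hand to a worker, and is the worker's reply acceptable) can be sketched as below. The privacy pattern, the banned-word list, and the 30-second limit are placeholder assumptions; the source does not specify how either check is implemented.

```python
import re
from typing import Iterable

# Very rough privacy heuristic: the word "password" or a long digit run.
PRIVATE_PATTERN = re.compile(r"password|\b\d{7,}\b", re.IGNORECASE)


def suitable_for_crowdsourcing(text: str) -> bool:
    """Reject inputs that appear to contain private information."""
    return PRIVATE_PATTERN.search(text) is None


def accept_reply(
    reply: str, seconds_taken: float, banned_words: Iterable[str], max_seconds: float = 30.0
) -> bool:
    """Accept a worker reply only if it is timely and free of banned words."""
    if seconds_taken > max_seconds:
        return False
    lowered = reply.lower()
    return not any(word in lowered for word in banned_words)


ok_input = suitable_for_crowdsourcing("I feel so sad today")
ok_reply = accept_reply("Cheer up, tomorrow will be better.", 12.0, banned_words=["vulgar"])
```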

Beyond this, it may also be determined whether the input information is chat information without substantive content, such as "hehe" or "hoho". If so, the current topic may be obtained, that is, computed from the chat history based on a topic model. After the current topic is obtained, a guided topic may be generated from it based on a topic chat graph, which is a directed graph whose nodes are topics. For example, as shown in Figure 2, the node "leisure" may point to the node "watching movies" and the node "listening to music", which means that the conversation can be guided from the topic "leisure" to the topic "watching movies" or the topic "listening to music". Each of these two topics has a guidance probability, and the topic is guided according to these probabilities, which keeps the guided topics diverse.
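Probabilistic guidance over a topic chat graph can be sketched as weighted sampling over a node's outgoing edges. The graph below mirrors the "leisure" example, but the probabilities 0.6 and 0.4 and the function name `guide_topic` are assumptions.

```python
import random
from typing import Optional

# Directed topic graph: topic -> {next topic: guidance probability}.
TOPIC_GRAPH = {
    "leisure": {"watching movies": 0.6, "listening to music": 0.4},
}


def guide_topic(current: str, rng: random.Random) -> Optional[str]:
    """Sample a guided topic from the outgoing edges of the current topic."""
    edges = TOPIC_GRAPH.get(current)
    if not edges:
        return None
    topics = list(edges)
    weights = [edges[topic] for topic in topics]
    return rng.choices(topics, weights=weights, k=1)[0]


next_topic = guide_topic("leisure", random.Random(0))
```

Sampling rather than always taking the most probable edge is what lets repeated conversations branch to different topics.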

Candidate replies may then be generated from the guided topic. Specifically, a template for the candidate reply may be generated based on a natural language generation model and the guided topic filled into the template to produce the candidate reply; alternatively, a sentence that contains the guided topic may be selected as the candidate reply. In this way the system actively guides the chat topic.

S53. Receive the candidate replies returned by the plurality of chat service modules.

Each candidate reply has a corresponding confidence.

S54. Rank the candidate replies based on the confidence, generate chat information according to the ranking result, and provide the chat information to the user.

Specifically, features of the user's input information may be obtained, and the candidate replies are ranked based on those features and the confidence. The features of the input information may include classification features, literal features, topic features, and so on. The higher the confidence, the better the quality of the candidate reply, so the candidate replies can be ranked from highest to lowest confidence, and chat information that meets the user's needs is finally provided to the user.
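A bare-bones version of the confidence ranking in step S54 is sketched below. It uses the confidence alone and ignores the input-information features, whose role in the ranking the source does not detail; the reply texts and confidence values are made up.

```python
from typing import List, Tuple


def rank_replies(candidates: List[Tuple[str, float]]) -> List[str]:
    """Order candidate replies from highest to lowest confidence."""
    return [text for text, _ in sorted(candidates, key=lambda c: c[1], reverse=True)]


# Hypothetical (reply, confidence) pairs returned by different chat modules.
candidates = [
    ("I want to go to Mount Fuji too.", 0.55),
    ("Mount Fuji can only be climbed during part of the summer.", 0.81),
    ("Hmm.", 0.20),
]
chat_information = rank_replies(candidates)[0]
```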

In addition, the system may be updated from user feedback through a reinforcement learning model so that it can provide more satisfying chat information. For example, rating buttons such as "like" or "dislike" can be added to the chat information sent to the user to collect feedback; the user's input during the chat, such as "You're really smart", can be analyzed with sentiment analysis to obtain the user's evaluation; or the user's satisfaction can be estimated by recording the number of interaction turns with the user.

As shown in Figure 12, the guidance and recommendation service subsystem may perform the following steps:

S61. Receive interaction information entered by the user, and determine the current topic according to the interaction information.

Specifically, the interaction information entered by the user, for example "Is Inception any good?", is received first. Requirement recognition and relevance computation are then performed on the interaction information to determine that the current topic is "reviews of Inception".

S62. Obtain, based on a topic graph, a plurality of candidate guided topics related to the current topic.

The topic graph may include a plurality of topics and the associations between them.

Specifically, a plurality of candidate guided topics related to the current topic may be obtained from a pre-built topic graph. For example, if the current topic is "reviews of Inception", the topic graph yields guided topics related to it, such as "films directed by Nolan" and "films starring Leonardo DiCaprio", together with their associations with "reviews of Inception".

S63. Obtain the user portrait data of the user.

The user portrait data is a collection of data such as the user's attributes, states, and interests. It may be entered by the user or obtained from the user's interaction history and then consolidated to produce personalized user portrait data about the user.

S64. Select a guided topic from the plurality of candidate guided topics related to the current topic according to the user portrait data, and return the guided topic to the user.

Specifically, the user's intent information may be determined according to the user portrait data and the context of the interaction information. A guided topic is then selected from the candidate guided topics according to the intent information and returned to the user.

For example, the guided topic may be an extension of the current topic. If the interaction information is "How do I cook chicken?", the current topic may be "ways to cook chicken". After the interaction on the current topic ends, the topic can be extended using user portrait data such as "the user is pregnant", and the guided topic "the best ways for pregnant women to eat chicken" can be returned to the user.

The guided topic may also be a recommendation based on the current topic. If the interaction information is "Is Inception any good?", the current topic may be "reviews of Inception". After the interaction on the current topic ends, the guided topic "Nolan's films" can be returned to the user based on the current topic combined with user portrait data such as "the user likes watching films".

When the user's intent information cannot be determined from the user portrait data and the context of the interaction information, the intent needs to be clarified. For example, if the interaction information is "How do I get to the Palace Museum?", the intent is ambiguous because Beijing, Shenyang, and Taipei each have a Palace Museum. In that case a clarifying question such as "Which Palace Museum would you like to go to?" can be returned to the user based on the interaction information.
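The clarification fallback can be pictured with the small sketch below. The `user_city` hint stands in for whatever the user portrait and context would provide, and the function name `resolve_place` and the message wording are assumptions.

```python
from typing import List, Optional, Tuple


def resolve_place(
    place: str, cities: List[str], user_city: Optional[str] = None
) -> Tuple[str, str]:
    """Resolve an ambiguous place name, or fall back to a clarifying question."""
    if len(cities) == 1:
        return ("resolved", f"{place} ({cities[0]})")
    if user_city in cities:
        # Context or portrait data disambiguates the intent.
        return ("resolved", f"{place} ({user_city})")
    options = ", ".join(cities)
    return ("clarify", f"Which {place} would you like to go to: {options}?")


cities = ["Beijing", "Shenyang", "Taipei"]
status, message = resolve_place("Palace Museum", cities)
```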

In addition, step S65 may be performed before step S62.

S65. Build the topic graph.

As shown in Figure 13, a node in the topic graph represents a topic or a requirement raised by users. Each node may contain replies for the corresponding topic and resources that satisfy the user's requirement, and associated nodes are connected by edges, forming a mesh-like topic graph.

Specifically, the topic graph is built as follows: topic association data is obtained, and the topic graph is then built from the topic association data.

More specifically, topic association data can be obtained in two ways.

In the first way, web text data is obtained first, and the topic association data is obtained from the web text data. Web text data can be divided into unstructured data, semi-structured data, and structured data.

When the web text data is unstructured data, the topic association data may be obtained based on entity extraction and syntactic analysis. Unstructured data may include news, forums, blogs, videos, and so on. For example, consider the web text "The most closely watched Nobel Prize in Literature has found its winner: the Frenchman Modiano is this year's lucky laureate. Of course, Haruki Murakami, nominated many times and always just missing the prize, remains 'the person closest to the Nobel Prize'. The Chinese poet Bei Dao merely gave his compatriots a moment of excitement." Entity extraction yields the entities "Nobel Prize in Literature", "the Frenchman Modiano", "Haruki Murakami", and "the Chinese poet Bei Dao", and syntactic analysis shows that these entities are associated with one another. Going further, it can also be inferred that the Frenchman Modiano won the Nobel Prize in Literature, whereas Haruki Murakami and the Chinese poet Bei Dao did not.

When the web text data is semi-structured, topic association data is obtained through page structure analysis, tag extraction, and entity recognition. Semi-structured data may include encyclopedia data such as Wikipedia and Baidu Baike, or special-topic data. For example, as shown in FIG. 14, page structure analysis, tag extraction, and entity recognition can determine that the "off-court life" of "Djokovic" includes "family life" and "charity activities".

When the web text data is structured, topic association data is obtained from a knowledge graph. Structured data may include knowledge graph data. For example, as shown in FIG. 15, the films "Inception" and "Interstellar" were both directed by "Christopher Nolan".

Case 2: the user's search behavior data or browsing behavior data is acquired first, and topic association data is then generated from it.

Specifically, the user's search behavior data can be acquired, the corresponding search objects derived from it, and topic association data generated from those search objects. For example, if the user searches in succession for "Nolan", "Nolan's films", and "Christian Bale", these topics can be associated to generate topic association data.

Alternatively, the user's browsing behavior data can be acquired, the corresponding browsing objects derived from it, and topic association data generated from those objects. For example, as shown in FIG. 16, the news items or videos a user clicks while browsing web pages can be associated to generate topic association data.

After the topic association data is acquired, the topic map can be built from it using one or more of the RandomWalk algorithm, an association analysis algorithm, and a collaborative filtering algorithm. For example, as shown in FIG. 17, q1, q2, q3 and q1', q2', q3', q4' are topics, and d1, d2, d3, and d4 are resource data. As can be seen from FIG. 13, resource data d1 and d2 are associated with topic q1; resource data d1, d2, and d3 are associated with topic q2; and resource data d4 is associated with topic q3. Associated topics and resource data are connected by solid lines. The RandomWalk algorithm can iteratively compute that topic q1 and resource data d3 are also associated; they are connected by a dashed line. Topic q1' is a topic the user raises after browsing resource data d1 or d4, so the association between them is ordered. Likewise, topic q2' is raised from resource data d2, topic q3' from resource data d2 or d3, and topic q4' from resource data d3 or d4. From this it can be inferred that topic q1 is associated with topic q1', that topic q1 is associated with topic q2', and so on, which finally yields the topic map shown in FIG. 13.
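The inference of the dashed q1-d3 edge can be illustrated with a small sketch. The text names only the RandomWalk algorithm; the random walk with restart used here, the restart probability, the iteration count, and all function names are assumptions made for illustration, not the patent's implementation. The graph holds only the solid-line edges listed above.

```python
from collections import defaultdict

# Solid-line edges from the example: topic -> directly associated resources.
EDGES = {"q1": ["d1", "d2"], "q2": ["d1", "d2", "d3"], "q3": ["d4"]}


def build_graph(edges):
    """Return an undirected adjacency map over topics and resources."""
    graph = defaultdict(set)
    for topic, resources in edges.items():
        for res in resources:
            graph[topic].add(res)
            graph[res].add(topic)
    return graph


def random_walk_scores(graph, start, restart=0.15, iterations=50):
    """Iterate visit probabilities of a walk that restarts at `start`.

    `restart` and `iterations` are illustrative values (assumptions).
    """
    scores = {node: 0.0 for node in graph}
    scores[start] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        for node, mass in scores.items():
            # Spread the non-restarting share evenly over the neighbours.
            for neighbour in graph[node]:
                nxt[neighbour] += (1 - restart) * mass / len(graph[node])
        nxt[start] += restart
        scores = nxt
    return scores


graph = build_graph(EDGES)
scores = random_walk_scores(graph, "q1")
direct = graph["q1"]
# Resources reachable from q1 that are not already linked to it directly.
inferred = sorted(
    node for node, s in scores.items()
    if node.startswith("d") and node not in direct and s > 1e-6
)
```

Under these assumptions `inferred` contains only d3, because q1 reaches d3 through the shared resources d1 and d2 and topic q2, while d4 is connected only to q3 and never receives any probability mass.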

In addition, the guidance and recommendation service subsystem can parse the interaction information and extract key fields from it. The key fields may include one or more of time information, location information, and a reminder event. Reminder information can then be created from the key fields, and when the time information reaches the preset time, the reminder information is sent to the user. For example, if the user's interaction information is "Remind me to write my work plan at 6 p.m. tomorrow", the time information "18:00 on August 8, 2015" and the reminder event "write work plan" can be parsed out. When that time arrives, the user is reminded.
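A minimal sketch of the key-field extraction, applied to the original Chinese utterance "明天晚上6点提醒我写工作计划" ("Remind me to write my work plan at 6 p.m. tomorrow"). The single regular expression, the function names, and the reference date of August 7, 2015 are assumptions chosen so that the result matches the example; a real parser would cover far more phrasings.

```python
import re
from datetime import datetime, timedelta

# Hypothetical pattern: "tomorrow" + optional period of day + hour + "remind me" + event.
PATTERN = re.compile(r"明天(早上|上午|下午|晚上)?(\d{1,2})点提醒我(.+)")


def parse_reminder(text, now):
    """Extract the time and reminder-event fields, or return None if no match."""
    match = PATTERN.fullmatch(text)
    if match is None:
        return None
    period, hour, event = match.group(1), int(match.group(2)), match.group(3)
    # Afternoon and evening hours are given on a 12-hour clock.
    if period in ("下午", "晚上") and hour < 12:
        hour += 12
    day = now.date() + timedelta(days=1)
    return {"time": datetime(day.year, day.month, day.day, hour), "event": event}


def is_due(reminder, now):
    """True once the current time has reached the reminder time."""
    return now >= reminder["time"]


reminder = parse_reminder("明天晚上6点提醒我写工作计划", datetime(2015, 8, 7, 9, 0))
```

With the assumed reference date, `reminder["time"]` is 18:00 on August 8, 2015 and `reminder["event"]` is "写工作计划" (write work plan).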

The artificial-intelligence-based human-computer interaction method of the embodiments of the present invention has the following advantages: (1) The human-computer interaction system changes from a tool into an anthropomorphic partner: through chat, search, and other services, the user enjoys a relaxed and pleasant experience during intelligent interaction, rather than mere search and question answering. (2) Keyword search is improved into natural-language search: users can express their needs in flexible, free natural language, and the multi-round interaction comes closer to interaction between people. (3) User-initiated search evolves into an around-the-clock companion service: based on the user's personalized model, recommendations and other services can be provided anytime and anywhere.

To achieve the above objectives, the present invention further proposes an artificial-intelligence-based human-computer interaction system.

FIG. 18 is a first schematic structural diagram of an artificial-intelligence-based human-computer interaction system according to an embodiment of the present invention.

As shown in FIG. 18, the artificial-intelligence-based human-computer interaction system may include a first receiving subsystem 10000, a distribution subsystem 20000, an interactive service subsystem 30000, a second receiving subsystem 40000, a generating subsystem 50000, and a providing subsystem 60000.

The first receiving subsystem 10000 can receive input information entered by the user through an application terminal.

The application terminal may include a PC, a mobile terminal, or an intelligent robot. The input information may be text information, image information, or voice information.

The distribution subsystem 20000 can derive the user's intent information from the user's input information and distribute the input information to at least one interactive service subsystem 30000 according to the intent information.

The interactive service subsystem may include a demand satisfaction service subsystem 31000, a chat service subsystem 32000, and a guidance and recommendation service subsystem 33000. In one embodiment of the present invention, the distribution subsystem 20000 can derive the user's intent information from the user's input information and then distribute the input information to the above interactive service subsystems 30000 according to the intent information.

In addition, the first receiving subsystem 10000 can also receive the user's customized task information, and the distribution subsystem 20000 can distribute the input information to at least one interactive service subsystem 30000 according to the customized task information. For example, a user whose tasks involve only search need only customize the demand satisfaction service subsystem; a user whose tasks involve both search and chat can customize the demand satisfaction service subsystem and the chat service subsystem. All of this can be customized to the user's actual needs.

The second receiving subsystem 40000 can receive the results returned by the at least one interactive service subsystem 30000.

The generating subsystem 50000 can generate a user return result from the returned results according to a preset decision strategy.

The providing subsystem 60000 can provide the user return result to the user.

As shown in FIG. 19, the generating subsystem 50000 may include a first acquisition module 51000, a second acquisition module 52000, and a first decision module 53000.

The first acquisition module 51000 can acquire the demand analysis features of the input information.

The second acquisition module 52000 can acquire the confidence features of the results returned by the interactive service subsystems, the context features of the user's dialogue interaction information, and the user's personalized model features.

The first decision module 53000 can decide among the returned results according to the demand analysis features, the confidence features of the returned results, the context features of the user's dialogue interaction information, and the user's personalized model features, so as to determine the user return result.

Specifically, the decision that determines the user return result is based mainly on the following features: (1) demand analysis features: by analyzing the need behind the user's question information, the question-answering result from the question-answering service module that better matches the user's need can be selected; (2) confidence features of the question-answering results: every result provided by a question-answering service module carries a confidence, and results with high confidence can be selected; (3) context features of the user's dialogue interaction information: results that better fit the context can be selected; (4) the user's personalized model features: results that better fit the user's individual needs can be selected. Each of these four features has its own decision weight. The question-answering results are judged on these features to determine the final result, which is then fed back to the user to satisfy the user's need. The result can be delivered by voice broadcast or shown on screen; voice broadcast makes human-computer interaction simpler and more natural.
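One way to read the weighted decision is as a linear combination of the four features. The text states only that each feature has its own decision weight; the linear scoring rule, the weight values, the feature values, and the candidate names below are assumptions made for illustration.

```python
FEATURES = ("demand", "confidence", "context", "personal")

# Hypothetical decision weights; in the described system they would be trained.
weights = {"demand": 0.4, "confidence": 0.3, "context": 0.2, "personal": 0.1}

# Hypothetical candidate results, each with feature values in [0, 1].
candidates = [
    {"source": "aladdin",
     "features": {"demand": 0.9, "confidence": 0.8, "context": 0.5, "personal": 0.5}},
    {"source": "search",
     "features": {"demand": 0.4, "confidence": 0.9, "context": 0.6, "personal": 0.5}},
]


def score(candidate, weights):
    """Weighted sum of the four decision features (an assumed scoring rule)."""
    return sum(weights[f] * candidate["features"][f] for f in FEATURES)


def decide(candidates, weights):
    """Pick the candidate result with the highest weighted score."""
    return max(candidates, key=lambda c: score(c, weights))


best = decide(candidates, weights)
```

Here the "aladdin" candidate scores 0.75 against 0.60 for "search", so it would be chosen as the user return result even though the "search" candidate has the higher confidence alone.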

In addition, as shown in FIG. 20, the generating subsystem 50000 may further include a training module 54000. Based on the user's logs, the training module 54000 can use a reinforcement learning model to train the decision weights of the demand analysis features, the confidence features of the question-answering results, the context features of the user's dialogue interaction information, and the user's personalized model features, so as to give the user question-answering results that better match the user's needs.

After the user return result is generated, the providing subsystem 60000 can convert it into natural language and broadcast it to the user. Alternatively, the text corresponding to the user return result can be displayed to the user directly.

In one embodiment of the present invention, as shown in FIG. 21, the artificial-intelligence-based human-computer interaction system may further include a sending subsystem 70000 and an execution subsystem 80000.

If the user return result contains an execution instruction, the sending subsystem 70000 can send it to the corresponding execution subsystem 80000, which executes it. Execution instructions may include, but are not limited to, hardware action instructions, music playback instructions, and story reading instructions. Hardware action instructions are aimed mainly at intelligent robots: a robot may have hardware components such as a head, a torso, and limbs, and can therefore execute instructions that control those components, such as "nod", "smile", and "raise your hand". Music playback instructions typically include start playback, stop playback, previous track, next track, louder, quieter, and so on. Note that a user's search for music of a particular type or style (such as "music suitable for listening to before bed" or "good songs by Jay Chou") is not a music playback instruction. Story reading instructions are aimed mainly at child-oriented applications, for example where an intelligent robot tells a child stories in place of the parents. As with music playback instructions, searching for stories on a particular theme, character, or plot is not a story reading instruction.

In one embodiment of the present invention, as shown in FIG. 22, the artificial-intelligence-based human-computer interaction system may further include a completion subsystem 90000.

After the input information entered by the user through the application terminal is received, the completion subsystem 90000 can acquire the preceding context of the interaction with the user and complete the input information according to that context. Specifically, in multi-round interaction the user often omits part of the input because it was stated earlier in the dialogue, so the input information must be completed to clarify the user's need. For example, if the preceding dialogue is "What snacks does Beijing have?" and the input information is "What about specialties?", the input information needs to be completed to produce the new question "What specialties does Beijing have?".

In an embodiment of the present invention, as shown in FIG. 23, the artificial-intelligence-based human-computer interaction system may further include a recording subsystem 100000 and a monitoring subsystem 110000.

If no user return result satisfying the user's need exists in the network resources, the recording subsystem 100000 can record the user's input information. The monitoring subsystem 110000 then checks, at a preset interval, whether such a result has appeared in the network resources, and once it exists, the providing subsystem 60000 can provide it to the user. For example, a user searches for a film that has just been released, but no corresponding resource is available yet. The need can be recorded and the network resources searched periodically for the resource. When the resource is found, it is pushed to the user, which achieves asynchronous demand satisfaction.
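The record-then-monitor flow can be sketched as follows. The class name, the `search` callable, and the in-memory dictionary standing in for the network resources are all hypothetical; the preset interval is left to whatever scheduler calls `poll`.

```python
class PendingDemands:
    """Records unmet queries and re-checks them on each monitoring cycle."""

    def __init__(self, search):
        self._search = search  # assumed callable: query -> result or None
        self._pending = []     # (user, query) pairs not yet satisfied

    def handle(self, user, query):
        """Answer immediately if possible; otherwise record the demand."""
        result = self._search(query)
        if result is None:
            self._pending.append((user, query))
        return result

    def poll(self):
        """Run one monitoring cycle; return (user, result) pairs to push."""
        ready, still_pending = [], []
        for user, query in self._pending:
            result = self._search(query)
            if result is None:
                still_pending.append((user, query))
            else:
                ready.append((user, result))
        self._pending = still_pending
        return ready


resources = {}  # stand-in for the network resources
demands = PendingDemands(resources.get)
first = demands.handle("user-1", "newly released film")
before = demands.poll()
resources["newly released film"] = "film resource"
after = demands.poll()
```

In this run `first` is `None` and `before` is empty because the resource does not exist yet; once it appears, `after` holds the pair to push to "user-1", and the demand is dropped from the pending list.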

As shown in FIG. 24, the demand satisfaction service subsystem 31000 may include a third acquisition module 31100, a fourth acquisition module 31200, a first distribution module 31300, a question-answering service module 31400, and a second decision module 31500.

The third acquisition module 31100 can acquire the question information entered by the user. The question information may be text or voice, for example "What snacks does Beijing have?".

The fourth acquisition module 31200 can derive the user's demand information from the question information. Specifically, the question information is analyzed to obtain the user demand information, which may be, for example, a vertical demand, an Aladdin demand, a deep question-answering demand, or an information search demand.

The first distribution module 31300 can distribute the question information to at least one corresponding question-answering service module 31400 according to the user demand information. The question-answering service modules may include an Aladdin service module 31410, a vertical service module 31420, a deep question-answering service module 31430, and an information search service module 31440.

In one embodiment of the present invention, when the user demand information is an Aladdin demand, the question information is distributed to the Aladdin service module 31410; when it is a vertical demand, to the vertical service module 31420; when it is a deep question-answering demand, to the deep question-answering service module 31430; and when it is an information search demand, to the information search service module 31440.
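The routing rule amounts to a lookup from demand type to service module. The sketch below is only an illustration: the enum, the stub modules that echo their own name, and the `distribute` function are hypothetical, and only the module reference numbers come from the text.

```python
from enum import Enum


class Demand(Enum):
    ALADDIN = "aladdin"
    VERTICAL = "vertical"
    DEEP_QA = "deep_qa"
    SEARCH = "search"


def make_module(name):
    """Build a stub service module that tags the question with its name."""
    def module(question):
        return f"{name}: {question}"
    return module


# Demand type -> question-answering service module (stubs for illustration).
ROUTES = {
    Demand.ALADDIN: make_module("Aladdin service module 31410"),
    Demand.VERTICAL: make_module("vertical service module 31420"),
    Demand.DEEP_QA: make_module("deep question-answering service module 31430"),
    Demand.SEARCH: make_module("information search service module 31440"),
}


def distribute(question, demands):
    """Send the question to every module matching the detected demands."""
    return [ROUTES[demand](question) for demand in demands]


results = distribute("Who is Andy Lau's wife?", [Demand.ALADDIN, Demand.SEARCH])
```

Because a question may be distributed to more than one module, `distribute` returns a list of results, which a later decision step would rank.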

The Aladdin service is a collective name for services that can satisfy user needs precisely, such as USD-to-RMB exchange or the 2015 Spring Festival holiday schedule. For example, if the user's question information is "Who is Andy Lau's wife?", the Aladdin service module 31410 analyzes it and determines that the demand type is "person", the query subject is "Andy Lau", and the query attribute is the colloquial "wife" (老婆), which is normalized to the standard attribute "wife" (妻子). It then searches, obtains the result field "Zhu Liqian", and uses natural language generation (Natural Language Generation) to produce the answer "Andy Lau's wife is Zhu Liqian". As another example, if the user's question information is "Will Beijing be hot tomorrow?", the search returns the result field "35 degrees Celsius", and a common-sense knowledge base and preset rules are used to generate the answer "It will be very hot tomorrow, with a high of 35 degrees Celsius; take care to avoid heatstroke." The common-sense knowledge base may contain common-sense knowledge, for example that a temperature above 30 degrees Celsius counts as hot weather.
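The two Aladdin examples can be sketched with a synonym table, a tiny lookup table, and template-based generation. The table contents, the function names, and the wording of the generated weather sentence are assumptions for illustration; only the 老婆 to 妻子 normalization, the "朱丽倩" result, and the 30-degree rule come from the text.

```python
# Attribute normalization table: colloquial form -> canonical attribute.
SYNONYMS = {"老婆": "妻子"}

# Stand-in knowledge base keyed by (subject, canonical attribute).
KNOWLEDGE = {("刘德华", "妻子"): "朱丽倩"}

HOT_THRESHOLD_C = 30  # common-sense rule: above 30 degrees Celsius is hot


def normalize(attribute):
    """Map a query attribute to its canonical form."""
    return SYNONYMS.get(attribute, attribute)


def answer(subject, attribute):
    """Look up the result field and fill a simple answer template."""
    value = KNOWLEDGE.get((subject, normalize(attribute)))
    if value is None:
        return None
    # The template keeps the user's own wording for the attribute.
    return f"{subject}的{attribute}是{value}"


def weather_answer(high_c):
    """Turn a temperature result field into a sentence using the hot rule."""
    if high_c > HOT_THRESHOLD_C:
        return (f"It will be very hot tomorrow, with a high of {high_c} "
                "degrees Celsius; take care to avoid heatstroke.")
    return f"Tomorrow's high will be {high_c} degrees Celsius."


reply = answer("刘德华", "老婆")
```

Here `reply` is "刘德华的老婆是朱丽倩" ("Andy Lau's wife is Zhu Liqian"): the lookup uses the normalized attribute, while the generated sentence echoes the word the user typed.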

Vertical services conduct multi-round interaction for vertical demands, such as "booking a flight". They rely mainly on dialogue management (Dialogue Management) and dialogue policy (Dialogue Policy) techniques to clarify the user's need and then provide a question-answering result that satisfies it. For example, if the user's question information is "flights from Beijing to Shanghai", the question is analyzed and the user is asked "What is your departure date?"; the user answers "Tomorrow", and the system continues with "Do you have an airline preference?", and so on, clarifying the need step by step and finally returning a result that satisfies it.
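The clarification loop in the flight example resembles slot filling: the system asks about whichever required field is still missing. The sketch below is a deliberately simple stand-in for dialogue management and dialogue policy; the slot list, the fixed asking order, and the scripted answers are assumptions.

```python
# Required slots and the clarification question for each (assumed order).
SLOTS = [
    ("origin", "Where are you departing from?"),
    ("destination", "Where are you flying to?"),
    ("date", "What is your departure date?"),
    ("airline", "Do you have an airline preference?"),
]


def next_question(state):
    """Policy: ask about the first slot that has not been filled yet."""
    for slot, prompt in SLOTS:
        if slot not in state:
            return slot, prompt
    return None


def run_dialogue(state, answers):
    """Fill the missing slots in order, consuming scripted user answers."""
    transcript = []
    answers = iter(answers)
    while (step := next_question(state)) is not None:
        slot, prompt = step
        reply = next(answers)
        transcript.append((prompt, reply))
        state[slot] = reply
    return transcript


# "flights from Beijing to Shanghai" already fills two slots.
state = {"origin": "Beijing", "destination": "Shanghai"}
transcript = run_dialogue(state, ["tomorrow", "no preference"])
```

After the two clarification rounds every slot is filled, which is the point at which a result satisfying the user's need could be retrieved.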

As shown in FIG. 25, the vertical service module 31420 may include a fifth acquisition submodule 31421, a determination submodule 31422, and an interaction submodule 31423.

The fifth acquisition submodule 31421 can acquire the query term entered by the user.

In one embodiment of the present invention, the user can enter the query term in several ways, for example as text, voice, or an image.

When the user inputs by voice or image, the input can be converted into a natural-language query term that the user can readily understand, and the corresponding text is displayed on the interactive interface.

For example, after the user enters a query term by voice, the speech can be converted into the corresponding text using a language model, and the query term is displayed on the interactive interface in natural language.

The determination submodule 31422 can determine the vertical category to which the query term belongs.

Specifically, once the user's query term is obtained, the vertical category it belongs to must be determined, so that the subsequent interaction with the user, or the retrieval of information related to the query term, can take place within that category. The category can be determined in several ways, chosen according to actual needs, for example:

(1) Determining the vertical category of the query term by machine learning.

Specifically, queries related to vertical categories are first mined from search engine logs (including voice search) and labeled, to build training data sets for the vertical categories. Features are then extracted from the training data, and a machine learning classifier (for example, a maximum entropy model or a support vector machine) is trained to classify vertical-demand queries by the extracted features, so as to determine the correspondence between query terms and vertical categories, which is then stored.

Note that, when there are multiple vertical categories, classification can use either a single multi-class model covering all categories or a separate binary model for each category followed by a unified decision.

Specifically, once a query term is obtained, its vertical category can be determined from the stored correspondence. For example, when the user enters the query term "novels by Tiancan Tudou", the query contains an author name and the word "novel", so machine learning can determine that the corresponding category is the novel vertical.

(2) Determining the vertical category of the query term by pattern parsing.

To make pattern-based determination possible, a keyword list is built for each vertical category (for example, novels, food, places, and restaurants) before any category is determined, and the correspondence between categories and keywords is stored.

After the user's query term is received, the entities and keywords in the query are parsed using techniques such as word segmentation and named entity recognition, and the parse result is matched against the pattern sets of the vertical categories; if a match succeeds, the query is dispatched to the corresponding vertical category.

Take the restaurant-finding vertical as an example. Suppose the user's query term is "quiet restaurants near Sanlitun". Basic lexical analysis such as word segmentation and named entity recognition is first applied to the query, which shows that its pattern is [location]_[style]_[restaurant]. A pattern set is mined separately for each category. In other words, a query to be dispatched is first analyzed by basic lexical analysis such as word segmentation and named entity recognition, the result is matched against the pattern sets of the vertical categories, and on a successful match the query is dispatched to the corresponding category.
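The pattern-matching step can be sketched as follows, assuming the query has already been segmented into tokens. The lexicon, the pattern sets, and the function names are hypothetical; a real system would use a word segmenter and a named entity recognizer rather than a lookup table.

```python
# Hypothetical lexicon standing in for segmentation plus entity recognition.
LEXICON = {
    "Sanlitun": "location",
    "quiet": "style",
    "restaurant": "restaurant",
    "novel": "novel",
}

# Hypothetical pattern sets, one per vertical category.
PATTERNS = {
    "restaurant": [("location", "style", "restaurant"), ("location", "restaurant")],
    "novel": [("author", "novel")],
}


def tag(tokens):
    """Keep only tokens with a known type, in order, as a pattern tuple."""
    return tuple(LEXICON[token] for token in tokens if token in LEXICON)


def classify(tokens):
    """Return the vertical whose pattern set contains the query's pattern."""
    pattern = tag(tokens)
    for vertical, patterns in PATTERNS.items():
        if pattern in patterns:
            return vertical
    return None


vertical = classify(["Sanlitun", "near", "quiet", "restaurant"])
```

The tokens reduce to the pattern (location, style, restaurant), which appears in the restaurant pattern set, so the query would be dispatched to the restaurant vertical; a query matching no pattern yields `None`.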

The interaction submodule 31423 can conduct at least one round of interaction with the user within the vertical category of the query term to obtain the query result the user needs. In each round, the information shown to the user includes the query results for the query term and guidance information.

In one embodiment of the present invention, as shown in FIG. 26, the interaction submodule 31423 may include a parsing unit 314231, a fourth acquisition unit 314232, a display unit 314233, and a fifth acquisition unit 314234.

The parsing unit 314231 can parse the query term into structured information that can be expressed in the vertical knowledge system of the category to which the query term belongs.

The vertical knowledge system of each vertical category is built in advance, based on the information provided by structured web pages of that category and on a user demand representation system.

The user demand representation system is a semantic representation of user demands; semantic and structural knowledge can be mined from it.

Note that user demands are determined from query terms. That is, the user demand representation system contains a large number of query terms corresponding to user demands, and analyzing these query terms yields their semantic and structural knowledge.

The vertical knowledge system of each category has a different structure; an example follows.

For example, the structure of the vertical knowledge system for the restaurant vertical is shown in Table 1.

Table 1: Structure of the vertical knowledge system for the restaurant vertical

Figure PCTCN2015096599-appb-000002

通过表1可以看出,餐馆垂类的垂类知识体系中包含各餐馆相关的位置、菜系、口味、环境等多个维度信息,以及各维度可能的取值。It can be seen from Table 1 that the vertical knowledge system of the restaurant category includes multiple dimensions of information related to the location, cuisine, taste, environment, etc. of each restaurant, as well as possible values of each dimension.

第四获取单元314232可根据结构化信息、垂类知识体系,以及,查询词属于的垂类的垂类资源库,获取相关信息。The fourth obtaining unit 314232 may acquire related information according to the structured information, the vertical knowledge system, and the vertical resource library of the vertical class to which the query word belongs.

The related information may include, but is not limited to, the query result and guidance information corresponding to the query word.

In an embodiment of the present invention, as shown in FIG. 27, the interaction sub-module 31423 may further include a component unit 314235.

To obtain the vertical resources of the vertical category to which the query word belongs, the component unit 314235 may obtain the structured resources and unstructured resources of that vertical category and combine them into the vertical resource library.

The structured resources are full data resources obtained by crawling and integrating data from multiple websites of the corresponding vertical category. The unstructured resources are supplementary or extended information for the structured resources, mined from user query words or Internet text.

The following takes novels as an example to describe the process of composing the vertical resources of the novel vertical category from its structured resources and unstructured resources.

The structured resources of a vertical category usually have a complex architecture. In composing the vertical resources of the novel vertical category, the structured resources may be obtained first. Specifically, full data resources may be built by crawling novel information from mainstream Chinese novel websites such as Qidian Chinese Network (起点中文网), Zongheng Chinese Network (纵横中文网), Jinjiang, Hongxiu, 17K Novel Network, and Novel Reading Network (小说阅读网).

Then, for the unstructured resources of the novel vertical category, resources such as novel names, authors, categories, tag words, resource-satisfaction links, novel synopses, novel-related peripheral content, and encyclopedia information may be obtained and integrated. Finally, the integrated resources and the full data resources described above may be saved to the vertical resource library, completing the storage of the vertical resources of the novel vertical category.

It should be understood that, for other vertical categories, the process of obtaining the corresponding vertical resources is the same as that for the novel vertical category and is not repeated here.

In an embodiment of the present invention, as shown in FIG. 28, the fourth obtaining unit 314232 may include an update subunit 3142321, a generating subunit 3142322, and a matching subunit 3142323.

The update subunit 3142321 may update the user's current state information according to the structured information and the user's previous state information.
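One simple way to picture this update is as a slot merge, sketched below. The dictionary representation of state and the rule that newly parsed values override earlier ones are assumptions made for illustration; the text does not specify the update rule.

```python
def update_state(previous_state, structured_info):
    """Return the current state: previous slots, overridden by newly parsed ones.

    The previous state is copied rather than modified, so earlier turns of
    the dialogue remain available unchanged.
    """
    current_state = dict(previous_state)
    current_state.update(structured_info)
    return current_state


previous = {"cuisine": "western"}
current = update_state(previous, {"occasion": "business meal"})
```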

The state space of the dialogue system is constructed, and its interaction strategy initialized, according to the common dialogue flows in the vertical-category scenario. Specifically, after the user inputs a query word for the first time, the user's initial state information may be obtained according to the user's preferences or interaction history.

The generating subunit 3142322 may generate candidate actions corresponding to the current state information according to the vertical knowledge system and the vertical resource library.

The candidate actions may include an action that satisfies the user requirement, an action that further clarifies the user requirement, or horizontal or vertical guidance information provided for the user requirement. The user requirement is determined from the query word.

The matching subunit 3142323 may select, according to a preset model, a preset number of candidate actions that best match the current state information, and use the selected candidate actions as the related information.

Specifically, after the candidate actions corresponding to the current state information are generated, a preset number of candidate actions that best match the current state information may be selected from them based on a preset model, for example a POMDP (partially observable Markov decision process) model. The selected candidate actions are returned to the user as the query result and guidance information for the query word, and are displayed in the current interface of the dialogue-capable application that the user is using.

An action that satisfies the user requirement, or an action that further clarifies the user requirement, serves as the query result once selected. Horizontal or vertical guidance information provided for the user requirement serves as the guidance information once selected.

The preset number is set in advance. For example, suppose the preset number is 5 and that 10 candidate actions are generated for the current state information according to the vertical knowledge system and the vertical resource library. The POMDP model may then select the 5 candidate actions that best match the current state information, and the selected candidate actions are returned to the user as the related information.
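The selection step can be sketched as a top-N pick over scored candidates. Note that this is not a POMDP: the scoring function `overlap_score` is a hypothetical stand-in for the preset model, and the action format (a name plus the slots it addresses) is likewise assumed for illustration.

```python
import heapq


def overlap_score(state, action):
    """Toy matching score: number of action slots that agree with the state."""
    return sum(1 for key, value in action["slots"].items() if state.get(key) == value)


def select_top_actions(state, candidates, score, n=5):
    """Return the n candidate actions with the highest score, best first."""
    return heapq.nlargest(n, candidates, key=lambda action: score(state, action))


state = {"cuisine": "western", "occasion": "business meal"}
candidates = [
    {"name": "ask_occasion", "slots": {}},
    {"name": "list_western", "slots": {"cuisine": "western"}},
    {"name": "recommend", "slots": {"cuisine": "western", "occasion": "business meal"}},
]
selected = select_top_actions(state, candidates, overlap_score, n=2)
```

In a real system the score would come from the model's belief over user states rather than from a fixed overlap count, and the preset number `n` would be configured in advance as described above.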

The presentation unit 314233 may present the query result and guidance information to the user.

After the user inputs a query word again according to the guidance information, the fifth obtaining unit 314234 may repeat the above process of obtaining related information according to the query word until the query result the user needs is obtained.

In an embodiment of the present invention, the parameters of the preset model may also be updated according to user feedback, so that different candidate actions are selected under different parameters. That is, after the user inputs a query word again, the parameters of the preset model may be adjusted according to that query word, so that the preset model selects different candidate actions for the user based on the adjusted parameters. In other words, guidance information and satisfaction information are provided according to the current state information. Different state information corresponds to different guidance and satisfaction information, and the system provides the best satisfaction information and guidance information according to the current state information and the user requirement, so as to guide the user in querying vertical-category information.

For example, if the query word currently input by the user is "western restaurant", the vertical category of the query word can be determined to be the food vertical category, and the user requirement can be determined to be finding a western restaurant for a meal. Because the query word does not reveal what type of western restaurant the user needs, multiple candidate actions may be generated according to the vertical knowledge system and the vertical resource library. The POMDP model selects the 13 candidate actions that best match the current state information, and these are returned to the user as the related information for the query. The query result displayed in the user's current interface is shown in FIG. 7. Because the type of western restaurant cannot be determined from the query word, the user may be prompted with first guidance information to provide more detail, together with possible answers to the first guidance information, namely second guidance information, for the user to select or input. The user may also tap the "next" button to view other answers corresponding to the first guidance information. After the user taps "treating a client to a meal", a restaurant that meets the user requirement can be determined according to the query word currently input, and the query result and guidance information for the current query word are obtained. The interface containing the related information for the current query word is shown in FIG. 8. The user may then, following the guidance information, ask further questions about the restaurant, such as whether it has Wi-Fi or whether parking is convenient.

As another example, if the query word currently input by the user is "novels by Tiancan Tudou" (天蚕土豆), semantic analysis of the received query word can determine that it contains the name of a novel author. The vertical category of the query word can be determined to be the novel vertical category, and the user can be determined to want to look up books by author name. The corresponding candidate actions may be obtained according to the author name, and the related information for the query word is displayed in the application the user is using, in the interface form shown in FIG. 9. The user may then tap the desired book title. In addition, the user may tap the first button to log in to an account or to clear the message history.

As yet another example, if the query word currently input by the user is "delicious Korean barbecue", the vertical category of the received query word can be determined to be the restaurant and food vertical category. Specifically, the query word may be parsed into structured information that the vertical knowledge system can represent. The query result and guidance information for the query word are then obtained according to the structured information, the vertical knowledge system, and the vertical resource library of the vertical category to which the query word belongs, and are returned to the user. The user interface containing the related information for the query word is shown in FIG. 10. The user may then choose another restaurant according to the guidance information, or confirm this restaurant directly. The user may also tap the "next" prompt button to view other guidance information.

In summary, the artificial-intelligence-based information query method of this embodiment has the following beneficial effects. (1) Compared with searching through a search engine, the information query method of this embodiment does not require the user to have a deep understanding of the vertical category. Through multiple rounds of interaction, it guides the user to describe the requirement accurately and provides the corresponding query result and guidance information. (2) Compared with browsing vertical-category websites, the information query method of this embodiment does not require the user to browse a large number of webpages or to filter out useless information manually. It filters useless information intelligently and provides the user only with information related to the query word.

(3) Compared with related dialogue systems, the information query method of this embodiment applies specific processing to the complexity of vertical-category resource structures and generates a state space based on the entity structure of the vertical category, so that deep questions within the vertical category can be answered. It also prompts the user, through guidance information, to input a query word again for the next round of querying. That is, by displaying guidance information, the information query method of this embodiment effectively guides the user to ask the right question.

In an embodiment of the present invention, as shown in FIG. 29, the deep question and answer service module 31430 may include a first receiving submodule 31431, a first obtaining submodule 31432, and a generating submodule 31433.

Specifically, the deep question and answer service provides the user with accurate question-and-answer results for the question information the user inputs, based on in-depth semantic analysis and knowledge mining. When the user requirement information is a deep question-and-answer requirement, the first receiving submodule 31431 may receive the question information, the first obtaining submodule 31432 obtains the corresponding question type according to the question information, and the generating submodule 31433 then selects the corresponding answer generation mode according to the question type and generates the corresponding question-and-answer result according to the selected answer generation mode and the question information. The question type may include an entity type, an opinion type, and a fragment type.

More specifically, when the question type is the entity type, as shown in FIG. 30, the generating submodule 31433 may include a first generating unit 314331, an expansion unit 314332, an extraction unit 314333, a first calculating unit 314334, and a first feedback unit 314335.

The first generating unit 314331 may generate entity-type question information according to the question information, and the expansion unit 314332 expands the entity-type question information based on snippets crawled by the search engine and on historical display logs, generating a cluster of same-family entity question information in which each item corresponds to a candidate answer. The extraction unit 314333 then extracts candidate entities from those candidate answers, the first calculating unit 314334 calculates the confidence of each candidate entity, and the first feedback unit 314335 returns the candidate entities whose confidence exceeds a preset confidence threshold as the question-and-answer result. For example, the question information is "Who is Andy Lau's wife?", and a candidate answer is "In fact, it was reported as early as 1992 that Andy Lau and Zhu Liqian had secretly registered their marriage in Canada...", in which the candidate entities are "Andy Lau", "Zhu Liqian", and "Canada". The confidence of each candidate entity is then calculated based on the entity knowledge base and a question-answer semantic matching model. The confidence of the candidate entity "Zhu Liqian" is found to exceed the preset confidence threshold, so "Zhu Liqian" is determined to be the question-and-answer result. In addition, the clause of the candidate answer in which "Zhu Liqian" first appears may be used as the answer summary.
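The thresholding and summary steps of the entity-type flow can be sketched as follows. The confidence values are invented for illustration only (the text gives none), and the functions `answer_entities` and `first_clause_with` are hypothetical names; the actual confidence would come from the entity knowledge base and the semantic matching model.

```python
import re


def answer_entities(confidences, threshold):
    """Return entities whose confidence exceeds the threshold, highest first."""
    kept = [(entity, conf) for entity, conf in confidences.items() if conf > threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [entity for entity, _ in kept]


def first_clause_with(answer, entity):
    """Return the first clause of the answer that mentions the entity, or None."""
    for clause in re.split(r"[,.;!?]", answer):
        if entity in clause:
            return clause.strip()
    return None


# Made-up confidences for the three candidate entities in the example.
confidences = {"Andy Lau": 0.10, "Zhu Liqian": 0.92, "Canada": 0.05}
result = answer_entities(confidences, threshold=0.5)
summary = first_clause_with(
    "It was reported in 1992, Andy Lau and Zhu Liqian registered in Canada.",
    "Zhu Liqian",
)
```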

When the question type is the opinion type, as shown in FIG. 31, the generating submodule 31433 may include a first obtaining unit 314336, a first segmentation unit 314337, a first aggregation unit 314338, a judging unit 314339, a selection unit 3143310, and a score feedback unit 3143311.

The first obtaining unit 314336 may obtain candidate answers corresponding to the question information, the first segmentation unit 314337 segments the candidate answers to generate multiple candidate answer short sentences, and the first aggregation unit 314338 then aggregates these short sentences to generate opinion aggregation clusters. Specifically, keywords in the candidate answer short sentences may be extracted according to the IDF (inverse document frequency) scores of the words in each short sentence. Keywords containing negation words are generalized and given negation labels. The keywords are then represented as vectors based on the negation labels, and the vector angle and/or semantic similarity between every two keywords is calculated. Finally, candidate answers whose vector angle is smaller than a preset angle, or whose semantic similarity is greater than a preset threshold, are aggregated to generate the opinion aggregation clusters.
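The angle-based aggregation can be sketched with a greedy, single-pass clustering over vectors. The vectors below are toy two-dimensional values, the greedy strategy (comparing each item to the first member of each existing cluster) is an assumption, and the IDF keyword extraction and negation handling described above are not modeled here. Vectors are assumed to be non-zero.

```python
import math


def angle_deg(u, v):
    """Angle in degrees between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cosine))


def cluster_by_angle(vectors, max_angle_deg):
    """Group vector indices; an item joins the first cluster whose seed
    (first member) is within max_angle_deg of it, else starts a new cluster."""
    clusters = []
    for index, vector in enumerate(vectors):
        for cluster in clusters:
            if angle_deg(vectors[cluster[0]], vector) < max_angle_deg:
                cluster.append(index)
                break
        else:
            clusters.append([index])
    return clusters


clusters = cluster_by_angle([(1.0, 0.0), (0.9, 0.1), (0.0, 1.0)], max_angle_deg=20.0)
```

The first two vectors are about 6 degrees apart and fall into one cluster, while the third is 90 degrees away and forms its own.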

After that, the judging unit 314339 may determine the opinion type of each opinion aggregation cluster. Opinion types may include yes/no, evaluation, suggestion, and so on. Specifically, the opinion type of an opinion aggregation cluster may be determined by preset rules or based on a statistical model. The selection unit 3143310 then selects an answer opinion from the corresponding opinion aggregation cluster according to the opinion type. The rules for selecting an answer opinion may include, but are not limited to, selecting the answer opinion with the most comprehensive information coverage, selecting the answer opinion with the lowest IDF*log(IDF) value, and selecting the answer opinion that appears most often in the articles corresponding to the candidate answers, where IDF is the inverse document frequency. After that, a summary corresponding to the answer opinion may be generated, the answer opinion may be scored, and answer opinions whose scores exceed a preset score threshold are returned as the question-and-answer result. For example, the question information is "precautions during pregnancy", and one candidate answer is "During pregnancy, follow the 'doctor, more, overcome' principles, that is, see a doctor regularly, rest in bed more, and overcome your own bad habits." This candidate answer may be segmented into four candidate answer short sentences: "During pregnancy, follow the 'doctor, more, overcome' principles", "that is, see a doctor regularly", "rest in bed more", and "overcome your own bad habits". Repeated or similar content among the candidate answer short sentences may then be aggregated to generate opinion aggregation clusters, from which the answer opinion is selected. The score feedback unit 3143311 may then score the answer opinion according to information richness, sufficiency of argument, information redundancy, and so on, and return the answer opinions whose scores exceed the preset score threshold as the question-and-answer result. In addition, after an answer opinion is selected, the sentence containing it in the source article may be obtained and truncated to a predetermined length to generate the summary corresponding to that answer opinion. The summaries may then be ranked according to content richness and answer authority.

When the question type is the fragment type, as shown in FIG. 32, the generating submodule 31433 may include a second obtaining unit 3143312, a second segmentation unit 3143313, a scoring unit 3143314, a second generating unit 3143315, a first sorting unit 3143316, and a second feedback unit 3143317.

The second obtaining unit 3143312 may obtain candidate answers corresponding to the question information, and the second segmentation unit 3143313 segments the candidate answers to generate multiple candidate answer short sentences. The scoring unit 3143314 then scores the importance of these short sentences to generate the corresponding short-sentence importance features, and the second generating unit 3143315 generates an answer summary according to those features. The first sorting unit 3143316 may then score answer quality according to the short-sentence importance features of the answer summary, the authority of the answer, its relevance to the question information, and the richness of the answer. The short-sentence importance features may include aggregation features, relevance features, type features, and question-answer matching features. Aggregation features measure how often a short sentence is repeated, for example word-vector centroid features, N-gram (occurrence probability) features, and LexRank (multi-document automatic summarization) features. Type features describe the type of the question, such as WHAT, WHY, or HOW. Answer authority is the authority of the website from which the answer comes. After that, the user's behavior data may be obtained, the candidate answers are ranked according to the behavior data and the scoring results, and the second feedback unit 3143317 finally returns the ranking as the question-and-answer result. The user's behavior data may include historical behavior information such as clicks on question-and-answer results, time spent on a result, and jumps from the current result to other results.

In an embodiment of the present invention, as shown in FIG. 33, the information search service module 31440 may include a second receiving submodule 31441, a first search submodule 31442, and an analysis feedback submodule 31443. The analysis feedback submodule 31443 may include an analysis unit 314431, a second sorting unit 314432, and a third generating unit 314433.

When the user requirement information is an information search requirement, the second receiving submodule 31441 may receive the question information, the first search submodule 31442 searches according to the question information to generate multiple candidate webpages, and the analysis unit 314431 of the analysis feedback submodule 31443 may then perform discourse analysis on the candidate webpages to generate the corresponding candidate texts. Specifically, content extraction, topic segmentation, and discourse relation analysis may be applied to each candidate webpage. Content extraction mainly identifies the body text of the candidate webpage and removes content unrelated to the user requirement information. Topic segmentation analyzes the topic structure of the text and may divide it into multiple subtopics. Discourse relation analysis examines the relations among those subtopics, for example a parallel relation. After the candidate texts are generated, the second sorting unit 314432 may score and rank the sentences in them. The scoring is based mainly on the importance of a sentence within the candidate text and on its relevance to the user requirement information. After that, the third generating unit 314433 may obtain the user's requirement scenario information, generate a summary according to the scenario information and the ranking results, and finally return the summary as the question-and-answer result. The scenario information may include a mobile terminal scenario and a computer scenario. In the mobile terminal scenario, sentences may be compressed and abbreviated so that the summary is as concise as possible. In the computer scenario, sentences may be spliced and merged so that the summary is detailed and clear.
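A rough sketch of scenario-dependent summary generation is given below. Keeping only the single best sentence for the mobile scenario, and joining the top few sentences in document order for the computer scenario, are assumptions chosen to illustrate "concise" versus "detailed"; they stand in for the sentence compression and fusion the text describes. The scenario labels and the function name are hypothetical.

```python
def build_summary(scored_sentences, scene, max_sentences=3):
    """Build a summary from (sentence, score) pairs given in document order.

    scene == "mobile": keep only the highest-scoring sentence.
    otherwise: keep the top max_sentences, restored to document order.
    """
    ranked = sorted(
        range(len(scored_sentences)),
        key=lambda i: scored_sentences[i][1],
        reverse=True,
    )
    if scene == "mobile":
        keep = ranked[:1]
    else:
        keep = sorted(ranked[:max_sentences])
    return " ".join(scored_sentences[i][0] for i in keep)


sentences = [("A.", 0.2), ("B.", 0.9), ("C.", 0.5), ("D.", 0.1)]
mobile_summary = build_summary(sentences, "mobile")
desktop_summary = build_summary(sentences, "computer")
```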

As shown in FIG. 34, the analysis feedback submodule 31443 may further include a second aggregation unit 314434.

Specifically, because the content of every candidate text is relevant to the user requirement information, the generated candidate texts may contain repeated or complementary content, so the second aggregation unit 314434 is needed to aggregate the information from the multiple candidate texts.

The second decision module 31500 may receive the question-and-answer results returned by at least one question-and-answer service module and make a decision on them to determine the final question-and-answer result.

As shown in FIG. 35, the chat service subsystem 32000 may include a first receiving module 32100, a second distribution module 32200, a chat service module 32300, a second receiving module 32400, and a sorting module 32500.

The first receiving module 32100 is configured to receive the input information entered by the user.

The input information may be voice information or text information.

The second distribution module 32200 is configured to distribute the input information to the chat service module 32300.

The second receiving module 32400 is configured to receive the candidate replies returned by the multiple chat service modules 32300. Each candidate reply has a corresponding confidence.

The sorting module 32500 is configured to rank the candidate replies based on their confidence, generate chat information according to the ranking, and provide the chat information to the user.

Specifically, the sorting module 32500 may obtain features of the user's input information and rank the candidate replies based on those features and on the confidence. The features of the input information may include classification features, literal features, topic features, and so on. The higher the confidence, the better the quality of the candidate reply. The candidate replies may be ranked from highest to lowest confidence, and chat information that meets the user's needs is finally provided to the user.
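The confidence-based ordering can be sketched as follows. The reply records and their field names are hypothetical, and this sketch ranks by confidence alone; the use of input-information features in the ranking is not modeled.

```python
def rank_replies(candidate_replies):
    """Return candidate replies ordered from highest to lowest confidence."""
    return sorted(candidate_replies, key=lambda reply: reply["confidence"], reverse=True)


# Hypothetical replies from different chat service modules.
candidate_replies = [
    {"module": "search", "text": "It is sunny today.", "confidence": 0.62},
    {"module": "knowledge", "text": "Sunny, 25 degrees.", "confidence": 0.81},
    {"module": "crowdsourcing", "text": "No idea.", "confidence": 0.15},
]
ranked = rank_replies(candidate_replies)
chat_information = ranked[0]["text"]
```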

As shown in FIG. 36, the chat service module 32300 may include a search-based chat module 32310, a rich-knowledge chat module 32320, a portrait-based chat module 32330, and a crowdsourcing-based chat module 32340.

其中,如图37所示,基于搜索的聊天模块32310可包括切词子模块32311、查询子模块32312、过滤子模块32313、分类子模块32314和第一重排序子模块32315。其中,过滤子模块32313可包括第二计算单元323131、过滤单元323132、保留单元323133,分类子模块32314可包括第三计算单元323141、分类单元323142,第一重排序子模块32315可包括第三获取单元323151和重排序单元323152。As shown in FIG. 37, the search-based chat module 32310 may include a word-cutting sub-module 32311, a query sub-module 32312, a filtering sub-module 32313, a classification sub-module 32314, and a first re-sorting sub-module 32315. The filtering submodule 32313 may include a second computing unit 323131, a filtering unit 323132, a retaining unit 323133, the classification submodule 32314 may include a third computing unit 323141, a classification unit 323142, and the first reordering submodule 32315 may include a third acquisition. Unit 323151 and reordering unit 323152.

Specifically, the word segmentation submodule 32311 may segment the input information to generate a plurality of chat phrases, and the query submodule 32312 may then query a chat corpus with these chat phrases to generate a plurality of chat corpus context sentences (the preceding utterances) and the plurality of chat corpus response sentences (the following utterances) corresponding to them. The chat corpus is built in advance; its sources may include, but are not limited to, "post-reply" pairs in forum data such as Tieba, "blog post-reply" pairs in microblogs, and "question-answer" pairs in Q&A communities.

After that, the filtering submodule 32313 may filter the chat corpus context sentences. Specifically, the second calculation unit 323131 may calculate the similarity between the input information and each chat corpus context sentence. If the similarity is less than a first preset similarity threshold, the filtering unit 323132 may filter out the corresponding context sentence; if the similarity is greater than or equal to the first preset similarity threshold, the retaining unit 323133 may retain the corresponding context sentence.
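The threshold filter can be sketched as follows. The text does not name a similarity measure at this step, so the token-overlap (Jaccard) similarity, the function names, and the default threshold of 0.5 are all illustrative assumptions.

```python
def jaccard_similarity(a, b):
    """Token-overlap similarity in [0, 1]; a stand-in for the unspecified similarity measure."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def filter_context_sentences(user_input, pairs, threshold=0.5):
    """Keep (context, response) pairs whose context similarity is >= threshold.

    Pairs below the threshold are filtered out, mirroring the filtering unit
    and the retaining unit described in the text.
    """
    return [
        (context, response)
        for context, response in pairs
        if jaccard_similarity(user_input, context) >= threshold
    ]
```

Only the context sentence is compared with the user's input here; the response sentences are carried along so that they can be classified in the next step.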

After the chat corpus context sentences have been filtered, the classification submodule 32314 may classify the chat corpus response sentences corresponding to the remaining context sentences. Specifically, the third calculation unit 323141 may calculate the similarity between the input information and each response sentence, and the classification unit 323142 classifies the response sentences according to these similarities using a machine learning model such as GBDT (Gradient Boosting Decision Tree) or SVM (Support Vector Machine). The similarity between the input information and a response sentence may be a literal similarity, a similarity obtained by training a deep neural network, or a similarity obtained by training a machine translation model. It should be understood that the similarity between the input information and the response sentences, as well as machine learning models such as GBDT and SVM, are well-known techniques in this embodiment and are not described in detail here.

The first re-ranking submodule 32315 may then re-rank the classified response sentences and generate candidate replies according to the ranking result. Specifically, the third obtaining unit 323151 may obtain the user's chat attributes from the preceding context of the user's chat, and the re-ranking unit 323152 re-ranks the classified response sentences according to the chat attributes using a learning-to-rank model. The chat attributes may include the occasion of the chat (such as time and place), how entertaining the chat is, the style of the chat, and so on. Of course, the chat attributes need not be obtained only from the preceding context; they may also be obtained from the user's long-term chat history. It should be understood that the learning-to-rank model in this embodiment is a well-known technique and is not described in detail here.

As shown in FIG. 38, the knowledge-rich chat module 32320 may include a second search submodule 32321, an extraction submodule 32322, a rewriting submodule 32323, and a second re-ranking submodule 32324. Specifically, the second search submodule 32321 may generate search terms from the input information and search with them to produce a plurality of search results. The extraction submodule 32322 then extracts sentences from the search results to obtain a candidate sentence set consisting of sentences whose similarity to the search terms is greater than a second preset similarity threshold. After that, the rewriting submodule 32323 may rewrite the sentences in the candidate sentence set to generate candidate replies. In addition, the second re-ranking submodule 32324 may re-rank the sentences in the candidate sentence set according to the user's chat attributes. For example, if the input information is "I hope I get the chance to travel to Mount Fuji", the input information may be parsed to generate the search terms "Mount Fuji, travel", a plurality of search results are obtained with these terms, and sentences highly similar to the search terms are extracted. Some of these sentences may contain phrases such as "the reporter learned that", which are obviously excerpted from web page text, so they need to be rewritten to be more fluent and closer to natural chat. The candidate reply finally generated is "Because of the weather, Mount Fuji can be climbed only during a designated period in summer each year". Compared with a conventional reply such as "I want to go to Mount Fuji too, let's go together.", this reply is informative and timely, so the user can pick up useful knowledge while chatting.

As shown in FIG. 39, the portrait-based chat module 32330 may include a second obtaining submodule 32331, a first judging submodule 32332, a sending submodule 32333, and a first updating submodule 32334.

To achieve better anthropomorphism and provide users with personalized service, the human-machine chat system may define its own attributes, states, interests, and so on, which constitute a system portrait model. It may likewise define the user's attributes, states, interests, and so on, which constitute a user portrait model. When facing different users, the system may use a single system portrait model, or a separate system portrait model may be configured for each user. Both the system portrait model and the user portrait model are based on a portrait knowledge graph, which is a hierarchical knowledge system. For example, the "family members" node may include two child nodes, "siblings" and "parents", and the "parents" node in turn includes two child nodes, "father" and "mother". Each node corresponds to a plurality of input-information template clusters; for example, "Who is your father", "Your father is who", and "What is your father's name" belong to the same template cluster. Each template cluster corresponds to one or more candidate replies. Template clusters and candidate replies may contain variables; for example, the words "interest", "hobby", and "pastime" map to the same attribute "INTEREST", whose values may include hiking, music, reading, sports, and so on.
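One way to picture the hierarchical portrait knowledge graph is the Python sketch below. The class names, the exact-match lookup, the `interests` and `root` nodes, and the sample templates are illustrative assumptions; only the family hierarchy and the `INTEREST` attribute come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class TemplateCluster:
    templates: list  # paraphrases that ask the same thing
    replies: list    # candidate replies; may contain {VARIABLE} placeholders


@dataclass
class PortraitNode:
    name: str
    children: list = field(default_factory=list)
    clusters: list = field(default_factory=list)


def find_reply(node, user_input, portrait):
    """Depth-first search for a cluster containing user_input; fill variables from portrait."""
    for cluster in node.clusters:
        if user_input in cluster.templates:
            return cluster.replies[0].format(**portrait)
    for child in node.children:
        reply = find_reply(child, user_input, portrait)
        if reply is not None:
            return reply
    return None


# Hierarchy from the text: family members -> siblings, parents -> father, mother.
parents = PortraitNode("parents", children=[PortraitNode("father"), PortraitNode("mother")])
family = PortraitNode("family members", children=[PortraitNode("siblings"), parents])
# Hypothetical node whose replies use the INTEREST variable.
interests = PortraitNode("interests", clusters=[TemplateCluster(
    templates=["What are your hobbies", "What do you like to do"],
    replies=["I like {INTEREST}."],
)])
root = PortraitNode("root", children=[family, interests])
system_portrait = {"INTEREST": "hiking"}
```

With these definitions, `find_reply(root, "What are your hobbies", system_portrait)` fills the placeholder from the system portrait model. A real system would match inputs to templates far more loosely than by exact string equality.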

Specifically, the second obtaining submodule 32331 may obtain the user's chat context, and the first judging submodule 32332 judges from the chat context whether a collection condition is satisfied. If the collection condition is satisfied, the sending submodule 32333 may send a question to the user. After that, the first updating submodule 32334 may receive the user's answer to the question and update the user portrait model according to the answer. For example, when chatting with the user about a movie-related topic, the system may send the question "What movies do you like?"; or, when the user asks the human-machine chat system "What do you like to eat?", the system may ask the user in return "What do you like to eat?". After the user answers, the user portrait model may be updated based on the answer so that it better matches the user's personalized needs.

In addition, as shown in FIG. 40, the portrait-based chat module 32330 may further include a third obtaining submodule 32335, an extraction submodule 32336, and a second updating submodule 32337.

Specifically, the third obtaining submodule 32335 may obtain the user's chat content, the extraction submodule 32336 extracts user portrait data from the chat content, and the second updating submodule 32337 then updates the user portrait model according to the extracted user portrait data. For example, if the user says during a chat "I like hiking and fishing in my free time.", the user portrait data "likes hiking, likes fishing" may be extracted and used to update the user portrait model. At the same time, a suitable answer may be selected based on the user portrait data and returned to the user.

Crowdsourcing is a method of outsourcing a specific task to unspecified users on the Internet. In human-machine chat, questions that the machine finds difficult to answer can be distributed to crowd workers, who reply manually online and in real time, thereby meeting the user's actual needs.

Specifically, the crowdsourcing-based chat module 32340 may judge whether the input information is suitable for crowdsourcing. For example, if the user is feeling low and needs comforting, the input is suitable for crowdsourcing. If the user's input information contains private information such as personal identity information, passwords, or telephone numbers, it is not suitable for crowdsourcing.

If the input is judged suitable for crowdsourcing, the crowdsourcing-based chat module 32340 may distribute the input information to the corresponding crowd workers. The preceding context may be sent along as well, so that a worker can reply based on both the preceding context and the input information. The crowdsourcing-based chat module may then receive the worker's reply and judge its quality. If the reply meets the quality requirements, it is used as a candidate reply. For example, a reply that contains vulgar, reactionary, or pornographic content fails the quality check. Likewise, if the worker takes longer than a predetermined duration to reply, that worker's reply is not adopted, and the reply may at the same time be saved to the chat corpus.
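The two gates in the crowdsourcing flow (suitability of the input, then quality of the worker's reply) can be sketched as below. The regular expressions, the placeholder blocked-word list, the 30-second limit, and the function names are all assumptions for illustration; the text does not say how privacy or content checks are implemented.

```python
import re

# Hypothetical patterns standing in for a real privacy detector.
PRIVACY_PATTERNS = [
    re.compile(r"\b\d{7,}\b"),               # long digit runs, e.g. phone or ID numbers
    re.compile(r"password", re.IGNORECASE),  # explicit mention of a password
]
# Placeholder for a real list of vulgar, reactionary, or pornographic terms.
BLOCKED_WORDS = {"blockedword"}


def suitable_for_crowdsourcing(user_input):
    """Inputs that appear to contain private information are not sent to crowd workers."""
    return not any(pattern.search(user_input) for pattern in PRIVACY_PATTERNS)


def accept_worker_reply(reply, elapsed_seconds, max_seconds=30.0):
    """Accept a reply only if it arrived in time and passes the content check."""
    if elapsed_seconds > max_seconds:
        return False
    return not (set(reply.lower().split()) & BLOCKED_WORDS)
```

A reply rejected only for lateness could still be stored in the chat corpus, as the text notes; that storage step is left out of the sketch.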

In addition, as shown in FIG. 41, the chat service subsystem 32000 may further include an error correction module 32600.

The error correction module 32600 is configured to correct and/or rewrite the input information after it is received from the user, in order to fix typos in the input information, rewrite irregular colloquial expressions, and so on.

In addition, as shown in FIG. 42, the chat service subsystem 32000 may further include an analysis module 32700.

The analysis module 32700 is configured to perform domain analysis on the input information after it is received from the user, in order to obtain the domain to which the input information belongs. The second distribution module 32200 may then distribute the input information, according to that domain, to the chat service modules that cover the same or a similar domain.

In addition, as shown in FIG. 43, the chat service subsystem 32000 may further include a fifth obtaining module 32800, a first judging module 32900, and a completion module 321000.

The fifth obtaining module 32800 is configured to obtain, after the input information is received from the user, the preceding context of the chat with the user, and to derive the user's current topic information from that context. The first judging module 32900 may then judge, based on the preceding context, whether the dependency between the input information and the preceding context is greater than a preset relationship threshold. When the dependency is greater than the preset relationship threshold, the completion module 321000 may complete the input information using the preceding context, which keeps the human-machine chat fluent. Specifically, completing the input information may include coreference resolution. For example, if the input information is "Is he married?", the "he" in the input information can be replaced with "Andy Lau" based on the preceding context "Andy Lau". Completing the input information may also include ellipsis completion. For example, if the preceding context is "Andy Lau's wife is named Zhu Liqian." and the input information is "I don't know.", the input information can be completed as "I don't know Zhu Liqian.".

In addition, as shown in FIG. 44, the chat service subsystem 32000 may further include a second judging module 321100, a seventh obtaining module 321200, a first generating module 321300, and a second generating module 321400.

The second judging module 321100 is configured to judge whether the input information is chat information with no substantive content, such as "hehe" or "hoho". If it is, the seventh obtaining module 321200 may obtain the current topic, that is, compute the current topic from the chat history using a topic model. After the current topic is obtained, the first generating module 321300 may generate a guidance topic from the current topic based on a topic chat graph. The topic chat graph is a directed graph whose nodes are topics. For example, if the node "leisure" points to the nodes "watching movies" and "listening to music", the conversation can be guided from the topic "leisure" to the topic "watching movies" or the topic "listening to music". The topics "watching movies" and "listening to music" each have a guidance probability, and topic guidance can be carried out according to these probabilities, which keeps the guidance topics diverse. The second generating module 321400 may then generate a candidate reply from the guidance topic. Specifically, a template for the candidate reply may be generated with a natural language generation model and the guidance topic filled into the template to produce the candidate reply; alternatively, a sentence containing the guidance topic may be selected as the candidate reply. In this way the system proactively guides the user's chat topics.
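The probability-driven topic guidance can be sketched in Python as follows. The graph contents, the probabilities 0.6 and 0.4, the reply template, and the function names are illustrative assumptions; the text only states that each outgoing topic has a guidance probability and that a template can be filled with the chosen topic.

```python
import random

# Directed topic chat graph: topic -> list of (next_topic, guidance_probability).
# The probabilities are made-up values for illustration.
TOPIC_GRAPH = {
    "leisure": [("watching movies", 0.6), ("listening to music", 0.4)],
}


def pick_guidance_topic(current_topic, graph=TOPIC_GRAPH, rng=random):
    """Sample the next topic in proportion to its guidance probability."""
    edges = graph.get(current_topic)
    if not edges:
        return None
    topics = [topic for topic, _ in edges]
    weights = [weight for _, weight in edges]
    return rng.choices(topics, weights=weights, k=1)[0]


def make_candidate_reply(guidance_topic, template="Do you feel like {topic}?"):
    """Fill the guidance topic into a reply template (the template is hypothetical)."""
    return template.format(topic=guidance_topic)
```

Sampling rather than always taking the most probable edge is what gives the guidance topics their diversity across conversations.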

As shown in FIG. 45, the guidance and recommendation service subsystem 33000 may include a determining module 33100, an acquiring module 33200, an eighth obtaining module 33300, and a feedback module 33400.

The determining module 33100 is configured to receive interaction information entered by the user and to determine the current topic from the interaction information.

Specifically, the determining module 33100 may first receive interaction information entered by the user, for example "Is Inception any good?", and then perform need recognition and relevance calculation on the interaction information, thereby determining that the current topic is "Inception reviews".

The acquiring module 33200 is configured to obtain, based on a topic graph, a plurality of candidate guidance topics related to the current topic.

The topic graph may include a plurality of topics and the associations between them.

Specifically, the acquiring module 33200 may obtain a plurality of candidate guidance topics related to the current topic based on a topic graph built in advance. For example, if the current topic is "Inception reviews", a plurality of guidance topics related to "Inception reviews", such as "films directed by Nolan" and "films starring Leonardo", can be obtained from the topic graph, together with their associations with "Inception reviews".

The eighth obtaining module 33300 is configured to obtain the user's user portrait data.

User portrait data is a collection of data such as the user's attributes, states, and interests. It can be obtained from the user's active input or from the user's historical interaction records and then consolidated to generate personalized user portrait data for the user.

The feedback module 33400 is configured to select a guidance topic from the plurality of candidate guidance topics related to the current topic according to the user portrait data, and to feed the guidance topic back to the user.

Specifically, the feedback module 33400 may determine the user's intent information from the user portrait data and the context of the interaction information, then select a guidance topic from the plurality of candidate guidance topics related to the current topic according to that intent information, and feed the guidance topic back to the user.

For example, a guidance topic may be an extension of the current topic. If the interaction information is "How do I cook chicken?", the current topic may be "how to cook chicken". After the interaction on the current topic ends, the current topic can be extended; combined with user portrait data such as "the user is pregnant", the guidance topic "What is the best way for pregnant women to eat chicken" may be fed back to the user.

Of course, a guidance topic may also be a recommendation based on the current topic. If the interaction information is "Is Inception any good?", the current topic may be "Inception reviews". After the interaction on the current topic ends, the guidance topic "Nolan's films" may be fed back to the user based on the current topic combined with user portrait data such as "the user likes watching movies".

When the user's intent information cannot be determined from the user portrait data and the context of the interaction information, the user's intent needs to be clarified. For example, the interaction information is "How do I get to the Palace Museum?", but Beijing, Shenyang, and Taipei each have a "Palace Museum". The user's intent therefore needs to be clarified, and a clarifying question such as "Which Palace Museum would you like to go to?" can be returned to the user based on the interaction information.

In addition, as shown in FIG. 46, the guidance and recommendation service subsystem 33000 may further include a building module 33500.

The building module 33500 is configured to build the topic graph.

A node in the topic graph represents a topic or a need raised by users. Each node may contain replies for the corresponding topic and resources that satisfy the user's need, and associated nodes are linked by edges, forming a net-like topic graph.

Specifically, the building module 33500 may include a fourth obtaining submodule 33510 and a building submodule 33520.

The fourth obtaining submodule 33510 may obtain topic association data.

The fourth obtaining submodule 33510 may obtain topic association data in one of two ways.

In the first case, web text data is obtained first, and topic association data is obtained from it. Web text data can be divided into unstructured data, semi-structured data, and structured data.

When the web text data is unstructured data, topic association data can be obtained through entity extraction and syntactic analysis. Unstructured data may include news, forums, blogs, videos, and the like. For example, take the web text "The much-anticipated Nobel Prize in Literature has found its owner: the Frenchman Modiano is the newly crowned lucky winner. Of course, Haruki Murakami, who has been nominated many times and has always narrowly missed the prize, remains 'the person closest to the Nobel Prize'. The Chinese poet Bei Dao, too, merely gave his compatriots one burst of excitement." Entity extraction can extract the entities "Nobel Prize in Literature", "the Frenchman Modiano", "Haruki Murakami", and "the Chinese poet Bei Dao", and syntactic analysis can establish that these entities are associated. Further, it can be inferred that the Frenchman Modiano is a winner of the Nobel Prize in Literature, and that Haruki Murakami and the Chinese poet Bei Dao did not win it.

When the web text data is semi-structured data, topic association data is obtained through page structure analysis, tag extraction, and entity recognition. Semi-structured data may include encyclopedia data such as Wikipedia and Baidu Baike, topical feature data, and the like. For example, page structure analysis, tag extraction, and entity recognition can reveal that the "off-court life" of "Djokovic" includes "family life" and "charity work".

When the web text data is structured data, topic association data is obtained from a knowledge graph. Structured data may include knowledge graph data. For example, the director of the films "Inception" and "Interstellar" is "Christopher Nolan".

In the second case, the user's behavior data is obtained first, and topic association data is then generated from the behavior data. Behavior data may include search behavior data and browsing behavior data.

Specifically, the user's search behavior data may be obtained, the corresponding search objects derived from it, and topic association data generated from the search objects. For example, if the user searches in succession for "Nolan", "Nolan's films", and "Christian Bale", these topics can be associated with one another to generate topic association data.
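Associating queries that occur in the same run of searches can be sketched with a simple co-occurrence count. Treating each run of consecutive searches as a "session", counting unordered pairs, and the function name are assumptions for illustration; the text does not specify how the association is computed.

```python
from collections import Counter
from itertools import combinations


def topic_associations(sessions):
    """Count how often two distinct queries appear in the same search session.

    Each session is a list of query strings; the result maps an alphabetically
    ordered (query_a, query_b) pair to its co-occurrence count.
    """
    counts = Counter()
    for queries in sessions:
        for a, b in combinations(sorted(set(queries)), 2):
            counts[(a, b)] += 1
    return counts
```

Pairs with higher counts would be stronger candidates for edges in the topic graph; the same counting could be applied to items clicked during a browsing session.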

Of course, the user's browsing behavior data may also be obtained, the corresponding browsing objects derived from it, and topic association data generated from the browsing objects. For example, multiple news items or videos that the user clicks while browsing web pages can be associated with one another to generate topic association data.

After the topic association data is obtained, the building submodule 33520 may build the topic graph from it using one or more of the RandomWalk algorithm, an association analysis algorithm, and a collaborative filtering algorithm.

For example, as shown in FIG. 17, q1, q2, q3 and q1', q2', q3', q4' are topics, and d1, d2, d3, and d4 are resource data. As can be seen from FIG. 13, resource data d1 and d2 are associated with topic q1; resource data d1, d2, and d3 are associated with topic q2; and resource data d4 is associated with topic q3. Topics and resource data that are associated are connected by solid lines. Using the RandomWalk algorithm, it can be computed iteratively that topic q1 and resource data d3 are also associated, and they are connected by a dashed line. Topic q1' is a topic that the user raises, based on resource data d1 or d4, after browsing that resource data, so the association between them is sequential. Similarly, topic q2' is raised based on resource data d2, topic q3' based on resource data d2 or d3, and topic q4' based on resource data d3 or d4. Further, it can be derived that topic q1 and topic q1' are associated, that topic q1 and topic q2' are associated, and so on, and the topic graph shown in FIG. 13 is finally established.
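How a random walk can surface the indirect link between q1 and d3 is shown in the sketch below. It is a simplification under stated assumptions: a fixed three-step walk (topic to resource to topic to resource) with uniform transition probabilities, whereas the text only says that the association is computed iteratively with the RandomWalk algorithm. The function names are hypothetical.

```python
from collections import defaultdict

# Direct (solid-line) associations from the example in the text.
TOPIC_TO_RESOURCES = {"q1": ["d1", "d2"], "q2": ["d1", "d2", "d3"], "q3": ["d4"]}


def invert(edges):
    """Build the reverse adjacency: resource -> topics."""
    inverted = defaultdict(list)
    for source, targets in edges.items():
        for target in targets:
            inverted[target].append(source)
    return dict(inverted)


def step(distribution, edges):
    """Advance one walk step, splitting each node's mass evenly over its neighbours."""
    nxt = defaultdict(float)
    for node, mass in distribution.items():
        neighbours = edges.get(node, [])
        for neighbour in neighbours:
            nxt[neighbour] += mass / len(neighbours)
    return dict(nxt)


def inferred_resources(topic, topic_to_resources):
    """Resources reached by a 3-step walk from topic that are not directly linked to it."""
    resource_to_topics = invert(topic_to_resources)
    distribution = {topic: 1.0}
    for edges in (topic_to_resources, resource_to_topics, topic_to_resources):
        distribution = step(distribution, edges)
    direct = set(topic_to_resources.get(topic, []))
    return {r: p for r, p in distribution.items() if r not in direct}
```

Starting from q1, the walk reaches d3 through the shared resources d1 and d2 and the topic q2, with probability 1/6, while d4 is never reached; this matches the dashed q1 to d3 link in the example.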

In addition, as shown in FIG. 47, the guidance and recommendation service subsystem 33000 may further include a parsing module 33600, a reminder creation module 33700, and a reminding module 33800.

Specifically, the parsing module 33600 may parse the interaction information and obtain key fields from it. The key fields may include one or more of time information, location information, and a reminder event. The reminder creation module 33700 may then create reminder information from the key fields, and when the time information reaches the preset time, the reminding module 33800 may send the reminder information to the user. For example, if the user's interaction information is "Remind me to write a work plan at 6 pm tomorrow", the time information "18:00 on August 8, 2015" and the reminder event "write a work plan" can be parsed out. When that time arrives, the user is reminded.
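The parse, create, and remind steps can be sketched in Python for one narrow phrasing. The regular expression, the dictionary layout, and the function names are assumptions; the pattern handles only inputs of the form "remind me to X at N am/pm tomorrow", and the current time is passed in explicitly (August 7, 2015 reproduces the date in the example).

```python
import re
from datetime import datetime, timedelta

# Matches only one illustrative phrasing; a real parser would cover far more.
REMINDER_PATTERN = re.compile(
    r"remind me to (?P<event>.+?) at (?P<hour>\d{1,2}) ?(?P<ampm>am|pm) tomorrow",
    re.IGNORECASE,
)


def parse_reminder(text, now):
    """Extract the reminder time and event, or return None if the text does not match."""
    match = REMINDER_PATTERN.search(text)
    if match is None:
        return None
    hour = int(match.group("hour")) % 12
    if match.group("ampm").lower() == "pm":
        hour += 12
    when = (now + timedelta(days=1)).replace(hour=hour, minute=0, second=0, microsecond=0)
    return {"time": when, "event": match.group("event")}


def is_due(reminder, now):
    """True once the current time has reached the reminder time."""
    return now >= reminder["time"]
```

For instance, calling `parse_reminder("Remind me to write a work plan at 6 pm tomorrow", datetime(2015, 8, 7, 10, 30))` yields a reminder for 18:00 on August 8, 2015 with the event "write a work plan".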

The human-computer interaction system based on artificial intelligence according to the embodiments of the present invention has the following advantages: (1) The human-computer interaction system is transformed from a tool into an anthropomorphic system; through chat, search and other services, the user obtains a relaxed and pleasant experience during intelligent interaction, rather than merely search and question answering. (2) Search in the form of keywords is improved into search based on natural language; the user can express needs in flexible and free natural language, and the multi-round interaction process is closer to interaction between people. (3) The system evolves from active search by the user to an around-the-clock companion service, and can provide recommendations and other services to the user anytime and anywhere based on the user's personalized model.

In the present invention, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, for example two, three and so on, unless specifically defined otherwise.

In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict each other, those skilled in the art may combine the different embodiments or examples described in this specification, as well as the features of different embodiments or examples.

Although embodiments of the present invention have been shown and described above, it is to be understood that the above embodiments are exemplary and are not to be construed as limiting the present invention. Those of ordinary skill in the art may make changes, modifications, substitutions and variations to the above embodiments within the scope of the present invention.

Claims (124)

  1. A human-computer interaction method based on artificial intelligence, characterized by comprising the following steps:
    receiving input information entered by a user through an application terminal;
    acquiring intent information of the user according to the input information of the user, and distributing the input information to at least one interaction service subsystem according to the intent information;
    receiving a return result returned by the at least one interaction service subsystem; and
    generating a user return result from the return result according to a preset decision strategy, and providing the user return result to the user.
  2. The method of claim 1, wherein providing the user return result to the user specifically comprises:
    converting the user return result into natural language and broadcasting it to the user.
  3. The method of claim 1 or 2, further comprising:
    receiving customized task information from the user; and
    distributing the input information to at least one interaction service subsystem according to the customized task information.
  4. The method according to any one of claims 1 to 3, wherein the application terminal comprises a PC terminal, a mobile terminal or an intelligent robot.
  5. The method according to any one of claims 1 to 4, wherein, if the user return result includes an execution instruction, the method further comprises:
    sending the execution instruction to a corresponding execution subsystem, and executing it by the execution subsystem.
  6. The method according to any one of claims 1 to 5, wherein generating the user return result from the return result according to the preset decision strategy specifically comprises:
    acquiring a demand analysis feature of the input information;
    acquiring a confidence feature of the return result returned by the interaction service subsystem, a context feature of the user's dialog interaction information, and a personalized model feature of the user;
    making a decision on the return result according to the demand analysis feature, the confidence feature of the return result, the context feature of the user's dialog interaction information and the personalized model feature of the user, so as to determine the user return result.
  7. The method according to claim 6, wherein the demand analysis feature, the confidence feature of the return result, the context feature of the user's dialog interaction information and the personalized model feature of the user each have a respective decision weight.
  8. The method of claim 7, further comprising:
    training, based on a reinforcement learning model and according to the user's log, the decision weights of the demand analysis feature, the confidence feature of the return result, the context feature of the user's dialog interaction information and the personalized model feature of the user.
  9. The method according to any one of claims 1 to 8, further comprising:
    acquiring preceding interaction information of the interaction with the user;
    completing the input information according to the preceding interaction information.
  10. The method according to any one of claims 1 to 9, wherein the interaction service subsystem comprises one or more of a demand satisfaction service subsystem, a guidance and recommendation service subsystem, and a chat service subsystem.
  11. The method of claim 10, further comprising:
    acquiring, by the demand satisfaction service subsystem, question information input by the user;
    acquiring, by the demand satisfaction service subsystem, user demand information of the user according to the question information;
    distributing, by the demand satisfaction service subsystem, the question information to at least one corresponding question-answering service module according to the user demand information; and
    receiving, by the demand satisfaction service subsystem, question-answering results returned by the at least one question-answering service module, and making a decision on the question-answering results to determine a final question-answering result.
  12. The method of claim 11, wherein the question-answering service module comprises an Aladdin service module, a vertical service module, a deep question-answering service module and an information search service module.
  13. The method of claim 12, further comprising:
    receiving, by the deep question-answering service module, the question information;
    acquiring, by the deep question-answering service module, a corresponding question type according to the question information;
    selecting, by the deep question-answering service module, a corresponding question-answering mode according to the question type, and generating a corresponding question-answering result according to the selected answer generation mode and the question information.
  14. The method according to claim 13, wherein, when the question type is an entity type, generating the corresponding question-answering result according to the selected answer generation mode and the question information specifically comprises:
    generating entity-type question information according to the question information;
    expanding the entity-type question information based on summaries crawled by a search engine and historical display logs to generate clusters of same-family entity question information, wherein the clusters of same-family entity question information respectively correspond to candidate answers;
    extracting candidate entities from the candidate answers respectively corresponding to the clusters of same-family entity question information;
    calculating a confidence of each candidate entity; and
    feeding back, as the question-answering result, the candidate entities whose confidence is greater than a preset confidence threshold.
  15. The method according to claim 13, wherein, when the question type is an opinion type, generating the corresponding question-answering result according to the selected answer generation mode and the question information specifically comprises:
    acquiring candidate answers corresponding to the question information;
    segmenting the candidate answers to generate a plurality of candidate answer short sentences;
    aggregating the plurality of candidate answer short sentences to generate opinion aggregation clusters;
    determining the opinion type of each opinion aggregation cluster;
    selecting an answer opinion from the opinion aggregation clusters according to the opinion type, and generating a summary corresponding to the answer opinion;
    scoring the answer opinion, and feeding back, as the question-answering result, the answer opinions whose score is greater than a preset score threshold.
  16. The method according to claim 15, wherein aggregating the plurality of candidate answer short sentences to generate opinion aggregation clusters specifically comprises:
    extracting keywords from the plurality of candidate answer short sentences;
    calculating the vector angle and/or semantic similarity between every two of the keywords;
    aggregating the candidate answers whose vector angle is smaller than a preset angle or whose semantic similarity is greater than a preset threshold to generate the opinion aggregation clusters.
  17. The method according to claim 13, wherein, when the question type is a fragment type, generating the corresponding question-answering result according to the selected answer generation mode and the question information specifically comprises:
    acquiring candidate answers corresponding to the question information;
    segmenting the candidate answers to generate a plurality of candidate answer short sentences;
    scoring the importance of the plurality of candidate answer short sentences to generate short-sentence importance features corresponding to the candidate answer short sentences;
    generating an answer summary according to the short-sentence importance features;
    scoring answer quality according to the short-sentence importance features of the answer summary, and sorting the candidate answers according to the scoring result;
    feeding back the sorting result as the question-answering result.
  18. The method according to claim 17, wherein scoring answer quality according to the short-sentence importance features of the answer summary specifically comprises:
    scoring answer quality according to the short-sentence importance features of the answer summary, the authority of the answer, the relevance to the question information, and the richness of the answer.
  19. The method of claim 17, wherein sorting the candidate answers according to the scoring result specifically comprises:
    acquiring behavior data of the user; and
    sorting the candidate answers according to the behavior data of the user and the scoring result.
  20. The method according to any one of claims 12 to 19, further comprising:
    receiving, by the information search service module, the question information;
    searching, by the information search service module, according to the question information to generate a plurality of candidate web pages;
    performing, by the information search service module, discourse analysis on the candidate web pages to generate corresponding summaries, and feeding back the summaries as the question-answering result.
  21. The method according to any one of claims 12 to 20, wherein performing discourse analysis on the candidate web pages to generate the corresponding summaries specifically comprises:
    performing discourse analysis on the candidate web pages to generate corresponding candidate passages;
    scoring and sorting the sentences in the candidate passages; and
    generating the summary according to the scoring and sorting result.
  22. The method of claim 21, wherein generating the summary according to the scoring and sorting result specifically comprises:
    acquiring demand scenario information of the user;
    generating the summary according to the demand scenario information and the scoring and sorting result.
  23. The method of claim 21, further comprising:
    aggregating the information of a plurality of candidate passages.
  24. The method of claim 10, further comprising:
    receiving, by the chat service subsystem, input information input by the user;
    distributing, by the chat service subsystem, the input information to chat service modules;
    receiving, by the chat service subsystem, candidate replies returned by the plurality of chat service modules, wherein each candidate reply has a corresponding confidence;
    sorting, by the chat service subsystem, the candidate replies based on the confidence, generating chat information according to the sorting result, and providing the chat information to the user.
  25. The method of claim 24, further comprising, after receiving the input information input by the user:
    performing error correction and/or rewriting on the input information.
  26. The method of claim 24, further comprising, after receiving the input information input by the user:
    performing domain analysis on the input information to obtain the domain corresponding to the input information, wherein the input information is distributed, according to its corresponding domain, to chat service modules having the same or a similar domain.
  27. The method of claim 24, further comprising, after receiving the input information input by the user:
    acquiring preceding information of the chat with the user;
    determining, according to the preceding information, whether the dependency between the input information and the preceding information is greater than a preset relationship threshold; and
    if it is greater than the preset relationship threshold, completing the input information according to the preceding information.
  28. The method of claim 27, further comprising:
    acquiring current topic information of the user according to the preceding information.
  29. The method of claim 24, wherein the chat service module comprises one or more of a search-based chat module, a rich-knowledge chat module, a portrait-based chat module and a crowdsourcing-based chat module.
  30. The method of claim 29, further comprising:
    performing, by the search-based chat module, word segmentation on the input information to generate a plurality of chat phrases;
    querying, by the search-based chat module, a chat corpus according to the plurality of chat phrases to generate a plurality of corpus preceding sentences and a plurality of corpus following sentences corresponding to the plurality of corpus preceding sentences;
    filtering, by the search-based chat module, the plurality of corpus preceding sentences;
    classifying, by the search-based chat module, the corpus following sentences corresponding to the corpus preceding sentences that remain after filtering; and
    re-sorting, by the search-based chat module, the classified corpus following sentences, and generating the candidate reply according to the sorting result.
  31. The method of claim 30, wherein filtering the plurality of corpus preceding sentences by the search-based chat module specifically comprises:
    calculating the similarity between the input information and each of the plurality of corpus preceding sentences;
    if the similarity is less than a first preset similarity threshold, filtering out the corresponding corpus preceding sentence; and
    if the similarity is greater than or equal to the first preset similarity threshold, retaining the corresponding corpus preceding sentence.
  32. The method according to claim 30, wherein classifying the corpus following sentences corresponding to the corpus preceding sentences that remain after filtering specifically comprises:
    calculating the similarity between the input information and each of the plurality of corpus following sentences; and
    classifying the plurality of corpus following sentences according to the similarity.
  33. The method of claim 32, wherein the similarity between the input information and the plurality of corpus following sentences comprises:
    the literal similarity between the input information and the corpus following sentence;
    or the similarity between the input information and the corpus following sentence obtained by training based on a deep neural network;
    or the similarity between the input information and the corpus following sentence obtained by training based on a machine translation model.
  34. The method according to claim 30, wherein re-sorting the classified corpus following sentences specifically comprises:
    acquiring a chat attribute of the user according to the preceding information of the user's chat;
    re-sorting the classified corpus following sentences according to the chat attribute.
  35. The method of claim 29, further comprising:
    generating, by the rich-knowledge chat module, a search term according to the input information, and searching according to the search term to generate a plurality of search results;
    performing, by the rich-knowledge chat module, sentence extraction on the plurality of search results to obtain a candidate sentence set;
    rewriting, by the rich-knowledge chat module, the sentences in the candidate sentence set to generate the candidate reply.
  36. The method of claim 35, wherein the similarity between the sentences in the candidate sentence set and the search term is greater than a second preset similarity threshold.
  37. The method of claim 35, further comprising:
    re-sorting the sentences in the candidate sentence set according to the chat attribute of the user.
  38. The method of claim 29, further comprising:
    acquiring, by the portrait-based chat module, the chat context of the user;
    determining, by the portrait-based chat module, whether a collection condition is satisfied according to the chat context;
    if it is determined that the collection condition is satisfied, sending a question to the user;
    receiving the user's answer information to the question, and updating a user portrait model according to the answer information.
  39. The method of claim 29, further comprising:
    acquiring, by the portrait-based chat module, the chat content of the user;
    extracting, by the portrait-based chat module, user portrait data from the chat content;
    updating, by the portrait-based chat module, the user portrait model according to the extracted user portrait data.
  40. The method of claim 29, further comprising:
    determining, by the crowdsourcing-based chat module, whether the input information is suitable for completion by crowdsourcing;
    if it is determined to be suitable for completion by crowdsourcing, distributing the input information to a corresponding performer;
    receiving reply information from the performer, and performing a quality judgment on the reply information;
    if the quality requirement is met, using the reply information as the candidate reply.
  41. The method of claim 24, further comprising:
    determining whether the input information is chat information without actual content;
    if it is determined to be chat information without actual content, acquiring the current topic;
    generating a guidance topic according to the current topic; and
    generating the candidate reply according to the guidance topic.
  42. The method according to claim 24, wherein sorting the candidate replies based on the confidence specifically comprises:
    acquiring features of the input information of the user; and
    sorting the candidate replies based on the features of the input information and the confidence.
  43. The method of claim 10, further comprising:
    receiving, by the guidance and recommendation service subsystem, interaction information input by the user, and determining a current topic according to the interaction information;
    obtaining, by the guidance and recommendation service subsystem, a plurality of candidate guidance topics related to the current topic based on a topic map, wherein the topic map includes a plurality of topics and the associations between the topics;
    acquiring, by the guidance and recommendation service subsystem, user portrait data of the user; and
    selecting, by the guidance and recommendation service subsystem, a guidance topic from the plurality of candidate guidance topics related to the current topic according to the user portrait data, and feeding back the guidance topic to the user.
  44. The method according to claim 43, further comprising, before the guidance and recommendation service subsystem obtains the plurality of candidate guidance topics related to the current topic based on the topic map:
    establishing the topic map.
  45. The method of claim 44, wherein establishing the topic map specifically comprises:
    acquiring topic association data; and
    establishing the topic map according to the topic association data.
  46. The method of claim 45, wherein acquiring the topic association data specifically comprises:
    acquiring network text data;
    when the network text data is unstructured data, acquiring the topic association data based on entity extraction and syntactic analysis; or
    when the network text data is semi-structured data, acquiring the topic association data based on page structure analysis, tag extraction and entity recognition; or
    when the network text data is structured data, acquiring the topic association data from a knowledge graph.
  47. The method of claim 45, wherein acquiring the topic association data comprises:
    acquiring search behavior data of the user, acquiring a corresponding search object according to the search behavior data, and generating the topic association data according to the search object; or
    acquiring browsing behavior data of the user, acquiring a corresponding browsing object according to the browsing behavior data, and generating the topic association data according to the browsing object.
  48. The method of claim 45, wherein establishing the topic map according to the topic association data specifically comprises:
    establishing the topic map according to the topic association data by one or more of a RandomWalk algorithm, an association analysis algorithm and a collaborative filtering algorithm.
  49. The method according to claim 43, wherein determining the current topic according to the interaction information specifically comprises:
    performing demand recognition and relevance calculation on the interaction information to determine the current topic.
  50. The method according to claim 43, wherein the guidance and recommendation service subsystem selecting a guidance topic from the plurality of candidate guidance topics related to the current topic according to the user portrait data and feeding back the guidance topic to the user specifically comprises:
    determining intent information of the user according to the user portrait data and context information of the interaction information;
    selecting a guidance topic from the plurality of candidate guidance topics related to the current topic according to the intent information of the user, and feeding back the guidance topic to the user.
  51. The method of claim 43, further comprising:
    parsing the interaction information and acquiring key fields in the interaction information, the key fields including one or more of time information, location information and a reminder event;
    establishing reminder information according to the key fields;
    when the time information reaches a preset time, sending the reminder information to the user.
  52. The method of claim 1, further comprising:
    if no user return result satisfying the user's need exists in network resources, recording the input information of the user;
    monitoring, at a preset period, whether a user return result satisfying the user's need exists in the network resources;
    when the user return result exists, providing the user return result to the user.
  53. 如权利要求12所述的方法,其特征在于,还包括:The method of claim 12, further comprising:
    所述垂类服务模块获取用户输入的查询词;The vertical service module obtains a query word input by a user;
    所述垂类服务模块确定所述查询词属于的垂类;The vertical class service module determines a vertical class to which the query term belongs;
    所述垂类服务模块在所述查询词属于的垂类中,与用户进行至少一轮的交互,得到用户需要的查询结果,其中,每轮交互时,展示给用户的信息包括:对应查询词的查询结果,以及,引导信息。The vertical service module performs at least one round of interaction with the user in the vertical class to which the query term belongs, and obtains a query result required by the user, wherein, in each round of interaction, the information displayed to the user includes: corresponding query word The results of the query, as well, the boot information.
  54. 根据权利要求53所述的方法,其特征在于,所述查询词是自然语言表示的,所述在所述查询词属于的垂类中,与用户进行至少一轮的交互,得到用户需要的查询结果,包括:The method according to claim 53, wherein the query word is expressed in a natural language, and the at least one round of interaction with the user is performed in the vertical class to which the query word belongs, to obtain a query required by the user. The results include:
    将所述查询词解析为所述查询词属于的垂类的垂类知识体系能够表示的结构化信息; Parsing the query term into structured information that can be represented by a vertical knowledge system of the vertical class to which the query term belongs;
    根据所述结构化信息、所述垂类知识体系,以及,所述查询词属于的垂类的垂类资源库,获取相关信息,所述相关信息包括:对应所述查询词的查询结果,以及,引导信息;Obtaining related information according to the structured information, the vertical knowledge system, and the vertical resource library of the vertical class to which the query term belongs, the related information includes: a query result corresponding to the query word, and , guidance information;
    向用户展示所述查询结果和所述引导信息;Presenting the query result and the guidance information to a user;
    在用户根据所述引导信息再次输入查询词后,重复上述根据查询词获取相关信息的流程,直至得到用户需要的查询结果。After the user inputs the query word again according to the guiding information, the process of obtaining the related information according to the query word is repeated until the query result required by the user is obtained.
  55. 根据权利要求54所述的方法,其特征在于,所述根据所述结构化信息、所述垂类知识体系,以及,所述查询词属于的垂类的垂类资源库,获取相关信息,包括:The method according to claim 54, wherein obtaining the related information according to the structured information, the vertical knowledge system, and the vertical resource library of the vertical class to which the query word belongs comprises:
    根据所述结构化信息和用户前一次的状态信息,更新用户的当前状态信息;Updating the current state information of the user according to the structured information and the previous state information of the user;
    根据所述垂类知识体系和所述垂类资源库,生成所述当前状态信息对应的候选动作;Generating a candidate action corresponding to the current state information according to the vertical knowledge system and the vertical resource library;
    根据预设模型在所述候选动作中选择与所述当前状态信息匹配程度较高的预设个数的候选动作,将选择的候选动作作为相关信息。Selecting, according to a preset model, a preset number of candidate actions that best match the current state information from among the candidate actions, and using the selected candidate actions as the related information.
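The three steps of claim 55 (state update, candidate generation, top-k selection) can be illustrated with a minimal sketch. Everything named here is hypothetical: `update_state`, `generate_candidates`, `select_actions`, the slot dictionaries, and the overlap-based `overlap_score` merely stand in for the "preset model", which the claims do not specify.

```python
from typing import Callable, Dict, List

State = Dict[str, str]  # assumed representation: slot name -> slot value


def update_state(previous: State, structured: State) -> State:
    """Merge the newly parsed slots into the previous state (new values win)."""
    current = dict(previous)
    current.update(structured)
    return current


def generate_candidates(state: State, resources: List[State]) -> List[State]:
    """Keep resource entries that do not contradict any slot in the state."""
    return [r for r in resources
            if all(r.get(k, v) == v for k, v in state.items())]


def select_actions(state: State, candidates: List[State],
                   score: Callable[[State, State], float], k: int) -> List[State]:
    """Return the k candidates that the (assumed) scoring model ranks highest."""
    ranked = sorted(candidates, key=lambda c: score(state, c), reverse=True)
    return ranked[:k]


def overlap_score(state: State, candidate: State) -> float:
    """Toy stand-in for the preset model: count of matching slots."""
    return float(sum(1 for k, v in state.items() if candidate.get(k) == v))


# Hypothetical vertical resource library for a restaurant vertical.
resources = [
    {"name": "A", "city": "Beijing", "cuisine": "Sichuan"},
    {"name": "B", "city": "Beijing", "cuisine": "Cantonese"},
    {"name": "C", "city": "Shanghai", "cuisine": "Sichuan"},
]
state = update_state({"city": "Beijing"}, {"cuisine": "Sichuan"})
candidates = generate_candidates(state, resources)
top = select_actions(state, candidates, overlap_score, k=1)
```

Updating the parameters of the scoring function from user feedback, as claim 56 describes, would change which candidates `select_actions` returns; that step is not shown.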
  56. 根据权利要求55所述的方法,其特征在于,还包括:The method of claim 55, further comprising:
    根据用户的反馈更新预设模型的参数,以便在参数不同时选择不同的候选动作。The parameters of the preset model are updated according to the feedback of the user, so that different candidate actions are selected when the parameters are different.
  57. 根据权利要求55所述的方法,其特征在于,还包括:The method of claim 55, further comprising:
    根据用户的偏好或者交互历史,获取用户的初始化状态信息。The initialization status information of the user is obtained according to the user's preference or interaction history.
  58. 根据权利要求55所述的方法,其特征在于,所述候选动作包括:满足用户需求的动作,或者,进一步澄清用户需求的动作,或者,为用户需求提供横向或纵向的引导信息,其中,用户需求根据查询词确定,所述满足用户需求的动作,或者,进一步澄清用户需求的动作在被选择后作为查询结果,为用户需求提供横向或纵向的引导信息在被选择后作为引导信息。The method according to claim 55, wherein the candidate actions include: an action that satisfies the user's requirement, or an action that further clarifies the user's requirement, or horizontal or vertical guidance information provided for the user's requirement, wherein the user's requirement is determined according to the query word; an action that satisfies or further clarifies the user's requirement serves as the query result once selected, and horizontal or vertical guidance information serves as the guidance information once selected.
  59. 根据权利要求54所述的方法,其特征在于,还包括:The method of claim 54 further comprising:
    获取所述查询词属于的垂类的结构化资源和非结构化资源,将所述结构化资源和所述非结构化资源组成所述垂类资源库,其中,所述结构化资源是从多个对应的垂类网站抓取整合数据后得到的全亮数据资源,所述非结构化资源根据用户查询词或互联网文本挖掘得到的结构化资源的补充或扩展信息。Obtaining structured resources and unstructured resources of the vertical class to which the query word belongs, and combining them into the vertical resource library, wherein the structured resources are full data resources obtained by crawling and integrating data from a plurality of corresponding vertical websites, and the unstructured resources are supplementary or extended information for the structured resources, mined from user query words or Internet text.
  60. 根据权利要求53-59任一项所述的方法,其特征在于,所述获取用户输入的查询词,包括:The method according to any one of claims 53-59, wherein the obtaining a query word input by a user comprises:
    获取用户以文本、语音或图像输入的查询词。Get the query words that the user enters in text, voice, or image.
  61. 根据权利要求53-60任一项所述的方法,其特征在于,所述确定所述查询词属于的垂类,包括:The method according to any one of claims 53-60, wherein the determining the vertical class to which the query term belongs comprises:
    基于机器学习方式,或者,基于模式解析方式,确定所述查询词属于的垂类。Determining the vertical class to which the query word belongs based on machine learning or based on pattern parsing.
  62. 一种基于人工智能的人机交互系统,其特征在于,包括: A human-computer interaction system based on artificial intelligence, characterized in that it comprises:
    第一接收子系统,用于接收用户通过应用终端输入的输入信息;a first receiving subsystem, configured to receive input information input by a user through the application terminal;
    分发子系统,用于根据所述用户的输入信息获取所述用户的意图信息,并根据所述意图信息将所述输入信息分发至至少一个交互服务子系统;a distribution subsystem, configured to acquire the intent information of the user according to the input information of the user, and distribute the input information to at least one interaction service subsystem according to the intent information;
    第二接收子系统,用于接收所述至少一个交互服务子系统返回的返回结果;a second receiving subsystem, configured to receive a return result returned by the at least one interactive service subsystem;
    生成子系统,用于按照预设的决策策略根据所述返回结果生成用户返回结果;以及a generating subsystem, configured to generate a user return result from the return result according to a preset decision strategy; and
    提供子系统,用于将所述用户返回结果提供至所述用户。a providing subsystem, configured to provide the user return result to the user.
  63. 如权利要求62所述的系统,其特征在于,所述提供子系统,具体用于:The system of claim 62 wherein said providing subsystem is specifically for:
    将所述用户返回结果转化为自然语言并播报给所述用户。Converting the user return result into a natural language and broadcasting to the user.
  64. 如权利要求62或63所述的系统,其特征在于,还包括:The system of claim 62 or 63, further comprising:
    所述第一接收子系统,还用于接收用户的定制任务信息;以及The first receiving subsystem is further configured to receive customized task information of the user;
    所述分发子系统,还用于根据所述定制任务信息将所述输入信息分发至至少一个交互服务子系统。The distribution subsystem is further configured to distribute the input information to the at least one interactive service subsystem according to the customized task information.
  65. 如权利要求62-64任一项所述的系统,其特征在于,所述应用终端包括PC端、移动终端或智能机器人。A system according to any one of claims 62 to 64, wherein the application terminal comprises a PC terminal, a mobile terminal or an intelligent robot.
  66. 如权利要求62-65任一项所述的系统,其特征在于,所述系统还包括:The system of any of claims 62-65, wherein the system further comprises:
    发送子系统,当所述用户返回结果包括执行指令时,将所述执行指令发送至对应的执行子系统;a sending subsystem, configured to send the execution instruction to a corresponding execution subsystem when the user return result includes an execution instruction;
    所述执行子系统执行所述执行指令。The execution subsystem executes the execution instructions.
  67. 如权利要求62-66任一项所述的系统,其特征在于,所述生成子系统,具体包括:The system of any one of claims 62-66, wherein the generating subsystem comprises:
    第一获取模块,用于获取所述输入信息的需求分析特征;a first obtaining module, configured to acquire a demand analysis feature of the input information;
    第二获取模块,用于获取所述交互服务子系统返回的返回结果的置信度特征、所述用户的对话交互信息的上下文特征以及所述用户的个性化模型特征;a second obtaining module, configured to acquire a confidence feature of the returned result returned by the interaction service subsystem, a context feature of the user's dialog interaction information, and a personalized model feature of the user;
    第一决策模块,用于根据所述需求分析特征、所述返回结果的置信度特征、所述用户的对话交互信息的上下文特征以及所述用户的个性化模型特征对所述返回结果进行决策以确定所述用户返回结果。a first decision module, configured to make a decision on the return result according to the demand analysis feature, the confidence feature of the return result, the context feature of the user's dialog interaction information, and the personalized model feature of the user, so as to determine the user return result.
  68. 如权利要求67所述的系统,其特征在于,所述需求分析特征、所述返回结果的置信度特征、所述用户的对话交互信息的上下文特征以及所述用户的个性化模型特征分别对应有各自的决策权重。The system according to claim 67, wherein said demand analysis feature, a confidence feature of said return result, a contextual feature of said user's dialog interaction information, and said user's personalized model feature respectively correspond to Their respective decision weights.
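A minimal sketch of the weighted decision described in claims 67 and 68, assuming each feature has already been reduced to a number in [0, 1] and that the decision is a plain weighted sum. The feature names, the weights, and the function `decide` are illustrative assumptions, not taken from the claims.

```python
from typing import Dict, List

# Assumed feature names for the four feature groups named in claim 67.
FEATURES = ("demand", "confidence", "context", "personalization")


def decide(results: List[Dict], weights: Dict[str, float]) -> Dict:
    """Pick the subsystem result with the highest weighted feature score."""
    def score(result: Dict) -> float:
        return sum(weights[f] * result["features"][f] for f in FEATURES)
    return max(results, key=score)


# Hypothetical decision weights; claim 69 describes learning them from user logs.
weights = {"demand": 0.4, "confidence": 0.3, "context": 0.2, "personalization": 0.1}
results = [
    {"source": "chat",
     "features": {"demand": 0.2, "confidence": 0.9, "context": 0.5, "personalization": 0.5}},
    {"source": "qa",
     "features": {"demand": 0.9, "confidence": 0.7, "context": 0.5, "personalization": 0.5}},
]
best = decide(results, weights)
```

Here the question-answering result wins (weighted score 0.72 against 0.50) because the demand analysis feature carries the largest weight in this made-up configuration.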
  69. 如权利要求68所述的系统,其特征在于,所述生成子系统还包括:The system of claim 68, wherein the generating subsystem further comprises:
    训练模块,用于根据所述用户的日志基于增强学习模型对所述需求分析特征、所述返回结果的置信度特征、所述用户的对话交互信息的上下文特征以及所述用户的个性化模型特征的决策权重进行训练。 a training module, configured to train, based on a reinforcement learning model and according to the user's logs, the decision weights of the demand analysis feature, the confidence feature of the return result, the context feature of the user's dialog interaction information, and the personalized model feature of the user.
  70. 如权利要求62-69任一项所述的系统,其特征在于,所述系统还包括:The system of any of claims 62-69, wherein the system further comprises:
    补全子系统,用于获取与所述用户交互的交互上文信息,并根据所述交互上文信息对所述输入信息进行补全。a completion subsystem, configured to acquire preceding-context information of the interaction with the user, and to complete the input information according to that preceding-context information.
  71. 如权利要求62-70任一项所述的系统,其特征在于,所述交互服务子系统包括需求满足服务子系统、引导和推荐服务子系统和聊天服务子系统中的一种或多种。The system of any of claims 62-70, wherein the interaction service subsystem comprises one or more of a demand fulfillment service subsystem, a guidance and recommendation service subsystem, and a chat service subsystem.
  72. 如权利要求71所述的系统,其特征在于,所述需求满足服务子系统,具体包括:The system of claim 71, wherein the demand fulfillment service subsystem specifically comprises:
    第三获取模块,用于获取用户输入的问题信息;a third obtaining module, configured to obtain question information input by the user;
    第四获取模块,用于根据所述问题信息获取用户的用户需求信息;a fourth obtaining module, configured to obtain user requirement information of the user according to the question information;
    第一分发模块,用于根据所述用户需求信息将所述问题信息分发至对应的至少一个问答服务模块;以及a first distribution module, configured to distribute the question information to at least one corresponding question-answering service module according to the user requirement information; and
    第二决策模块,用于接收所述至少一个问答服务模块返回的问答结果,并对所述问答结果进行决策以确定最终的问答结果。a second decision module, configured to receive the question-answering results returned by the at least one question-answering service module, and to make a decision on them to determine a final question-answering result.
  73. 如权利要求72所述的系统,其特征在于,所述问答服务模块包括阿拉丁服务模块、垂类服务模块、深度问答服务模块和信息搜索服务模块。The system according to claim 72, wherein said question answering service module comprises an Aladdin service module, a vertical service module, a deep question and answer service module, and an information search service module.
  74. 如权利要求73所述的系统,其特征在于,所述深度问答服务模块,具体包括:The system of claim 73, wherein the deep question and answer service module comprises:
    第一接收子模块,用于接收所述问题信息;a first receiving sub-module, configured to receive the question information;
    第一获取子模块,用于根据所述问题信息获取对应的问题类型;a first obtaining sub-module, configured to obtain a corresponding question type according to the question information;
    生成子模块,用于根据所述问题类型选择对应的问答模式,并根据选择的答案生成模式和所述问题信息生成对应的问答结果。a generating sub-module, configured to select a corresponding question-answering mode according to the question type, and to generate a corresponding question-answering result according to the selected answer generation mode and the question information.
  75. 如权利要求74所述的系统,其特征在于,当所述问题类型为实体类型时,所述生成子模块,具体包括:The system of claim 74, wherein when the question type is an entity type, the generating sub-module specifically comprises:
    第一生成单元,用于根据所述问题信息生成实体类问题信息;a first generating unit, configured to generate entity-type question information according to the question information;
    扩展单元,用于基于搜索引擎抓取的摘要和历史展现日志对所述实体类问题信息进行扩展以生成同族实体问题信息簇,其中,所述同族实体问题信息簇分别对应候选答案;an expansion unit, configured to expand the entity-type question information based on summaries crawled by a search engine and on historical presentation logs, so as to generate clusters of same-family entity question information, each cluster corresponding to candidate answers;
    抽取单元,用于从所述同族实体问题信息簇分别对应候选答案中抽取候选实体;an extracting unit, configured to extract candidate entities from the candidate answers corresponding to the clusters of same-family entity question information;
    第一计算单元,用于计算所述候选实体的置信度;以及a first calculating unit, configured to calculate a confidence of each candidate entity; and
    第一反馈单元,用于将所述置信度大于预设置信度阈值的候选实体作为问答结果进行反馈。a first feedback unit, configured to return, as the question-answering result, the candidate entities whose confidence is greater than a preset confidence threshold.
  76. 如权利要求74所述的系统,其特征在于,当所述问题类型为观点类型时,所述生成子模块,具体包括:The system of claim 74, wherein when the question type is an opinion type, the generating sub-module specifically comprises:
    第一获取单元,用于获取所述问题信息对应的候选答案;a first obtaining unit, configured to obtain candidate answers corresponding to the question information;
    第一切分单元,用于对所述候选答案进行切分以生成多个候选答案短句; a first segmentation unit, configured to segment the candidate answers to generate a plurality of candidate answer short sentences;
    第一聚合单元,用于对所述多个候选答案短句进行聚合以生成观点聚合簇;a first aggregating unit, configured to aggregate the plurality of candidate answer short sentences to generate opinion aggregation clusters;
    判断单元,用于判断所述观点聚合簇的观点类型;a determining unit, configured to determine the opinion type of the opinion aggregation clusters;
    选择单元,用于根据所述观点类型从所述观点聚合簇中选择出答案观点,并生成所述答案观点对应的摘要;a selecting unit, configured to select answer opinions from the opinion aggregation clusters according to the opinion type, and to generate a summary corresponding to each answer opinion;
    评分反馈单元,用于对所述答案观点进行评分,并将评分大于预设评分阈值的答案观点作为问答结果进行反馈。a scoring feedback unit, configured to score the answer opinions and to return, as the question-answering result, the answer opinions whose score is greater than a preset score threshold.
  77. 如权利要求76所述的系统,其特征在于,所述第一聚合单元,具体用于:The system of claim 76, wherein the first aggregating unit is specifically configured to:
    提取所述多个候选答案短句中的关键词,并计算每两个所述关键词之间的向量夹角和/或语义相似度,以及对所述向量夹角小于预设角度或语义相似度大于预设阈值的所述候选答案进行聚合以生成观点聚合簇。Extracting keywords from the plurality of candidate answer short sentences, calculating the vector angle and/or semantic similarity between every two of the keywords, and aggregating the candidate answers whose vector angle is smaller than a preset angle or whose semantic similarity is greater than a preset threshold, so as to generate the opinion aggregation clusters.
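The angle-based aggregation of claim 77 can be sketched as a greedy clustering over bag-of-words vectors. This is only an illustration under several assumptions: whitespace tokens stand in for extracted keywords, each cluster is represented by its first member, and the 60-degree default is arbitrary. A vector angle below the preset angle is equivalent to a cosine above the cosine of that angle, which is the test used below.

```python
import math
from collections import Counter
from typing import List, Tuple


def cosine(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def cluster(phrases: List[str], max_angle_deg: float = 60.0) -> List[List[str]]:
    """Greedily group phrases whose angle to a cluster's first member is small."""
    min_cos = math.cos(math.radians(max_angle_deg))
    clusters: List[Tuple[Counter, List[str]]] = []
    for phrase in phrases:
        vec = Counter(phrase.lower().split())  # assumed keyword extraction
        for representative, members in clusters:
            if cosine(vec, representative) > min_cos:
                members.append(phrase)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            clusters.append((vec, [phrase]))
    return [members for _, members in clusters]


groups = cluster([
    "battery life is great",
    "great battery life",
    "screen is too dim",
])
```

In this toy input the first two phrases share three of their tokens and fall into one cluster, while the third shares only "is" with the first and starts its own.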
  78. 如权利要求74所述的系统,其特征在于,当所述问题类型为片段类型时,所述生成子模块,具体包括:The system of claim 74, wherein when the question type is a segment type, the generating sub-module specifically comprises:
    第二获取单元,用于获取所述问题信息对应的候选答案;a second obtaining unit, configured to obtain candidate answers corresponding to the question information;
    第二切分单元,用于对所述候选答案进行切分以生成多个候选答案短句;a second segmentation unit, configured to segment the candidate answers to generate a plurality of candidate answer short sentences;
    打分单元,用于对所述多个候选答案短句进行重要度打分以生成所述候选答案短句对应的短句重要度特征;a scoring unit, configured to perform an importance score on the plurality of candidate answer short sentences to generate a short sentence importance feature corresponding to the candidate answer short sentence;
    第二生成单元,用于根据所述短句重要度特征生成答案摘要;a second generating unit, configured to generate an answer summary according to the short sentence importance feature;
    第一排序单元,用于根据所述答案摘要的短句重要度特征对答案质量进行打分,并根据打分结果对候选答案进行排序;a first sorting unit, configured to score the quality of the answer according to the short sentence importance feature of the answer summary, and sort the candidate answers according to the score result;
    第二反馈单元,用于将排序结果作为问答结果进行反馈。The second feedback unit is configured to feed back the sorting result as a question and answer result.
  79. 如权利要求78所述的系统,其特征在于,所述打分单元,具体用于:The system of claim 78, wherein the scoring unit is specifically configured to:
    根据所述答案摘要的短句重要度特征、答案权威性、问题信息的相关性和答案的丰富度对答案质量进行打分。The quality of the answer is scored according to the short sentence importance feature of the answer summary, the authority of the answer, the relevance of the question information, and the richness of the answer.
  80. 如权利要求78所述的系统,其特征在于,所述第一排序单元,具体用于:The system of claim 78, wherein the first sorting unit is specifically configured to:
    获取用户的行为数据,并根据所述用户的行为数据和所述打分结果对所述候选答案进行排序。Obtaining behavior data of the user, and sorting the candidate answers according to the behavior data of the user and the scoring result.
  81. 如权利要求73-80任一项所述的系统,其特征在于,所述信息搜索服务模块,具体包括:The system of any one of claims 73 to 80, wherein the information search service module comprises:
    第二接收子模块,用于接收所述问题信息;a second receiving sub-module, configured to receive the question information;
    第一搜索子模块,用于根据所述问题信息进行搜索以生成多个候选网页;a first search sub-module, configured to perform a search according to the question information to generate a plurality of candidate web pages;
    分析反馈子模块,用于对所述候选网页进行篇章分析以生成对应的摘要,并将摘要作为问答结果进行反馈。 an analysis feedback sub-module, configured to perform discourse analysis on the candidate web pages to generate corresponding summaries, and to return the summaries as the question-answering result.
  82. 如权利要求73-81任一项所述的系统,其特征在于,所述分析反馈子模块,具体包括:The system of any one of claims 73-81, wherein the analysis feedback sub-module specifically comprises:
    分析单元,用于对所述候选网页进行篇章分析以生成对应的候选篇章;an analyzing unit, configured to perform discourse analysis on the candidate web pages to generate corresponding candidate passages;
    第二排序单元,用于对所述候选篇章中的句子进行打分排序;以及a second sorting unit, configured to score and rank the sentences in the candidate passages; and
    第三生成单元,用于根据打分排序结果生成所述摘要。a third generating unit, configured to generate the summary according to the scoring and ranking result.
  83. 如权利要求82所述的系统,其特征在于,所述第三生成单元,具体用于:The system of claim 82, wherein the third generating unit is specifically configured to:
    获取用户的需求场景信息,并根据所述需求场景信息和所述打分排序结果生成所述摘要。Obtaining the user's demand scenario information, and generating the summary according to the demand scenario information and the score sorting result.
  84. 如权利要求82所述的系统,其特征在于,所述分析反馈子模块,还包括:The system of claim 82, wherein the analyzing the feedback sub-module further comprises:
    第二聚合单元,用于对多个候选篇章的信息进行聚合。a second aggregating unit, configured to aggregate information from a plurality of candidate passages.
  85. 如权利要求71所述的系统,其特征在于,所述聊天服务子系统,具体包括:The system of claim 71, wherein the chat service subsystem comprises:
    第一接收模块,用于接收用户输入的输入信息;a first receiving module, configured to receive input information input by a user;
    第二分发模块,用于将所述输入信息分发至聊天服务模块;a second distribution module, configured to distribute the input information to a chat service module;
    第二接收模块,用于接收所述多个聊天服务模块返回的候选回复,其中,所述候选回复具有对应的置信度;a second receiving module, configured to receive candidate replies returned by the plurality of chat service modules, wherein each candidate reply has a corresponding confidence;
    排序模块,用于基于所述置信度对所述待选回复进行排序,并根据排序结果生成聊天信息,并向所述用户提供所述聊天信息。a sorting module, configured to rank the candidate replies based on the confidence, generate chat information according to the ranking result, and provide the chat information to the user.
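The flow of claim 85 (distribute the input, collect candidate replies with confidences, rank, reply) might be sketched as follows. The two stub modules, their canned replies and confidences, and the choice to return only the top-ranked reply are assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Candidate = Tuple[str, float]                  # (reply text, confidence)
ChatModule = Callable[[str], List[Candidate]]  # assumed module interface


def chat(user_input: str, modules: List[ChatModule]) -> str:
    """Fan the input out to every chat module and return the best reply."""
    candidates: List[Candidate] = []
    for module in modules:
        candidates.extend(module(user_input))
    if not candidates:
        return ""
    # Rank by confidence, highest first, and use the top candidate.
    candidates.sort(key=lambda c: c[1], reverse=True)
    return candidates[0][0]


def search_based(text: str) -> List[Candidate]:
    """Stub for a search-based chat module."""
    return [("Nice weather today, isn't it?", 0.6)]


def knowledge_based(text: str) -> List[Candidate]:
    """Stub for a rich-knowledge chat module."""
    return [("It is 25 degrees and sunny in Beijing.", 0.8)]


reply = chat("How is the weather?", [search_based, knowledge_based])
```

Claim 103 additionally ranks on features of the input itself; that would replace the sort key with a function of both the input and the confidence.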
  86. 如权利要求85所述的系统,其特征在于,还包括:The system of claim 85, further comprising:
    纠错模块,用于在所述接收用户输入的输入信息之后,对所述输入信息进行纠错和/或改写。And an error correction module, configured to perform error correction and/or rewriting on the input information after receiving the input information input by the user.
  87. 如权利要求85所述的系统,其特征在于,还包括:The system of claim 85, further comprising:
    分析模块,用于在所述接收用户输入的输入信息之后,对所述输入信息进行领域分析以获取所述输入信息对应的领域;an analysis module, configured to perform domain analysis on the input information after the input information input by the user is received, so as to obtain the domain corresponding to the input information;
    所述第二分发模块,用于根据所述输入信息对应的领域将所述输入信息分发至具有相同或相近似领域的聊天服务模块。the second distribution module being configured to distribute the input information, according to the domain corresponding to the input information, to chat service modules having the same or a similar domain.
  88. 如权利要求85所述的系统,其特征在于,还包括:The system of claim 85, further comprising:
    第五获取模块,用于在所述接收用户输入的输入信息之后,获取与所述用户聊天的上文信息;a fifth obtaining module, configured to obtain, after the input information input by the user is received, preceding-context information of the chat with the user;
    第一判断模块,用于根据所述上文信息判断所述输入信息与所述上文信息的依赖关系是否大于预设关系阈值;以及a first determining module, configured to determine, according to the preceding-context information, whether the dependency between the input information and the preceding-context information is greater than a preset relationship threshold; and
    补全模块,用于当所述输入信息与所述上文信息的依赖关系大于所述预设关系阈值时,根据所述上文信息对所述输入信息进行补全。 a completion module, configured to complete the input information according to the preceding-context information when the dependency between the input information and the preceding-context information is greater than the preset relationship threshold.
  89. 如权利要求88所述的系统,其特征在于,还包括:The system of claim 88, further comprising:
    第六获取模块,用于根据所述上文信息获取所述用户当前的话题信息。a sixth obtaining module, configured to obtain the user's current topic information according to the preceding-context information.
  90. 如权利要求85所述的系统,其特征在于,所述聊天服务模块包括基于搜索的聊天模块、富知识聊天模块、基于画像的聊天模块和基于众包的聊天模块中的一种或多种。The system of claim 85, wherein the chat service module comprises one or more of a search-based chat module, a rich knowledge chat module, a portrait-based chat module, and a crowdsourcing-based chat module.
  91. 如权利要求90所述的系统,其特征在于,所述基于搜索的聊天模块,具体包括:The system of claim 90, wherein the search-based chat module comprises:
    切词子模块,用于对所述输入信息进行切词以生成多个聊天短句;a word segmentation sub-module, configured to segment the input information to generate a plurality of chat short sentences;
    查询子模块,用于根据所述多个聊天短句查询聊天语料库以生成多个聊天语料上句,以及所述多个聊天语料上句对应的多个聊天语料下句;a query sub-module, configured to query a chat corpus according to the plurality of chat short sentences to generate a plurality of chat corpus preceding sentences and a plurality of chat corpus following sentences corresponding to them;
    过滤子模块,用于对所述多个聊天语料上句进行过滤;a filtering sub-module, configured to filter the plurality of chat corpus preceding sentences;
    分类子模块,用于对过滤之后的聊天语料上句对应的聊天语料下句进行分类;以及a classification sub-module, configured to classify the chat corpus following sentences that correspond to the preceding sentences retained after filtering; and
    第一重排序子模块,用于对分类之后的所述聊天语料下句进行重排序,并根据排序结果生成所述候选回复。a first re-ranking sub-module, configured to re-rank the classified chat corpus following sentences and to generate the candidate replies according to the ranking result.
  92. 如权利要求91所述的系统,其特征在于,所述过滤子模块,具体包括:The system of claim 91, wherein the filtering sub-module comprises:
    第二计算单元,用于计算所述输入信息与所述多个聊天语料上句之间的相似度;a second calculating unit, configured to calculate the similarity between the input information and each of the plurality of chat corpus preceding sentences;
    过滤单元,用于如果所述相似度小于第一预设相似度阈值,则将对应的聊天语料上句过滤;以及a filtering unit, configured to filter out the corresponding chat corpus preceding sentence if the similarity is less than a first preset similarity threshold; and
    保留单元,用于如果所述相似度大于或等于所述第一预设相似度阈值,则将对应的聊天语料上句保留。a retaining unit, configured to retain the corresponding chat corpus preceding sentence if the similarity is greater than or equal to the first preset similarity threshold.
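The filtering step of claim 92 keeps a corpus pair only when its preceding sentence is similar enough to the user's input. The sketch below assumes token-level Jaccard overlap as the similarity measure and a corpus of (preceding sentence, following sentence) pairs; both are illustrative choices, since the claims leave the similarity measure open.

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # (chat corpus preceding sentence, following sentence)


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two sentences, in [0, 1]."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def filter_pairs(user_input: str, corpus: List[Pair], threshold: float) -> List[Pair]:
    """Retain pairs whose preceding sentence meets the similarity threshold."""
    return [(prev, nxt) for prev, nxt in corpus
            if jaccard(user_input, prev) >= threshold]


corpus = [
    ("how are you today", "I am fine, thanks."),
    ("what is the time", "It is noon."),
]
kept = filter_pairs("how are you", corpus, threshold=0.5)
```

With the input "how are you", the first pair scores 3/4 and is retained, while the second shares no tokens and is filtered out.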
  93. 如权利要求91所述的系统,其特征在于,所述分类子模块,具体包括:The system of claim 91, wherein the classification sub-module comprises:
    第三计算单元,用于计算所述输入信息与所述多个聊天语料下句之间的相似度;以及a third calculating unit, configured to calculate the similarity between the input information and each of the plurality of chat corpus following sentences; and
    分类单元,用于根据所述相似度对所述多个聊天语料下句进行分类。a classifying unit, configured to classify the plurality of chat corpus following sentences according to the similarity.
  94. 如权利要求93所述的系统,其特征在于,所述输入信息与所述多个聊天语料下句之间的相似度包括:The system of claim 93, wherein the similarity between the input information and the plurality of chat corpus following sentences comprises:
    所述输入信息与所述聊天语料下句之间字面的相似度;a literal similarity between the input information and the chat corpus following sentence;
    或者,所述输入信息与所述聊天语料下句基于深度神经网络训练得到的相似度;or a similarity between the input information and the chat corpus following sentence obtained through deep neural network training;
    或者,所述输入信息与所述聊天语料下句基于机器翻译模型训练得到的相似度。or a similarity between the input information and the chat corpus following sentence obtained through machine translation model training.
  95. 如权利要求91所述的系统,其特征在于,所述第一重排序子模块,具体包括:The system of claim 91, wherein the first reordering sub-module comprises:
    第三获取单元,用于根据所述用户聊天的上文信息获取所述用户的聊天属性;a third acquiring unit, configured to acquire, according to the foregoing information of the user chat, a chat attribute of the user;
    重排序单元,用于根据所述聊天属性对所述分类之后的所述聊天语料下句进行重排序。a re-ranking unit, configured to re-rank the classified chat corpus following sentences according to the chat attribute.
  96. 如权利要求90所述的系统,其特征在于,所述富知识聊天模块,具体包括:The system of claim 90, wherein the rich knowledge chat module comprises:
    第二搜索子模块,用于根据所述输入信息生成搜索词,并根据所述搜索词进行搜索以生成多个搜索结果; a second search submodule, configured to generate a search term according to the input information, and perform a search according to the search term to generate a plurality of search results;
    抽取子模块,用于对所述多个搜索结果进行句子抽取,以获取候选句子集合;An extraction submodule, configured to perform sentence extraction on the plurality of search results to obtain a candidate sentence set;
    改写子模块,用于对所述候选句子集合中的句子进行改写以生成所述候选回复。a rewriting sub-module, configured to rewrite the sentences in the candidate sentence set to generate the candidate replies.
  97. 如权利要求96所述的系统,其特征在于,所述候选句子集合中的句子与所述搜索词的相似度大于第二预设相似度阈值。The system of claim 96, wherein the similarity between the sentence in the set of candidate sentences and the search term is greater than a second predetermined similarity threshold.
  98. 如权利要求96所述的系统,其特征在于,还包括:The system of claim 96, further comprising:
    第二重排序子模块,用于根据所述用户的聊天属性对所述候选句子集合中的句子进行重排序。And a second reordering submodule, configured to reorder the sentences in the candidate sentence set according to the chat attribute of the user.
  99. 如权利要求90所述的系统,其特征在于,所述基于画像的聊天模块,具体包括:The system of claim 90, wherein the portrait-based chat module specifically comprises:
    第二获取子模块,用于获取所述用户的聊天语境;a second obtaining submodule, configured to acquire a chat context of the user;
    第一判断子模块,用于根据所述聊天语境判断是否满足收集条件;a first determining sub-module, configured to determine, according to the chat context, whether a collection condition is met;
    发送子模块,用于如果判断满足所述收集条件,则向所述用户发送问题;a sending submodule, configured to send a question to the user if it is determined that the collection condition is met;
    第一更新子模块,用于接收所述用户根据所述问题的回答信息,并根据所述回答信息对用户画像模型进行更新。The first update submodule is configured to receive the answer information of the user according to the question, and update the user portrait model according to the answer information.
  100. 如权利要求90所述的系统,其特征在于,所述基于画像的聊天模块,具体包括:The system of claim 90, wherein the portrait-based chat module specifically comprises:
    第三获取子模块,用于获取所述用户的聊天内容;a third obtaining submodule, configured to acquire chat content of the user;
    提取子模块,用于根据所述聊天内容提取用户画像数据;An extraction submodule, configured to extract user portrait data according to the chat content;
    第二更新子模块,用于根据提取的所述用户画像数据对所述用户画像模型进行更新。And a second update submodule, configured to update the user portrait model according to the extracted user portrait data.
  101. 如权利要求90所述的系统,其特征在于,所述基于众包的聊天模块,具体用于:The system of claim 90, wherein the crowdsourcing-based chat module is specifically configured to:
    判断所述输入信息是否适合众包完成,如果判断适合众包完成,则将所述输入信息分发至对应的执行者,以及接收所述执行者的回复信息,并对所述回复信息进行质量判断,如果满足质量要求,则将所述回复信息作为所述候选回复。Determining whether the input information is suitable for completion by crowdsourcing; if so, distributing the input information to a corresponding performer, receiving the performer's reply information, and judging the quality of the reply information; and if the quality requirement is met, using the reply information as the candidate reply.
  102. 如权利要求85所述的系统,其特征在于,所述聊天服务子系统,还包括:The system of claim 85, wherein the chat service subsystem further comprises:
    第二判断模块,用于判断所述输入信息是否属于无实际内容的聊天信息;a second determining module, configured to determine whether the input information belongs to chat information without actual content;
    第七获取模块,用于如果判断是属于无实际内容的聊天信息,则获取当前话题;a seventh obtaining module, configured to acquire a current topic if it is determined to belong to chat information without actual content;
    第一生成模块,用于根据所述当前话题生成引导话题;以及a first generating module, configured to generate a guiding topic according to the current topic;
    第二生成模块,用于根据所述引导话题生成所述候选回复。And a second generating module, configured to generate the candidate reply according to the guiding topic.
  103. 如权利要求85所述的系统,其特征在于,所述排序模块,具体用于:The system of claim 85, wherein the ranking module is specifically configured to:
    获取所述用户的所述输入信息的特征,并基于所述输入信息的特征和所述置信度对所述待选回复进行排序。Obtaining features of the user's input information, and ranking the candidate replies based on those features and the confidence.
  104. 如权利要求71所述的系统,其特征在于,所述引导和推荐服务子系统,具体包括:The system of claim 71, wherein the guiding and recommending service subsystem comprises:
    确定模块,用于接收用户输入的交互信息,并根据所述交互信息确定当前话题; a determining module, configured to receive interaction information input by the user, and determine a current topic according to the interaction information;
    获得模块,用于基于话题图谱获得多个与所述当前话题相关的待选引导话题,其中,所述话题图谱包括多个话题及所述话题之间的关联关系;Obtaining a module, configured to obtain, according to a topic map, a plurality of candidate guidance topics related to the current topic, where the topic map includes a plurality of topics and an association relationship between the topics;
    第八获取模块,用于获取所述用户的用户画像数据;以及An eighth obtaining module, configured to acquire user image data of the user;
    反馈模块,用于根据所述用户画像数据从所述多个与所述当前话题相关的待选引导话题中选择引导话题,并向所述用户反馈所述引导话题。And a feedback module, configured to select a guiding topic from the plurality of candidate guiding topics related to the current topic according to the user portrait data, and feed back the guiding topic to the user.
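Claim 104 combines a topic map with user portrait data to choose a guidance topic. One way to picture it, assuming the topic map is a weighted adjacency dictionary and the portrait is reduced to a set of interests (both assumptions, as are the topic names, the edge weights, and the fixed bonus of 1.0):

```python
from typing import Dict, List, Set

TopicMap = Dict[str, Dict[str, float]]  # topic -> {related topic: association strength}


def guide(current: str, topic_map: TopicMap, interests: Set[str], k: int = 1) -> List[str]:
    """Rank the neighbours of the current topic, boosting the user's interests."""
    neighbours = topic_map.get(current, {})

    def score(topic: str) -> float:
        bonus = 1.0 if topic in interests else 0.0  # assumed portrait boost
        return neighbours[topic] + bonus

    return sorted(neighbours, key=score, reverse=True)[:k]


# Hypothetical topic map fragment.
topic_map = {
    "football": {"World Cup": 0.9, "basketball": 0.6, "sports shoes": 0.4},
}
for_fan = guide("football", topic_map, interests={"basketball"})
for_anyone = guide("football", topic_map, interests=set())
```

A user whose portrait lists basketball is steered there even though "World Cup" has the stronger association; without portrait data the strongest association wins.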
  105. 如权利要求104所述的系统,其特征在于,还包括:The system of claim 104, further comprising:
    建立模块,用于在所述引导和推荐服务子系统基于话题图谱获得多个与所述当前话题相关的待选引导话题之前,建立所述话题图谱。an establishing module, configured to establish the topic map before the guidance and recommendation service subsystem obtains, based on the topic map, the plurality of candidate guidance topics related to the current topic.
  106. 如权利要求105所述的系统,其特征在于,所述建立模块,具体包括:The system of claim 105, wherein the establishing module comprises:
    第四获取子模块,用于获取话题关联数据;以及a fourth obtaining submodule for acquiring topic related data;
    建立子模块,用于根据所述话题关联数据建立所述话题图谱。Establishing a sub-module for establishing the topic map according to the topic association data.
  107. 如权利要求106所述的系统,其特征在于,所述第四获取子模块,具体用于:The system of claim 106, wherein the fourth obtaining sub-module is specifically configured to:
    获取网络文本数据;Obtain network text data;
    当所述网络文本数据为非结构化数据时,基于实体提取和句法分析获取所述话题关联数据;或者When the network text data is unstructured data, acquiring the topic association data based on entity extraction and syntax analysis; or
    当所述网络文本数据为半结构化数据时,基于页面结构分析、标签提取、实体识别获取所述话题关联数据;或者When the network text data is semi-structured data, acquiring the topic-related data based on page structure analysis, label extraction, and entity identification; or
    当所述网络文本数据为结构化数据时,从知识图谱中获取所述话题关联数据。When the network text data is structured data, obtaining the topic association data from a knowledge graph.
  108. 如权利要求106所述的系统,其特征在于,所述第四获取子模块,具体用于:The system of claim 106, wherein the fourth obtaining sub-module is specifically configured to:
    获取所述用户的搜索行为数据,并根据所述搜索行为数据获取对应的搜索对象,以及根据所述搜索对象生成所述话题关联数据;或者Obtaining search behavior data of the user, and acquiring a corresponding search object according to the search behavior data, and generating the topic association data according to the search object; or
    获取所述用户的浏览行为数据,并根据所述浏览行为数据获取对应的浏览对象,根据所述浏览对象生成所述话题关联数据。Acquiring the browsing behavior data of the user, and acquiring a corresponding browsing object according to the browsing behavior data, and generating the topic association data according to the browsing object.
  109. 如权利要求106所述的系统,其特征在于,所述建立子模块,具体用于:The system of claim 106, wherein the establishing submodule is specifically configured to:
    通过RandomWalk算法、关联分析算法、协同过滤算法中的一种或多种,根据所述话题关联数据建立所述话题图谱。The topic map is established according to the topic association data by one or more of a RandomWalk algorithm, an association analysis algorithm, and a collaborative filtering algorithm.
  110. 如权利要求104所述的系统,其特征在于,所述确定模块,具体用于:The system of claim 104, wherein the determining module is specifically configured to:
    对所述交互信息进行需求识别以及相关性计算以确定所述当前话题。A requirement identification and a correlation calculation are performed on the interaction information to determine the current topic.
  111. The system of claim 104, wherein the feedback module is specifically configured to:
    determine intent information of the user according to the user portrait data and context information of the interaction information, select a guiding topic from the plurality of candidate guiding topics related to the current topic according to the intent information of the user, and feed back the guiding topic to the user.
  112. The system of claim 104, further comprising:
    a parsing module, configured to parse the interaction information and obtain key fields in the interaction information, the key fields including one or more of time information, location information, and a reminder event;
    a reminder establishing module, configured to establish reminder information according to the key fields;
    a reminding module, configured to send the reminder information to the user when the time information reaches a preset time.
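The parse, establish, and remind flow of claim 112 can be sketched as follows. This is a hedged illustration only: the English phrase pattern, the `Reminder` class, and the functions `parse_reminder` and `due_reminders` are hypothetical, and a real system would presumably rely on a natural language parser rather than a single regular expression.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Reminder:
    when: datetime                # time information
    event: str                    # reminder event
    place: Optional[str] = None   # location information (optional)


# Hypothetical pattern: "remind me at HH:MM to <event> [at <place>]".
PATTERN = re.compile(
    r"remind me at (?P<hour>\d{1,2}):(?P<minute>\d{2}) to (?P<event>.+?)"
    r"(?: at (?P<place>.+))?$",
    re.IGNORECASE,
)


def parse_reminder(text, now):
    """Extract the key fields and build a Reminder, or return None if none are found."""
    m = PATTERN.search(text.strip())
    if m is None:
        return None
    hour, minute = int(m["hour"]), int(m["minute"])
    if hour > 23 or minute > 59:
        return None
    when = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if when <= now:
        # The time has already passed today, so schedule it for tomorrow.
        when += timedelta(days=1)
    return Reminder(when=when, event=m["event"], place=m["place"])


def due_reminders(reminders, now):
    """Return the reminders whose time has been reached."""
    return [r for r in reminders if r.when <= now]
```

A scheduler would call `due_reminders` periodically and send each returned reminder to the user.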
  113. The system of claim 62, further comprising:
    a recording subsystem, configured to record the input information of the user if no user return result satisfying the user requirement exists in the network resources;
    a monitoring subsystem, configured to monitor, at a preset period, whether a user return result satisfying the user requirement exists in the network resources;
    a providing subsystem, further configured to provide the user return result to the user when the user return result exists.
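A minimal sketch of the record, monitor, and provide behavior in claim 113 follows. It assumes two hypothetical callables that are not part of the source: `search(query)`, which returns a result or `None` when the network resources hold nothing suitable, and `notify(user_id, query, result)`, which delivers a result to the user. The class name `PendingQueryMonitor` is likewise an assumption.

```python
class PendingQueryMonitor:
    """Keep unanswered queries and re-check them on every polling cycle."""

    def __init__(self, search, notify):
        self._search = search    # hypothetical: query -> result or None
        self._notify = notify    # hypothetical: (user_id, query, result) -> None
        self._pending = []

    def record(self, user_id, query):
        """Record the user's input when no satisfying result exists yet."""
        self._pending.append((user_id, query))

    def poll_once(self):
        """Re-run every pending query; deliver and drop those now answered."""
        still_pending = []
        for user_id, query in self._pending:
            result = self._search(query)
            if result is None:
                still_pending.append((user_id, query))
            else:
                self._notify(user_id, query, result)
        self._pending = still_pending
        return len(still_pending)
```

In a deployment, `poll_once` would be invoked by a timer or job scheduler at the preset period; the scheduling mechanism is deliberately left out of the sketch.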
  114. The system of claim 73, wherein the vertical service module specifically comprises:
    a fifth obtaining sub-module, configured to obtain a query term input by the user;
    a determining sub-module, configured to determine the vertical category to which the query term belongs;
    an interaction sub-module, configured to perform at least one round of interaction with the user within the vertical category to which the query term belongs, to obtain the query result required by the user, wherein in each round of interaction the information presented to the user includes a query result corresponding to the query term and guidance information.
  115. The system according to claim 114, wherein the query term is expressed in natural language, and the interaction sub-module specifically comprises:
    a parsing unit, configured to parse the query term into structured information that can be represented by the vertical knowledge system of the vertical category to which the query term belongs;
    a fourth obtaining unit, configured to obtain related information according to the structured information, the vertical knowledge system, and a vertical resource library of the vertical category to which the query term belongs, the related information including a query result corresponding to the query term and guidance information;
    a display unit, configured to present the query result and the guidance information to the user;
    a fifth obtaining unit, configured to repeat, after the user inputs a query term again according to the guidance information, the above process of obtaining related information according to the query term, until the query result required by the user is obtained.
  116. The system according to claim 115, wherein the fourth obtaining unit specifically comprises:
    an updating subunit, configured to update current state information of the user according to the structured information and previous state information of the user;
    a generating subunit, configured to generate candidate actions corresponding to the current state information according to the vertical knowledge system and the vertical resource library;
    a matching subunit, configured to select, from the candidate actions and according to a preset model, a preset number of candidate actions that best match the current state information, and to use the selected candidate actions as the related information.
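The update, generate, and match steps of claim 116 can be illustrated with a small slot-based sketch. Everything here is assumed for illustration: the state is a plain dictionary of slots, the resource library is a list of dictionaries, and `toy_score` stands in for the "preset model", whose actual form the claims do not specify.

```python
def update_state(previous_state, structured_info):
    """Merge newly parsed slots into the previous state; None means 'not mentioned'."""
    state = dict(previous_state)
    state.update({k: v for k, v in structured_info.items() if v is not None})
    return state


def generate_candidates(state, resources, required_slots):
    """Propose clarifying questions for missing slots and items that fit the state."""
    candidates = []
    for slot in required_slots:
        if slot not in state:
            candidates.append({"type": "clarify", "slot": slot})
    for item in resources:
        if all(item.get(k) == v for k, v in state.items()):
            candidates.append({"type": "satisfy", "item": item})
    return candidates


def toy_score(state, candidate):
    """Hypothetical stand-in for the preset model: prefer answers over questions."""
    return {"satisfy": 1.0, "clarify": 0.5}.get(candidate["type"], 0.0)


def select_top_k(state, candidates, score, k):
    """Keep the k candidate actions that the model scores highest for this state."""
    ranked = sorted(candidates, key=lambda c: score(state, c), reverse=True)
    return ranked[:k]
```

Because the scoring function is passed in as a parameter, swapping in a learned model, or one whose parameters are adjusted from user feedback, would not change the surrounding flow.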
  117. The system according to claim 116, further comprising:
    a parameter updating unit, configured to update parameters of the preset model according to feedback from the user, so that different candidate actions are selected when the parameters differ.
  118. The system according to claim 116, further comprising:
    an obtaining subunit, configured to obtain initial state information of the user according to the user's preferences or interaction history.
  119. The system according to claim 116, wherein the candidate actions include: an action that satisfies the user requirement, or an action that further clarifies the user requirement, or horizontal or vertical guidance information provided for the user requirement, wherein the user requirement is determined according to the query term; the action that satisfies the user requirement or the action that further clarifies the user requirement serves as the query result after being selected, and the horizontal or vertical guidance information provided for the user requirement serves as the guidance information after being selected.
  120. The system according to claim 115, further comprising:
    a composing unit, configured to obtain structured resources and unstructured resources of the vertical category to which the query term belongs, and to compose the structured resources and the unstructured resources into the vertical resource library, wherein the structured resources are full data resources obtained by crawling and integrating data from a plurality of corresponding vertical websites, and the unstructured resources are supplementary or extended information for the structured resources, obtained by mining user query terms or Internet text.
  121. The system according to any one of claims 114-120, wherein the fifth obtaining sub-module is specifically configured to:
    obtain a query term input by the user as text, voice, or an image.
  122. The system according to any one of claims 114-121, wherein the determining sub-module is specifically configured to:
    determine the vertical category to which the query term belongs based on machine learning or based on pattern parsing.
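To make the pattern-parsing option of claim 122 concrete, here is a minimal sketch. The two vertical categories, their keyword patterns, and the function `classify_vertical` are invented for illustration; the source does not list the verticals or the patterns actually used, and a machine-learned classifier could fill the same role.

```python
import re

# Hypothetical verticals and keyword patterns, for illustration only.
VERTICAL_PATTERNS = {
    "travel": re.compile(r"\b(flight|hotel|ticket|trip)s?\b", re.IGNORECASE),
    "dining": re.compile(r"\b(restaurant|dinner|lunch|menu)s?\b", re.IGNORECASE),
}


def classify_vertical(query):
    """Return the vertical whose pattern matches most often, or None if none match."""
    hits = {name: len(p.findall(query)) for name, p in VERTICAL_PATTERNS.items()}
    # On a tie, max() keeps the vertical listed first in VERTICAL_PATTERNS.
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else None
```

Returning `None` for unmatched queries leaves room for a fallback, such as handing the query to a general search or chat subsystem.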
  123. A device, comprising:
    one or more processors;
    a memory;
    one or more programs stored in the memory which, when executed by the one or more processors, perform the artificial intelligence based man-machine interaction method according to any one of claims 1-61.
  124. A non-volatile computer storage medium, wherein the computer storage medium stores one or more programs which, when executed by a device, cause the device to perform the artificial intelligence based man-machine interaction method according to any one of claims 1-61.
PCT/CN2015/096599 2015-09-07 2015-12-07 Man-machine interaction method and system based on artificial intelligence WO2017041372A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201510563338.2A CN105068661B (en) 2015-09-07 2015-09-07 Man-machine interaction method based on artificial intelligence and system
CN201510563338.2 2015-09-07

Publications (1)

Publication Number Publication Date
WO2017041372A1 true WO2017041372A1 (en) 2017-03-16

Family

ID=54498047

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/096599 WO2017041372A1 (en) 2015-09-07 2015-12-07 Man-machine interaction method and system based on artificial intelligence

Country Status (2)

Country Link
CN (1) CN105068661B (en)
WO (1) WO2017041372A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018213485A1 (en) * 2017-05-17 2018-11-22 Google Llc Determining agents for performing actions based at least in part on image data
WO2019193479A1 (en) * 2018-04-05 2019-10-10 Venkata Krishna Pratyusha Challa Cognitive robotic system for test data management activities and method employed thereof

Families Citing this family (157)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105068661B (en) * 2015-09-07 2018-09-07 百度在线网络技术(北京)有限公司 Man-machine interaction method based on artificial intelligence and system
CN105183848A (en) * 2015-09-07 2015-12-23 百度在线网络技术(北京)有限公司 Human-computer chatting method and device based on artificial intelligence
CN105513593B (en) * 2015-11-24 2019-09-17 南京师范大学 A kind of intelligent human-machine interaction method of voice driven
CN106803092B (en) * 2015-11-26 2020-07-10 阿里巴巴集团控股有限公司 Method and device for determining standard problem data
CN105512228B (en) * 2015-11-30 2018-12-25 北京光年无限科技有限公司 A kind of two-way question and answer data processing method and system based on intelligent robot
CN105511608B (en) * 2015-11-30 2018-12-25 北京光年无限科技有限公司 Exchange method and device, intelligent robot based on intelligent robot
CN106844368B (en) * 2015-12-03 2020-06-16 华为技术有限公司 Method for man-machine conversation, neural network system and user equipment
CN105491126A (en) * 2015-12-07 2016-04-13 百度在线网络技术(北京)有限公司 Service providing method and service providing device based on artificial intelligence
CN105446491B (en) * 2015-12-16 2018-09-18 北京光年无限科技有限公司 A kind of exchange method and device based on intelligent robot
CN105589848A (en) * 2015-12-28 2016-05-18 百度在线网络技术(北京)有限公司 Dialog management method and device
CN105677636A (en) * 2015-12-30 2016-06-15 上海智臻智能网络科技股份有限公司 Information processing method and device for intelligent question-answering system
CN105608221B (en) * 2016-01-11 2018-08-21 北京光年无限科技有限公司 A kind of self-learning method and device towards question answering system
CN105701208B (en) * 2016-01-13 2018-11-30 北京光年无限科技有限公司 A kind of question and answer evaluation method and device towards question answering system
CN107015983A (en) * 2016-01-27 2017-08-04 阿里巴巴集团控股有限公司 A kind of method and apparatus for being used in intelligent answer provide knowledge information
CN105677896B (en) * 2016-02-03 2019-08-02 北京光年无限科技有限公司 Exchange method and interactive system based on Active Learning
CN105740948B (en) * 2016-02-04 2019-05-21 北京光年无限科技有限公司 A kind of exchange method and device towards intelligent robot
CN105786977B (en) * 2016-02-05 2020-03-03 北京百度网讯科技有限公司 Mobile search method and device based on artificial intelligence
CN105653738B (en) * 2016-03-01 2020-05-22 北京百度网讯科技有限公司 Search result broadcasting method and device based on artificial intelligence
CN105808695A (en) * 2016-03-03 2016-07-27 陈包容 Method and device for obtaining chat reply contents
CN107180037B (en) * 2016-03-09 2020-09-01 北京京东尚科信息技术有限公司 Man-machine interaction method and device
CN105701254B (en) * 2016-03-09 2020-11-13 北京搜狗科技发展有限公司 Information processing method and device for information processing
CN105824757B (en) * 2016-03-18 2018-05-01 北京光年无限科技有限公司 Test method and system based on robot operating system
CN105787560B (en) * 2016-03-18 2018-04-03 北京光年无限科技有限公司 Dialogue data interaction processing method and device based on Recognition with Recurrent Neural Network
CN105824935A (en) * 2016-03-18 2016-08-03 北京光年无限科技有限公司 Method and system for information processing for question and answer robot
US9984772B2 (en) * 2016-04-07 2018-05-29 Siemens Healthcare Gmbh Image analytics question answering
CN105930367B (en) * 2016-04-12 2020-06-09 华南师范大学 Intelligent chat robot control method and control device
CN105930372B (en) * 2016-04-12 2019-05-31 华南师范大学 Based on emotional robot dialogue method, system and the robot repeatedly fed back
CN107301188B (en) * 2016-04-15 2020-11-10 北京搜狗科技发展有限公司 Method for acquiring user interest and electronic equipment
CN105931638B (en) * 2016-04-26 2019-12-24 北京光年无限科技有限公司 Intelligent robot-oriented dialogue system data processing method and device
CN105913039B (en) * 2016-04-26 2020-08-18 北京光年无限科技有限公司 Interactive processing method and device for dialogue data based on vision and voice
CN105975511A (en) * 2016-04-27 2016-09-28 乐视控股(北京)有限公司 Intelligent dialogue method and apparatus
CN105798918B (en) * 2016-04-29 2018-08-21 北京光年无限科技有限公司 A kind of exchange method and device towards intelligent robot
US9864431B2 (en) * 2016-05-11 2018-01-09 Microsoft Technology Licensing, Llc Changing an application state using neurological data
CN106021403B (en) * 2016-05-12 2019-06-04 北京奔影网络科技有限公司 Client service method and device
CN106055641B (en) * 2016-05-31 2020-01-14 北京光年无限科技有限公司 Intelligent robot-oriented man-machine interaction method and device
CN106095833B (en) * 2016-06-01 2019-04-16 竹间智能科技(上海)有限公司 Human-computer dialogue content processing method
CN106020488A (en) * 2016-06-03 2016-10-12 北京光年无限科技有限公司 Man-machine interaction method and device for conversation system
CN106202165B (en) * 2016-06-24 2020-03-17 北京小米移动软件有限公司 Intelligent learning method and device for man-machine interaction
CN106202186B (en) * 2016-06-27 2020-10-30 百度在线网络技术(北京)有限公司 Service recommendation method and device based on artificial intelligence
CN106202270B (en) * 2016-06-28 2020-03-20 广州幽联信息技术有限公司 Man-machine conversation method and device based on natural language
CN106462647A (en) * 2016-06-28 2017-02-22 深圳狗尾草智能科技有限公司 Multi-intention-based multi-skill-package questioning and answering method, system and robot
WO2018000208A1 (en) * 2016-06-28 2018-01-04 深圳狗尾草智能科技有限公司 Method and system for searching for and positioning skill packet, and robot
WO2018000281A1 (en) * 2016-06-29 2018-01-04 深圳狗尾草智能科技有限公司 User portrait representation learning system and method based on deep neural network
CN106663131A (en) * 2016-06-29 2017-05-10 深圳狗尾草智能科技有限公司 Personalized response generating method and personalized response generating system based on user portrait
CN106489148A (en) * 2016-06-29 2017-03-08 深圳狗尾草智能科技有限公司 A kind of intention scene recognition method that is drawn a portrait based on user and system
WO2018006366A1 (en) * 2016-07-07 2018-01-11 深圳狗尾草智能科技有限公司 Interactive information-based scoring method and system
CN106168971A (en) * 2016-07-08 2016-11-30 北京麒麟合盛网络技术有限公司 information subscribing method and device
CN106126503B (en) * 2016-07-12 2020-02-11 海信集团有限公司 Service field positioning method and terminal
WO2018010635A1 (en) * 2016-07-14 2018-01-18 腾讯科技(深圳)有限公司 Method of generating random interactive data, network server, and smart conversation system
CN107623621B (en) * 2016-07-14 2020-08-07 腾讯科技(深圳)有限公司 Chat corpus collection method and device
CN107040450B (en) * 2016-07-20 2018-06-01 平安科技(深圳)有限公司 Automatic reply method and device
CN106250366B (en) * 2016-07-21 2019-04-19 北京光年无限科技有限公司 A kind of data processing method and system for question answering system
CN106182007B (en) * 2016-08-09 2018-07-27 北京光年无限科技有限公司 A kind of interim card processing method and processing device for intelligent robot
CN107784354A (en) * 2016-08-17 2018-03-09 华为技术有限公司 The control method and company robot of robot
CN106446045B (en) * 2016-08-31 2020-01-21 上海交通大学 User portrait construction method and system based on dialogue interaction
CN106415527B (en) * 2016-08-31 2019-07-30 北京小米移动软件有限公司 Information communication method and device
CN106328166B (en) * 2016-08-31 2019-11-08 上海交通大学 Human-computer dialogue abnormality detection system and method
CN106469212B (en) * 2016-09-05 2019-10-15 北京百度网讯科技有限公司 Man-machine interaction method and device based on artificial intelligence
US20180082184A1 (en) * 2016-09-19 2018-03-22 TCL Research America Inc. Context-aware chatbot system and method
WO2018057536A1 (en) * 2016-09-20 2018-03-29 Google Llc Bot requesting permission for accessing data
CN106503046B (en) * 2016-09-21 2020-01-14 北京光年无限科技有限公司 Interaction method and system based on intelligent robot
CN106485634A (en) * 2016-09-27 2017-03-08 北京百度网讯科技有限公司 Opinion poll method and device based on artificial intelligence
CN106445147B (en) * 2016-09-28 2019-05-10 北京百度网讯科技有限公司 The behavior management method and device of conversational system based on artificial intelligence
CN106446213B (en) * 2016-09-30 2020-04-14 北京百度网讯科技有限公司 Service ordering method and device based on artificial intelligence
CN106547736B (en) * 2016-10-31 2020-01-10 百度在线网络技术(北京)有限公司 Text information term importance degree generation method and device based on artificial intelligence
CN106547884A (en) * 2016-11-03 2017-03-29 深圳量旌科技有限公司 A kind of behavior pattern learning system of augmentor
CN106570002A (en) * 2016-11-07 2017-04-19 网易(杭州)网络有限公司 Natural language processing method and device
CN106716933B (en) * 2016-11-10 2020-11-06 深圳达闼科技控股有限公司 Message processing method and device and electronic equipment
CN106557165B (en) * 2016-11-14 2019-06-21 北京儒博科技有限公司 The action simulation exchange method and device and smart machine of smart machine
CN106557576B (en) * 2016-11-24 2020-02-04 百度在线网络技术(北京)有限公司 Prompt message recommendation method and device based on artificial intelligence
CN106779817A (en) * 2016-11-29 2017-05-31 竹间智能科技(上海)有限公司 Intension recognizing method and system based on various dimensions information
CN106559321A (en) * 2016-12-01 2017-04-05 竹间智能科技(上海)有限公司 The method and system of dynamic adjustment dialog strategy
CN108132952A (en) * 2016-12-01 2018-06-08 百度在线网络技术(北京)有限公司 A kind of active searching method and device based on speech recognition
US20200090057A1 (en) * 2016-12-07 2020-03-19 Cloudminds (Shenzhen) Robotics Systems Co., Ltd. Human-computer hybrid decision method and apparatus
CN106777018B (en) * 2016-12-08 2020-05-22 竹间智能科技(上海)有限公司 Method and device for optimizing input sentences in intelligent chat robot
CN106708983A (en) * 2016-12-09 2017-05-24 竹间智能科技(上海)有限公司 Dialogue interactive information-based user portrait construction system and method
CN106847271A (en) * 2016-12-12 2017-06-13 北京光年无限科技有限公司 A kind of data processing method and device for talking with interactive system
CN106599179B (en) * 2016-12-13 2020-02-14 竹间智能科技(上海)有限公司 Man-machine conversation control method and device integrating knowledge graph and memory graph
CN106777081A (en) * 2016-12-13 2017-05-31 竹间智能科技(上海)有限公司 Method and device for determining conversational system acknowledgment strategy
US20180173999A1 (en) * 2016-12-21 2018-06-21 XBrain, Inc. Natural Transfer of Knowledge Between Human and Artificial Intelligence
CN106649739B (en) * 2016-12-23 2020-09-11 广东惠禾科技发展有限公司 Multi-round interactive information inheritance identification method and device and interactive system
CN106649752A (en) * 2016-12-26 2017-05-10 北京云知声信息技术有限公司 Answer acquisition method and device
CN106844603A (en) * 2017-01-16 2017-06-13 竹间智能科技(上海)有限公司 The computational methods and device, application process and device of entity hot topic degree
CN106777364A (en) * 2017-01-22 2017-05-31 竹间智能科技(上海)有限公司 Artificial intelligence response method and device that topic drives
CN106919660A (en) * 2017-02-09 2017-07-04 厦门快商通科技股份有限公司 The clothes customer service intelligent Service method and system of knowledge based graphical spectrum technology
CN106997375B (en) * 2017-02-28 2020-08-18 浙江大学 Customer service reply recommendation method based on deep learning
WO2018157349A1 (en) * 2017-03-02 2018-09-07 深圳前海达闼云端智能科技有限公司 Method for interacting with robot, and interactive robot
CN106909677B (en) * 2017-03-02 2020-09-08 腾讯科技(深圳)有限公司 Method and device for generating question
CN106951470B (en) * 2017-03-03 2019-01-18 中兴耀维科技江苏有限公司 A kind of intelligent Answer System based on the retrieval of professional knowledge figure
CN107169586A (en) * 2017-03-29 2017-09-15 北京百度网讯科技有限公司 Resource optimization method, device and storage medium based on artificial intelligence
CN107066567A (en) * 2017-04-05 2017-08-18 竹间智能科技(上海)有限公司 The user's portrait modeling method and system detected in word dialog based on topic
CN106934068A (en) * 2017-04-10 2017-07-07 江苏东方金钰智能机器人有限公司 The method that robot is based on the semantic understanding of environmental context
CN108766421B (en) * 2017-04-20 2020-09-15 杭州萤石网络有限公司 Voice interaction method and device
CN108733722B (en) * 2017-04-24 2020-07-31 北京京东尚科信息技术有限公司 Automatic generation method and device for conversation robot
WO2018195783A1 (en) * 2017-04-25 2018-11-01 Microsoft Technology Licensing, Llc Input method editor
CN107153641B (en) * 2017-05-08 2021-01-12 北京百度网讯科技有限公司 Comment information determination method, comment information determination device, server and storage medium
CN107122491B (en) * 2017-05-19 2020-12-15 深圳市优必选科技有限公司 Method for data interaction
CN107273477A (en) * 2017-06-09 2017-10-20 北京光年无限科技有限公司 A kind of man-machine interaction method and device for robot
CN107291867A (en) * 2017-06-13 2017-10-24 北京百度网讯科技有限公司 Dialog process method, device, equipment and computer-readable recording medium based on artificial intelligence
CN107239978A (en) * 2017-06-23 2017-10-10 北京好豆网络科技有限公司 The analysis method and device of cuisines content
CN107918634A (en) * 2017-06-27 2018-04-17 上海壹账通金融科技有限公司 Intelligent answer method, apparatus and computer-readable recording medium
CN107507612B (en) * 2017-06-30 2020-08-28 百度在线网络技术(北京)有限公司 Voiceprint recognition method and device
CN107391645B (en) * 2017-07-12 2018-04-10 广州市昊链信息科技股份有限公司 A kind of logistics information automatic push and practical operation specification form system and method
CN107451265A (en) * 2017-07-31 2017-12-08 广州网嘉玩具科技开发有限公司 A kind of story platform based on Internet of Things and artificial intelligence technology
CN107977395B (en) * 2017-08-01 2020-10-23 北京物灵智能科技有限公司 Method for helping user read and understand electronic article and intelligent voice assistant
CN107577728B (en) * 2017-08-22 2020-06-26 北京奇艺世纪科技有限公司 User request processing method and device
CN107316644A (en) * 2017-08-22 2017-11-03 北京百度网讯科技有限公司 Method and device for information exchange
CN107577736A (en) * 2017-08-25 2018-01-12 上海斐讯数据通信技术有限公司 A kind of file recommendation method and system based on BP neural network
CN107590216A (en) * 2017-08-31 2018-01-16 北京百度网讯科技有限公司 Answer preparation method, device and computer equipment
WO2019051847A1 (en) * 2017-09-18 2019-03-21 Microsoft Technology Licensing, Llc Providing diet assistance in a session
CN107590274A (en) * 2017-09-27 2018-01-16 合肥博力生产力促进中心有限公司 A kind of intelligent Answer System for patent service
CN107741976A (en) * 2017-10-16 2018-02-27 泰康保险集团股份有限公司 Intelligent response method, apparatus, medium and electronic equipment
CN107679039B (en) * 2017-10-17 2020-12-29 北京百度网讯科技有限公司 Method and device for determining statement intention
CN107704612A (en) * 2017-10-23 2018-02-16 北京光年无限科技有限公司 Dialogue exchange method and system for intelligent robot
CN107679231A (en) * 2017-10-24 2018-02-09 济南浪潮高新科技投资发展有限公司 A kind of vertical field and the implementation method of Opening field mixed type intelligent Answer System
US10546584B2 (en) * 2017-10-29 2020-01-28 International Business Machines Corporation Creating modular conversations using implicit routing
CN107870994A (en) * 2017-10-31 2018-04-03 北京光年无限科技有限公司 Man-machine interaction method and system for intelligent robot
CN107885815A (en) * 2017-11-06 2018-04-06 北京奇艺世纪科技有限公司 Content recommendation method, device and electronic equipment
CN107832439B (en) * 2017-11-16 2019-03-08 百度在线网络技术(北京)有限公司 Method, system and the terminal device of more wheel state trackings
CN108052499A (en) * 2017-11-20 2018-05-18 北京百度网讯科技有限公司 Text error correction method, device and computer-readable medium based on artificial intelligence
CN108170704A (en) * 2017-11-21 2018-06-15 北京明略软件系统有限公司 A kind of method and device of atlas analysis
CN108000526A (en) * 2017-11-21 2018-05-08 北京光年无限科技有限公司 Dialogue exchange method and system for intelligent robot
CN107967243A (en) * 2017-11-22 2018-04-27 语联网(武汉)信息技术有限公司 A kind of processing method for supporting that user independently makes pauses in reading unpunctuated ancient writings
CN107798140B (en) * 2017-11-23 2020-07-03 中科鼎富(北京)科技发展有限公司 Dialog system construction method, semantic controlled response method and device
CN108108340A (en) * 2017-11-28 2018-06-01 北京光年无限科技有限公司 For the dialogue exchange method and system of intelligent robot
CN107958059B (en) * 2017-12-01 2020-07-10 北京百度网讯科技有限公司 Intelligent question answering method, device, terminal and computer readable storage medium
CN108021556A (en) * 2017-12-20 2018-05-11 北京百度网讯科技有限公司 For obtaining the method and device of information
CN108170749A (en) * 2017-12-21 2018-06-15 北京百度网讯科技有限公司 Dialogue method, device and computer-readable medium based on artificial intelligence
CN110019837A (en) * 2017-12-22 2019-07-16 百度在线网络技术(北京)有限公司 The generation method and device, computer equipment and readable medium of user's portrait
CN108197191B (en) * 2017-12-27 2018-11-23 神思电子技术股份有限公司 A kind of scene intention interrupt method of more wheel dialogues
CN110110050A (en) * 2018-01-22 2019-08-09 北京大学 A kind of generation method of media event production question and answer data set
CN108694223A (en) * 2018-03-26 2018-10-23 北京奇艺世纪科技有限公司 The construction method and device in a kind of user's portrait library
CN109948017A (en) * 2018-04-26 2019-06-28 华为技术有限公司 A kind of information processing method and device
CN108763329A (en) * 2018-05-08 2018-11-06 中国电子产品可靠性与环境试验研究所((工业和信息化部电子第五研究所)(中国赛宝实验室)) Evaluating method, device and the computer equipment of voice interactive system IQ level
CN108763495B (en) * 2018-05-30 2019-09-20 苏州思必驰信息科技有限公司 Interactive method, system, electronic equipment and storage medium
CN109102809A (en) * 2018-06-22 2018-12-28 北京光年无限科技有限公司 A kind of dialogue method and system for intelligent robot
CN108984655B (en) * 2018-06-28 2021-01-01 厦门快商通信息技术有限公司 Intelligent customer service guiding method for customer service robot
CN109146610A (en) * 2018-07-16 2019-01-04 众安在线财产保险股份有限公司 It is a kind of intelligently to insure recommended method, device and intelligence insurance robot device
CN110891074A (en) * 2018-08-06 2020-03-17 珠海格力电器股份有限公司 Information pushing method and device
CN109684443B (en) * 2018-11-01 2020-11-24 百度在线网络技术(北京)有限公司 Intelligent interaction method and device
CN109635108B (en) * 2018-11-22 2020-02-18 华东师范大学 Man-machine interaction based remote supervision entity relationship extraction method
WO2020107184A1 (en) * 2018-11-26 2020-06-04 华为技术有限公司 Model selection method and terminal
CN109783805B (en) * 2018-12-17 2020-04-24 北京邮电大学 Network community user identification method and device and readable storage medium
CN109683727A (en) * 2018-12-26 2019-04-26 联想(北京)有限公司 A kind of data processing method and device
CN109726279A (en) * 2018-12-30 2019-05-07 联想(北京)有限公司 A kind of data processing method and device
CN109783733B (en) * 2019-01-15 2020-11-06 腾讯科技(深圳)有限公司 User image generation device and method, information processing device, and storage medium
CN109857848A (en) * 2019-01-18 2019-06-07 深圳壹账通智能科技有限公司 Interaction content generation method, device, computer equipment and storage medium
CN109933647A (en) * 2019-02-12 2019-06-25 北京百度网讯科技有限公司 Determine method, apparatus, electronic equipment and the computer storage medium of description information
CN109782925A (en) * 2019-02-20 2019-05-21 联想(北京)有限公司 A kind of processing method, device and electronic equipment
CN109960811A (en) * 2019-03-29 2019-07-02 联想(北京)有限公司 A kind of data processing method, device and electronic equipment
CN110110133B (en) * 2019-04-18 2020-08-11 贝壳找房(北京)科技有限公司 Intelligent voice data generation method and device
CN110287305A (en) * 2019-07-03 2019-09-27 浪潮云信息技术有限公司 A kind of intelligent answer management system based on natural language processing
WO2021012772A1 (en) * 2019-07-22 2021-01-28 中兴通讯股份有限公司 Speech information processing method and device, storage medium, and electronic device
CN111124121A (en) * 2019-12-24 2020-05-08 腾讯科技(深圳)有限公司 Voice interaction information processing method and device, storage medium and computer equipment
CN111368191A (en) * 2020-02-29 2020-07-03 重庆百事得大牛机器人有限公司 User portrait system based on legal consultation interaction process
CN111062220B (en) * 2020-03-13 2020-06-16 成都晓多科技有限公司 End-to-end intention recognition system and method based on memory forgetting device
CN111538816B (en) * 2020-07-09 2020-10-20 平安国际智慧城市科技股份有限公司 Question-answering method, device, electronic equipment and medium based on AI identification

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101086750A (en) * 2006-06-09 2007-12-12 虞玲华 A liverish expert system based on instant message
CN101741759A (en) * 2008-11-24 2010-06-16 中国电信股份有限公司 Instant communication-based intelligent interactive system and interactive method
CN102246164A (en) * 2008-12-11 2011-11-16 有限公司呢哦派豆 Information search method and information provision method based on user's intention
CN102792320A (en) * 2010-01-18 2012-11-21 苹果公司 Intelligent automated assistant
CN103390047A (en) * 2013-07-18 2013-11-13 天格科技(杭州)有限公司 Chatting robot knowledge base and construction method thereof
US20130332172A1 (en) * 2012-06-08 2013-12-12 Apple Inc. Transmitting data from an automated assistant to an accessory
CN104050302A (en) * 2014-07-10 2014-09-17 华东师范大学 Topic detecting system based on atlas model
CN105068661A (en) * 2015-09-07 2015-11-18 百度在线网络技术(北京)有限公司 Man-machine interaction method and system based on artificial intelligence
CN105094315A (en) * 2015-06-25 2015-11-25 百度在线网络技术(北京)有限公司 Method and apparatus for smart man-machine chat based on artificial intelligence

Also Published As

Publication number Publication date
CN105068661A (en) 2015-11-18
CN105068661B (en) 2018-09-07


Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 15903475; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 15903475; Country of ref document: EP; Kind code of ref document: A1)