WO2018040501A1 - Human-computer interaction method and apparatus based on artificial intelligence - Google Patents

Human-computer interaction method and apparatus based on artificial intelligence

Info

Publication number
WO2018040501A1
WO2018040501A1 PCT/CN2017/072267 CN2017072267W WO2018040501A1 WO 2018040501 A1 WO2018040501 A1 WO 2018040501A1 CN 2017072267 W CN2017072267 W CN 2017072267W WO 2018040501 A1 WO2018040501 A1 WO 2018040501A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
keywords
human
answer
mapping relationship
Prior art date
Application number
PCT/CN2017/072267
Other languages
English (en)
French (fr)
Inventor
田浩
赵世奇
忻舟
温泉
马文涛
许腾
许心诺
张海松
周湘阳
严睿
Original Assignee
北京百度网讯科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京百度网讯科技有限公司
Priority to US16/317,526 (US11645547B2, en)
Priority to JP2019501993A (JP6726800B2, ja)
Priority to EP17844812.2A (EP3508991A4, en)
Priority to KR1020197004771A (KR102170563B1, ko)
Publication of WO2018040501A1 (zh)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9032 Query formulation
    • G06F16/90332 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Definitions

  • The present application relates to the field of artificial intelligence technologies, and in particular to a human-computer interaction method and apparatus based on artificial intelligence.
  • Artificial intelligence (AI) is a new technical science that studies and develops theories, methods, techniques, and applications for simulating, extending, and expanding human intelligence. It is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can respond in a way similar to human intelligence. Research in this field includes robots, speech recognition, image recognition, natural language processing, and expert systems.
  • The present application aims to solve, at least to some extent, one of the technical problems in the related art.
  • An object of the present application is to propose a human-computer interaction method based on artificial intelligence that enables a machine to interact with humans in a human dialogue style, so that human-computer interaction has the effect of real dialogue between humans.
  • Another object of the present application is to provide a human-computer interaction device based on artificial intelligence.
  • The artificial intelligence-based human-computer interaction method includes: receiving a question input by a user; processing the question according to a pre-generated model to obtain an answer with a human dialogue style corresponding to the question, the model being generated from a human conversation corpus; and feeding the answer back to the user.
  • The artificial intelligence-based human-computer interaction method proposed in the first aspect of the present application obtains an answer corresponding to a question input by a user through a pre-generated model. Because the model is generated from a human conversation corpus, the answer has a human dialogue style. Therefore, the machine can interact with humans in a human dialogue style, so that human-computer interaction has the effect of real dialogue between humans.
  • The artificial intelligence-based human-computer interaction device provided in the second aspect of the present application includes: a receiving module, configured to receive a question input by a user; an obtaining module, configured to process the question according to a pre-generated model to obtain an answer with a human dialogue style corresponding to the question, the model being generated from a human conversation corpus; and a feedback module, configured to feed the answer back to the user.
  • The artificial intelligence-based human-computer interaction device proposed in the second aspect of the present application obtains an answer corresponding to a question input by the user through a pre-generated model. Because the model is generated from a human conversation corpus, the answer has a human dialogue style. Therefore, the machine can interact with humans in a human dialogue style, so that human-computer interaction has the effect of real dialogue between humans.
  • An embodiment of the present application provides a device, including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to perform the method described in any one of the embodiments of the first aspect of the present application.
  • An embodiment of the present application provides a non-transitory computer-readable storage medium; when the instructions in the storage medium are executed by a processor, the processor is enabled to perform the method described in any one of the embodiments of the first aspect of the present application.
  • An embodiment of the present application provides a computer program product; when the instructions in the computer program product are executed by a processor, the processor is enabled to perform the method described in any one of the embodiments of the first aspect of the present application.
  • FIG. 1 is a schematic flowchart of a human-computer interaction method based on artificial intelligence according to an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of generating a model in a training process in an embodiment of the present application
  • FIG. 3 is a schematic diagram of classification of a corpus source in an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a prediction model in an embodiment of the present application.
  • FIG. 5 is a schematic diagram of another prediction model in the embodiment of the present application.
  • FIG. 6 is a schematic diagram of another prediction model in the embodiment of the present application.
  • FIG. 7 is a general architecture diagram corresponding to an embodiment of the present application.
  • FIG. 8 is a schematic flowchart of a human-computer interaction method based on artificial intelligence according to another embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of a human-machine interaction apparatus based on artificial intelligence according to an embodiment of the present application.
  • FIG. 10 is a schematic diagram of a human-computer interaction device based on artificial intelligence according to another embodiment of the present application.
  • FIG. 1 is a schematic flowchart of a human-computer interaction method based on artificial intelligence according to an embodiment of the present application.
  • the embodiment includes:
  • The user can input the question in the form of text, voice, or a picture.
  • A question in non-text form can first be converted into text.
  • The technologies adopted for this include, for example, speech recognition and picture content recognition; these conversion technologies can be implemented by existing or future technologies and are not described in detail here.
  • S12: Process the question according to a pre-generated model and obtain an answer with a human dialogue style corresponding to the question, the model being generated from a human conversation corpus.
  • The above model can be generated in the training phase.
  • A large number of human dialogue corpora are collected.
  • The human dialogue corpus is organized in pairs, each pair including a question (query) and an answer. During training, the query in the corpus is used as input and the model is trained so that its output is as consistent as possible with the corresponding answer in the corpus. Since the model is generated from the human conversation corpus, when the current question is processed according to the model, the resulting output is also an answer in a human conversation style.
  • In a specific implementation, the above model is not limited to a single model; there may be multiple models, each performing a different function, which together obtain an answer with a human dialogue style from the question input by the user.
  • After the answer is obtained, it can be played to the user in voice form.
  • If the answer obtained is in text form, it can be converted into speech by techniques such as speech synthesis.
  • An answer corresponding to the question input by the user is obtained through a pre-generated model. Because the model is generated from a human conversation corpus, the answer has a human dialogue style, so the machine can interact with humans in a human dialogue style and human-computer interaction has the effect of real dialogue between humans.
  • The previous embodiment describes the dialogue process, in which the model is used.
  • The model can be generated during the training process. The process of generating the model during training is described below.
  • FIG. 2 is a schematic flowchart of generating a model in a training process in an embodiment of the present application.
  • In this embodiment, a model that includes a mapping relationship, a prediction model, and a grammar model is taken as an example. The mapping relationship indicates the mapping between keywords in the question and keywords in the answer. The prediction model is used to determine an optimal mapping relationship among multiple mapping relationships according to the context information and to generate collocation words matching the keywords in the determined mapping relationship. The grammar model is used to adjust the order of the words and to generate, from the adjusted words, a sentence that conforms to the grammatical structure.
  • this embodiment includes:
  • The corpus can be chosen from all dialogues involving people, including dialogues in videos (movies, TV series, animations, etc.), dialogues in literary works (historical classics, mystery novels, romance novels, online novels, etc.), dialogues on social platforms (Weibo, Tieba, Douban, etc.), and dialogues in regional dialects (Northeastern dialect, Beijing dialect, Cantonese, etc.).
  • A classification of corpus sources is given so that corpora of different styles can be collected from multiple corpus sources; the same corpus source can have one or more dialogue styles.
  • The dialogue in videos varies greatly with the type of video: dialogue in comedies is generally humorous, dialogue in romance films is generally deeply affectionate, and dialogue in war films is generally intense. The dialogue style in literary works also varies with the type: dialogue in historical classics generally reflects a particular historical background, dialogue in mystery novels is logical, and dialogue in romance novels is rich in emotion. Dialogue on online social platforms contains a lot of internet vocabulary, but because it consists of people's everyday conversations, it is the closest to how people talk in daily life. Dialogues in regional dialects contain various local expressions with local characteristics.
  • S22 Extract keywords in the question in the human conversation corpus and keywords in the corresponding answer, and generate a mapping relationship between the keyword in the question and the keyword in the answer according to the extracted keyword.
  • The mapping relationship can be one-to-many. For example, another set of corpora is as follows:
  • Corpora can come from different corpus sources, and different corpus sources can have different styles, so mapping relationships of different styles can be formed.
  • A group of conversations in a TV series includes:
  • Mapping relationships with a humorous style can be established, such as the mapping between "thinking" and "thinking" and the mapping between "exam" and "how to cheat".
  • A set of conversations in a romance novel includes:
  • Mapping relationships with an affectionate, sentimental style can be established, such as the mapping between "I" and "Dear" and "Family", the mapping between "thinking" and "thinking", and the mapping between "exam" and "you".
  • A set of conversations includes:
  • Mapping relationships with an everyday-life style can be established, such as the mapping between "thinking" and "thinking" and the mapping between "exam" and "what".
  • A group of conversations in the Northeastern dialect includes:
  • Mapping relationships with a Northeastern style can be established, such as the mapping between "thinking" and "thinking" and the mapping between "exam" and "paw".
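  • The one-to-many, style-dependent mapping described above can be sketched as a nested dictionary. This is only an illustration under stated assumptions: the function name `build_mapping`, the style labels, and the positional alignment of question keywords with answer keywords are hypothetical and are not specified in the application.

```python
from collections import defaultdict

def build_mapping(pairs):
    """Build a one-to-many map: question keyword -> {style: set of answer keywords}.

    Assumption: question and answer keywords are aligned by position,
    which the application does not specify.
    """
    mapping = defaultdict(lambda: defaultdict(set))
    for style, question_keywords, answer_keywords in pairs:
        for q_kw, a_kw in zip(question_keywords, answer_keywords):
            mapping[q_kw][style].add(a_kw)
    return mapping

# Hypothetical keyword-aligned pairs taken from the styles named in the text.
corpus = [
    ("humorous", ["thinking", "exam"], ["thinking", "how to cheat"]),
    ("everyday", ["thinking", "exam"], ["thinking", "what"]),
    ("northeastern", ["thinking", "exam"], ["thinking", "paw"]),
]
mapping = build_mapping(corpus)

# One question keyword maps to several candidate answer keywords.
candidates = sorted(kw for kws in mapping["exam"].values() for kw in kws)
print(candidates)  # ['how to cheat', 'paw', 'what']
```

  • Keeping the style as a key lets a later stage restrict the candidates to one style without rebuilding the map.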
  • S23: Obtain, from the human conversation corpus, the keywords in the question, the keywords in the answer, and the context information, and generate a prediction model according to the obtained keywords and context information.
  • For example, the keywords in the question include "busy" and "off work".
  • When the context information in one corpus is "late time", "want to go home", etc., the words "hard" and "go home" often appear in the corresponding answers.
  • In this case, the prediction model includes the correspondence between the question, the context information, and the answer shown in FIG. 4.
  • Likewise, the keywords in the question include "busy" and "off work".
  • When the context information in another corpus is "how many nights work", "leader reminder", etc., the words "rest", "work", and "complete" often appear in the corresponding answers, and the prediction model includes the correspondence between the question, the context information, and the answer shown in FIG. 5.
  • FIGS. 4 and 5 illustrate logical relationships, but the relationships in the prediction model are not limited to logic and can also involve style. For example, as shown in FIG. 6, the same question keywords "think" and "exam" can correspond to different answers in different styles.
  • The prediction model is used not only to represent the correspondence between the question, the context information, and the answer, but also to learn collocations, so that the keywords in the answer can be supplemented with words that fit the sentence.
  • For example, the answer keywords obtained from the question and the context information include "I think, think, cheat"; then, from the humorous dialogue corpus, the common collocation in that style, "I think, how to cheat", can be learned. In romance novels, the keywords "people, think" can likewise be extracted, and the corresponding emotionally rich collocation "Dear, miss you" can be learned. In this way, collocations that fit the situation are learned.
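  • The context-dependent choice between candidate keyword sets can be sketched as follows. The application does not say how the prediction model scores candidates; simple set overlap is used here as a stand-in, and the name `select_keywords` and the dictionary layout are hypothetical.

```python
def select_keywords(candidates, context):
    """Pick the candidate whose training-time context overlaps most with the current context.

    Assumption: set overlap replaces the learned prediction model,
    whose scoring method the application does not describe.
    """
    def overlap(entry):
        return len(entry["context"] & context)
    return max(candidates, key=overlap)["answer_keywords"]

# Candidate sets for the question keywords "busy" and "off work", as in the text.
candidates = [
    {"context": {"late time", "want to go home"},
     "answer_keywords": ["hard", "go home"]},
    {"context": {"how many nights work", "leader reminder"},
     "answer_keywords": ["rest", "work", "complete"]},
]

chosen = select_keywords(candidates, {"leader reminder"})
print(chosen)  # ['rest', 'work', 'complete']
```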
  • The grammar model is in essence a language model, which learns the grammatical structures common in human speech from the dialogues in the corpus.
  • Its main principle is to learn the habitual expressions of human conversation, including the addition of conjunctions and auxiliary words, from the part-of-speech tags and word order of the pre-processed pairs in the corpus.
  • For example, the grammar model learns how to construct the grammatical structure of an answer from these two phrases, and thereby learns to add a conjunction such as "that is";
  • the grammar model also learns how to get from these words to the final reply "Dear, but people think of you?", thereby learning the use of words such as "how" as well as this kind of emotionally rich expression.
  • the grammar model mainly learns the structure order and expression of the language in the corpus. The learning of the structure order will ensure that the sentence is basically fluent, and the learning in the expression mode will change with the corpus style.
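  • A toy illustration of learning word order from a corpus: the sketch below counts bigrams and picks the ordering of a word set with the highest total count. This is an assumption-laden simplification (raw counts instead of probabilities, no part-of-speech tags, no insertion of conjunctions), and the names `train_bigrams` and `best_order` are hypothetical rather than taken from the application.

```python
from collections import Counter
from itertools import permutations

def train_bigrams(sentences):
    """Count adjacent word pairs, with sentence start/end markers."""
    counts = Counter()
    for words in sentences:
        padded = ["<s>"] + words + ["</s>"]
        counts.update(zip(padded, padded[1:]))
    return counts

def best_order(words, bigrams):
    """Return the permutation of `words` whose bigrams were seen most often."""
    def score(order):
        padded = ("<s>",) + order + ("</s>",)
        return sum(bigrams[pair] for pair in zip(padded, padded[1:]))
    return list(max(permutations(words), key=score))

# Hypothetical, already segmented training sentences.
corpus = [["i", "miss", "you"], ["i", "miss", "home"], ["you", "miss", "home"]]
bigrams = train_bigrams(corpus)

ordered = best_order(["you", "i", "miss"], bigrams)
print(ordered)  # ['i', 'miss', 'you']
```

  • Trying every permutation grows factorially with the number of words, so this only suits the short keyword sets used in the examples.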
  • mapping relationships, predictive models, and grammatical models can be generated in this embodiment, which are then used in the dialog phase.
  • The model can be generated by training on the human dialogue corpus, so that the machine learns the human dialogue style; after the model is applied to the dialogue process, the machine can interact with humans in a human dialogue style, so that human-computer interaction has the effect of real dialogue between humans.
  • FIG. 8 is a schematic flowchart of a human-computer interaction method based on artificial intelligence according to another embodiment of the present application.
  • the embodiment includes:
  • S801 Collect human conversation corpus.
  • The pre-processing may include: segmenting the questions and answers in the human conversation corpus into words, selecting keywords, and determining an identifier (id) corresponding to each keyword, thereby converting each word sequence into an id sequence.
  • the dictionary containing the correspondence between the word and the identifier may be acquired, and the word sequence may be converted into a corresponding id sequence according to the dictionary.
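  • The conversion from a word sequence to an id sequence can be sketched as below. Whitespace splitting stands in for word segmentation (a real Chinese corpus would need a proper segmenter), and the function names and sample sentences are hypothetical.

```python
def build_dictionary(token_lists):
    """Assign each distinct word an integer id in order of first appearance."""
    dictionary = {}
    for tokens in token_lists:
        for token in tokens:
            dictionary.setdefault(token, len(dictionary))
    return dictionary

def to_ids(tokens, dictionary):
    """Convert a word sequence into the corresponding id sequence."""
    return [dictionary[token] for token in tokens]

# Hypothetical question/answer pair; split() is a stand-in for segmentation.
question = "busy when off work".split()
answer = "hard go home".split()

dictionary = build_dictionary([question, answer])
print(to_ids(answer, dictionary))  # [4, 5, 6]
```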
  • This step can be performed by the pre-processing module shown in FIG.
  • S803 Generate a mapping relationship between the keyword in the question and the keyword in the answer according to the pre-processed human conversation corpus, and store the mapping relationship.
  • This step can be performed by the mapping learning and storage module shown in FIG.
  • mapping relationship may be a mapping relationship between ids.
  • S804 Generate a prediction model according to the pre-processed human conversation corpus.
  • This step can be performed by the prediction module shown in FIG.
  • S805 Generate a grammatical model according to the pre-processed human conversation corpus.
  • This step can be performed by the syntax learning and control module shown in FIG.
  • S801-S805 can be executed during the training phase.
  • The pre-processing can be performed by a pre-processing module. For the specific pre-processing process, refer to the corresponding process in the training phase above.
  • S808 Determine, according to the mapping relationship, a keyword in the answer corresponding to the keyword in the question input by the user.
  • The main control system may transmit the pre-processed question to the mapping learning and storage module, which determines the keywords in the answer corresponding to the pre-processed question according to the mapping relationship it stores.
  • S809 Select an optimal set of keywords among the determined keywords according to the prediction model, and generate a collocation according to the selected set of keywords.
  • the main control system can obtain multiple sets of keywords from the mapping learning and storage module, and then the main control system can transmit the multiple sets of keywords to the prediction module, and the context memory module obtains the current context information, and the prediction module can A set of keywords is selected among the plurality of sets of keywords according to the generated prediction model and the current context information.
  • Multiple sets of keywords can be determined. For example, when the keywords in the question include "busy" and "off work", the keywords determined according to the mapping relationship may include "rest, work, complete" and "hard, go home". In this step, according to the prediction model and the current context information, an optimal set of keywords can be selected among the determined sets. For example, if the current context information is "worked more, the leadership reminder", the selected set of keywords is "rest, work, complete"; or, if the current context information is "time is late, want to go home", the selected set of keywords is "hard, go home".
  • The prediction model can also determine the current style according to the context information and then determine the corresponding collocation words according to that style. For example, the selected set of keywords is "in, think"; if the current style is humorous, collocations such as "thinking, how to cheat" can be determined, or, if the current style is emotionally rich, collocations such as "Dear, think of you" can be determined.
  • S810 Perform grammatical structural adjustment on the selected set of keywords and the generated collocation words according to the grammar model, and obtain a sentence satisfying the grammatical structure.
  • The main control system can obtain the keywords and collocation words from the prediction module and transmit them to the grammar learning and control module, which adjusts the order of the words according to the grammar model to generate a sentence satisfying the grammatical structure.
  • the grammar model adopted by the grammar learning and control module may be generated according to the human dialogue corpus during the training phase, or may be a grammatical model obtained from a third party according to the open interface.
  • the master control system obtains a sentence satisfying the grammatical structure from the grammar learning and control module, and then performs speech synthesis on the sentence and plays it to the user through the output interface.
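  • The way the main control system chains the modules in the dialogue phase (S808 to S810) can be sketched as one function that calls three pluggable components. Everything here is a hypothetical stand-in: the mapping is hand-written, the prediction step uses set overlap, and the grammar step simply joins the words.

```python
def answer_question(question_keywords, context, mapping, predict, arrange):
    """Chain mapping lookup (S808), prediction (S809), and grammar adjustment (S810)."""
    candidates = mapping.get(tuple(question_keywords), [])
    if not candidates:
        return None  # no mapping known for these keywords
    keywords = predict(candidates, context)
    return arrange(keywords)

# Hypothetical stored mapping: question keywords -> (training context, answer keywords).
mapping = {
    ("busy", "off work"): [
        ({"time is late", "want to go home"}, ["hard", "go home"]),
        ({"worked more", "leadership reminder"}, ["rest", "work", "complete"]),
    ],
}

def predict(candidates, context):
    # Stand-in for the prediction model: choose the set with the largest context overlap.
    return max(candidates, key=lambda c: len(c[0] & context))[1]

def arrange(keywords):
    # Placeholder for the grammar model: keep the given order and join the words.
    return " ".join(keywords)

reply = answer_question(["busy", "off work"], {"want to go home"}, mapping, predict, arrange)
print(reply)  # hard go home
```

  • Passing the components in as arguments mirrors the text's option of replacing the grammar model with one obtained from a third party.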
  • the method may further include:
  • S812 Perform online learning according to an interactive dialogue with the user.
  • This module mainly collects the dialogue records at regular intervals and uses them as a corpus to retrain each module of the system in real time.
  • In a dialogue, each user input is also an answer relative to the machine's previous sentence. The answer generated by the machine in the previous step is therefore taken as a query and the user's input as its answer, and the two form a pair for retraining, so that the system can learn the user's dialogue style during the dialogue with the user.
  • the module is a pluggable module. When the module is connected, the module learns through the dialogue between the user and the machine in the log. The entire system can also operate normally when the module is removed.
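  • Forming retraining pairs from a dialogue log, as described for online learning, can be sketched as follows. The log format (a list of speaker/text tuples) and the function name `pairs_from_log` are assumptions made for illustration.

```python
def pairs_from_log(log):
    """Turn a dialogue log into (query, answer) training pairs.

    Each user turn that directly follows a machine turn is treated as the
    answer to that machine turn.
    """
    pairs = []
    for previous, current in zip(log, log[1:]):
        if previous[0] == "machine" and current[0] == "user":
            pairs.append((previous[1], current[1]))
    return pairs

# Hypothetical log of (speaker, text) entries.
log = [
    ("user", "are you busy"),
    ("machine", "a little, and you"),
    ("user", "just off work"),
    ("machine", "you worked hard"),
]

new_pairs = pairs_from_log(log)
print(new_pairs)  # [('a little, and you', 'just off work')]
```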
  • the method may further include:
  • S813 Call other systems through an open interface or be called by other systems.
  • The system can also provide open interfaces, namely a call interface and an extension interface.
  • The call interface enables other systems to invoke this system directly, and the extension interface allows other related models or systems to be connected to enhance functionality.
  • the grammar learning module can call other mature language models to enhance the grammar learning and adjustment functions in the system.
  • The model can be generated by training on the human dialogue corpus, so that the machine learns the human dialogue style; after the model is applied to the dialogue process, the machine can interact with humans in a human dialogue style, so that human-computer interaction has the effect of real dialogue between humans.
  • Through online learning, new data can be learned in real time to improve the human-computer interaction effect.
  • Through the open interface, other systems can be called, or the system can be called by other systems, to better provide human-computer interaction services.
  • FIG. 9 is a schematic structural diagram of a human-machine interaction apparatus based on artificial intelligence according to an embodiment of the present application.
  • the device 90 includes a receiving module 91, an obtaining module 92, and a feedback module 93.
  • a receiving module 91, configured to receive a question input by a user;
  • the obtaining module 92 is configured to process the problem according to a pre-generated model, and obtain an answer with a human dialogue style corresponding to the question, where the model is generated according to a human conversation corpus;
  • the feedback module 93 is configured to feed back the answer to the user.
  • The model includes a mapping relationship, a prediction model, and a grammar model. The mapping relationship indicates the mapping between keywords in the question and keywords in the answer. The prediction model is used to determine an optimal mapping relationship among the plurality of mapping relationships according to the context information and to generate collocation words matching the keywords in the determined mapping relationship. The grammar model is used to adjust the order of the words and to generate, from the adjusted words, a sentence that conforms to the grammatical structure.
  • the obtaining module 92 includes:
  • a mapping sub-module 921, configured to determine, according to the mapping relationship, the keywords in the answer corresponding to the keywords in the question input by the user;
  • a prediction sub-module 922 configured to select an optimal group of keywords among the determined keywords according to the prediction model, and generate a collocation word according to the selected group of keywords;
  • the grammar analysis sub-module 923 is configured to perform grammatical structural adjustment on the selected set of keywords and the generated collocation words according to the grammar model, and obtain a sentence satisfying the grammatical structure as an answer with a human dialogue style.
  • The mapping sub-module is further configured to: extract keywords in the questions in a human conversation corpus and keywords in the corresponding answers, and generate the mapping relationship according to the extracted keywords; or,
  • the prediction sub-module is further configured to: extract keywords in the questions in the human conversation corpus and keywords in the corresponding answers, extract the corresponding context information, and generate the prediction model according to the extracted keywords and context information; or,
  • the grammar analysis sub-module is further configured to: generate the grammar model, or obtain the grammar model from other systems through an open interface.
  • the apparatus 90 further includes:
  • The pre-processing module 94 is configured to pre-process the question, so as to trigger the obtaining module to process the pre-processed question according to the pre-generated model.
  • the apparatus 90 further includes:
  • the online learning module 95 is configured to perform online learning according to an interactive dialogue with the user.
  • the apparatus 90 further includes:
  • The open interface 96 is used to provide an interface for calling other systems or being called by other systems.
  • The model can be generated by training on the human dialogue corpus, so that the machine learns the human dialogue style; after the model is applied to the dialogue process, the machine can interact with humans in a human dialogue style, so that human-computer interaction has the effect of real dialogue between humans.
  • Through online learning, new data can be learned in real time to improve the human-computer interaction effect.
  • Through the open interface, other systems can be called, or the device can be called by other systems, to better provide human-computer interaction services.
  • An embodiment of the present application provides a device, including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to perform: receiving a question input by a user; processing the question according to a pre-generated model to obtain an answer with a human dialogue style corresponding to the question, the model being generated from a human conversation corpus; and feeding the answer back to the user.
  • An embodiment of the present application provides a non-transitory computer-readable storage medium; when the instructions in the storage medium are executed by a processor, the processor is enabled to perform: receiving a question input by a user; processing the question according to a pre-generated model to obtain an answer with a human dialogue style corresponding to the question, the model being generated from a human conversation corpus; and feeding the answer back to the user.
  • An embodiment of the present application provides a computer program product; when the instructions in the computer program product are executed by a processor, the processor is enabled to perform: receiving a question input by a user; processing the question according to a pre-generated model to obtain an answer with a human dialogue style corresponding to the question, the model being generated from a human conversation corpus; and feeding the answer back to the user.
  • It should be understood that portions of the application can be implemented in hardware, software, firmware, or a combination thereof.
  • In the above embodiments, multiple steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system.
  • For example, if implemented in hardware, as in another embodiment, they can be implemented by any one or a combination of the following techniques well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits with suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), etc.
  • each functional unit in each embodiment of the present application may be integrated into one processing module, or each unit may exist physically separately, or two or more units may be integrated into one module.
  • the above integrated modules can be implemented in the form of hardware or in the form of software functional modules.
  • the integrated modules, if implemented in the form of software functional modules and sold or used as stand-alone products, may also be stored in a computer readable storage medium.
  • the above mentioned storage medium may be a read only memory, a magnetic disk or an optical disk or the like.

Abstract

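The present application provides a human-computer interaction method and apparatus based on artificial intelligence. The method includes: receiving a question input by a user; processing the question according to a pre-generated model to obtain an answer that corresponds to the question and has a human dialogue style, the model being generated from a human dialogue corpus; and feeding the answer back to the user. The method enables human-computer interaction to have the effect of a real conversation between humans.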

Description

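Human-computer interaction method and apparatus based on artificial intelligence
Cross-reference to related applications
This application claims priority to Chinese Patent Application No. 201610803645.8, entitled "Human-computer interaction method and apparatus based on artificial intelligence", filed on September 5, 2016 by Beijing Baidu Netcom Science and Technology Co., Ltd. (北京百度网讯科技公司).
Technical field
The present application relates to the field of artificial intelligence, and in particular to a human-computer interaction method and apparatus based on artificial intelligence.
Background
Artificial intelligence (AI) is a new technical science that studies and develops theories, methods, technologies and application systems for simulating, extending and expanding human intelligence. It is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can respond in a manner similar to human intelligence. Research in this field includes intelligent meal-ordering robots, language recognition, image recognition, natural language processing and expert systems.
With the development of artificial intelligence and related technologies, human-computer interaction systems have appeared in people's lives in many forms. For example, in the field of natural dialogue, a machine can converse with a person; in the field of intelligent customer service, a customer service system can serve people. However, the usual flow of current human-computer interaction systems is that, after the machine receives a person's question (query), it looks up a relevant answer (reply) in a database and presents it to the user. This approach is essentially retrieval; it lacks the logic of a conversation between humans and cannot achieve the effect of a real conversation between humans.
Summary
The present application aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, one objective of the present application is to provide a human-computer interaction method based on artificial intelligence that enables a machine to converse with humans in a human dialogue style, so that human-computer interaction has the effect of a real conversation between humans.
Another objective of the present application is to provide a human-computer interaction apparatus based on artificial intelligence.
To achieve the above objectives, the human-computer interaction method based on artificial intelligence provided by embodiments of the first aspect of the present application includes: receiving a question input by a user; processing the question according to a pre-generated model to obtain an answer that corresponds to the question and has a human dialogue style, the model being generated from a human dialogue corpus; and feeding the answer back to the user.
In the method provided by embodiments of the first aspect of the present application, the answer corresponding to the question input by the user is obtained through a pre-generated model. Because the model is generated from a human dialogue corpus, the answer has a human dialogue style; the machine can therefore converse with humans in a human dialogue style, so that human-computer interaction has the effect of a real conversation between humans.
To achieve the above objectives, the human-computer interaction apparatus based on artificial intelligence provided by embodiments of the second aspect of the present application includes: a receiving module, configured to receive a question input by a user; an acquiring module, configured to process the question according to a pre-generated model to obtain an answer that corresponds to the question and has a human dialogue style, the model being generated from a human dialogue corpus; and a feedback module, configured to feed the answer back to the user.
In the apparatus provided by embodiments of the second aspect of the present application, the answer corresponding to the question input by the user is obtained through a pre-generated model. Because the model is generated from a human dialogue corpus, the answer has a human dialogue style; the machine can therefore converse with humans in a human dialogue style, so that human-computer interaction has the effect of a real conversation between humans.
Embodiments of the present application provide a device, including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to perform the method according to any embodiment of the first aspect of the present application.
Embodiments of the present application provide a non-transitory computer-readable storage medium; when the instructions in the storage medium are executed by a processor, the processor is enabled to perform the method according to any embodiment of the first aspect of the present application.
Embodiments of the present application provide a computer program product; when the instructions in the computer program product are executed by a processor, the processor is enabled to perform the method according to any embodiment of the first aspect of the present application.
Additional aspects and advantages of the present application will be set forth in part in the description below, and in part will become apparent from that description or be learned through practice of the present application.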
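Brief description of the drawings
The above and/or additional aspects and advantages of the present application will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a schematic flowchart of a human-computer interaction method based on artificial intelligence according to one embodiment of the present application;
Fig. 2 is a schematic flowchart of generating the model in the training process in an embodiment of the present application;
Fig. 3 is a schematic diagram of the classification of corpus sources in an embodiment of the present application;
Fig. 4 is a schematic diagram of one prediction model in an embodiment of the present application;
Fig. 5 is a schematic diagram of another prediction model in an embodiment of the present application;
Fig. 6 is a schematic diagram of another prediction model in an embodiment of the present application;
Fig. 7 is an overall architecture diagram corresponding to an embodiment of the present application;
Fig. 8 is a schematic flowchart of a human-computer interaction method based on artificial intelligence according to another embodiment of the present application;
Fig. 9 is a schematic structural diagram of a human-computer interaction apparatus based on artificial intelligence according to one embodiment of the present application;
Fig. 10 is a schematic structural diagram of a human-computer interaction apparatus based on artificial intelligence according to another embodiment of the present application.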
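Detailed description
Embodiments of the present application are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar modules, or modules having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, are intended only to explain the present application, and shall not be construed as limiting it. On the contrary, the embodiments of the present application include all changes, modifications and equivalents falling within the spirit and scope of the appended claims.
Fig. 1 is a schematic flowchart of a human-computer interaction method based on artificial intelligence according to one embodiment of the present application.
As shown in Fig. 1, this embodiment includes:
S11: Receive a question (query) input by a user.
The user may input the question in the form of text, speech, a picture or the like. When the question is not in text form, it may first be converted into text. The techniques used include, for example, speech recognition and image content recognition; these conversions may be implemented with existing or future techniques and are not described in detail here.
S12: Process the question according to a pre-generated model to obtain an answer that corresponds to the question and has a human dialogue style, the model being generated from a human dialogue corpus.
The model may be generated in a training phase. In the training phase, a large human dialogue corpus is collected first. The corpus is organized in pairs, each pair consisting of a question (query) and an answer. During training, the query in the corpus is used as input and the model is trained so that its output matches the corresponding answer in the corpus as closely as possible. Because the model is generated from a human dialogue corpus, the output obtained by processing the current question with the model is likewise an answer with a human dialogue style.
Further, in a concrete implementation the model is not limited to a single model; there may be several models, each performing a different function, which together derive an answer with a human dialogue style from the question input by the user.
S13: Feed the answer back to the user.
After the answer is obtained, it may be played to the user as speech.
In addition, if the obtained answer is in text form, it may be converted into speech by techniques such as speech synthesis.
In this embodiment, the answer corresponding to the question input by the user is obtained through a pre-generated model. Because the model is generated from a human dialogue corpus, the answer has a human dialogue style; the machine can therefore converse with humans in a human dialogue style, so that human-computer interaction has the effect of a real conversation between humans.
The previous embodiment describes the dialogue process, which uses a model that may be generated in a training process. The flow for generating the model in the training process is described below.
Fig. 2 is a schematic flowchart of generating the model in the training process in an embodiment of the present application.
This embodiment takes as an example a model that includes a mapping relationship, a prediction model and a grammar model. The mapping relationship indicates the mapping between keywords in questions and keywords in answers. The prediction model is used to determine, according to context information, the optimal mapping relationship among multiple mapping relationships, and to generate collocation words matching the keywords in the determined mapping relationship. The grammar model is used to adjust the order of terms and to generate, from the adjusted terms, a sentence that conforms to grammatical structure.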
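As shown in Fig. 2, this embodiment includes:
S21: Collect a human dialogue corpus.
The corpus may be drawn from any source containing person-to-person dialogue, including dialogue in video (films, TV series, animation, etc.), dialogue in literary works (historical classics, detective novels, romance novels, web novels, etc.), dialogue on social platforms (Weibo, Tieba, Douban, etc.), and dialogue in regional varieties of Chinese (Northeastern dialect, Beijing dialect, Cantonese, etc.).
Fig. 3 shows a classification of corpus sources, so that corpora of different styles can be collected from multiple sources, and a single source may carry one or more dialogue styles. Dialogue style in video varies greatly with genre: dialogue in comedies is generally humorous and witty, dialogue in romance films is generally affectionate, dialogue in war films is generally tense and intense, and so on. Dialogue style in literary works likewise varies with genre: dialogue in historical classics generally carries the character of a particular historical background, dialogue in detective novels is logically rigorous, dialogue in romance novels is emotionally rich, and so on. Dialogue on online social platforms contains many internet terms, but because it is itself people's everyday conversation, its overall style is the closest to everyday conversation. Regional-dialect dialogue covers various local dialects and carries various local characteristics.
S22: Extract the keywords in the questions of the human dialogue corpus and the keywords in the corresponding answers, and generate, from the extracted keywords, the mapping relationship between keywords in questions and keywords in answers.
For one pair in the corpus, the question (Q for short) and the answer (A for short) may each be segmented into words to obtain the terms of the question and the terms of the answer; keywords are then determined among the terms (for example, according to occurrence probability), and the mapping relationship is obtained by learning from a large amount of corpus data.
For example, one question-answer pair is as follows:
Q: 忙了一天,终于下班了。 (After a busy day, I'm finally off work.)
A: 辛苦了,现在回家吗? (You've worked hard. Are you heading home now?)
By segmenting the question and the answer and extracting their keywords, the keywords of the question are found to include “忙” (busy) and “下班” (off work), and the keywords of the answer include “辛苦” (hard work) and “回家” (go home). A mapping can therefore be established between “忙” and “辛苦”, and between “下班” and “回家”.
The mapping relationship may be one-to-many. For example, another pair is as follows:
Q: 忙了一天,终于下班了。 (After a busy day, I'm finally off work.)
A: 休息一下,工作完成了吗? (Take a rest. Is the work finished?)
By similar processing, a mapping can be established between “忙” and “休息” (rest), and between “下班” and “工作” (work).
Combining multiple corpora, a mapping can thus be established from “忙” to “辛苦” and “休息”, and from “下班” to “回家” and “工作”.
Once obtained, the mapping relationship may be stored as key-value pairs (key, value). For example, the key is “忙” and the value includes “辛苦” and “休息”.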
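The key-value storage described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function name `build_mapping` is hypothetical, keywords are assumed to be extracted already, and pairing question and answer keywords by position is an assumption made for the example (the application does not say how keywords are aligned).

```python
from collections import defaultdict


def build_mapping(pairs):
    """Accumulate a one-to-many keyword mapping from (question, answer) keyword lists.

    Assumption: keywords are paired by position within each dialogue pair.
    """
    mapping = defaultdict(set)
    for question_keywords, answer_keywords in pairs:
        for q_kw, a_kw in zip(question_keywords, answer_keywords):
            mapping[q_kw].add(a_kw)
    return mapping


# Keywords from the two example pairs in the text.
corpus = [
    (["忙", "下班"], ["辛苦", "回家"]),  # busy, off work -> hard work, go home
    (["忙", "下班"], ["休息", "工作"]),  # busy, off work -> rest, work
]
mapping = build_mapping(corpus)
```

A set is used as the value so that repeated pairs do not create duplicate entries; a counter could be used instead if occurrence frequencies were needed later.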
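Because corpora may come from different sources, and different sources may have different styles, mapping relationships of different styles can be formed.
For example, a dialogue in a TV series includes:
Q: 我在想关于考试的事情呢。 (I'm thinking about the exam.)
A: 那就是在想怎么作弊? (So you're thinking about how to cheat?)
Q: 你怎么老把我往坏处想呢? (Why do you always think the worst of me?)
A: 让我往好处想也得给我机会啊! (Then give me a chance to think well of you!)
From this corpus, a set of mappings with a humorous, witty style can be established, such as the mapping between “想” (think) and “想”, and between “考试” (exam) and “怎么作弊” (how to cheat).
As another example, a dialogue in a romance novel includes:
Q: 我在想关于考试的事情呢。 (I'm thinking about the exam.)
A: 亲爱的,可是人家在想你呢。 (Darling, but I'm thinking of you.)
From this corpus, a set of mappings with an affectionate style can be established, such as the mapping from “我” (I) to “亲爱的” (darling) and “人家” (an affectionate first-person pronoun), between “想” and “想”, and between “考试” and “你” (you).
As another example, on an ordinary social platform, a dialogue includes:
Q: 我在想关于考试的事情呢。 (I'm thinking about the exam.)
A: 关于考试的什么事儿呢? (What about the exam?)
From this corpus, a set of mappings with an ordinary everyday style can be established, such as the mapping between “想” and “想”, and between “考试” and “什么” (what).
As another example, a dialogue in Northeastern dialect includes:
Q: 我在想关于考试的事情呢。 (I'm thinking about the exam.)
A: 寻思啥呢,到考试就麻爪儿了吧? (What are you pondering? You get flustered as soon as an exam comes, don't you?)
From this corpus, a set of mappings with a Northeastern-dialect style can be established, such as the mapping between “想” and “寻思” (ponder), and between “考试” and “麻爪” (flustered).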
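S23: Obtain, from the human dialogue corpus, the keywords in the questions, the keywords in the answers, and context information, and generate the prediction model from the obtained keywords and context information.
For example, the keywords in the question include “忙” and “下班”. In one corpus the context information is “时间晚” (it is late), “想回家” (wants to go home) and the like, and the corresponding answers frequently contain “辛苦” and “回家”; the prediction model then includes the correspondence among question, context information and answer shown in Fig. 4. As another example, with the same question keywords “忙” and “下班”, in another corpus the context information is “工作多晚” (much work, late), “领导催” (the boss is pressing) and the like, and the corresponding answers frequently contain “休息”, “工作” and “完成” (finished); the prediction model then includes the correspondence among question, context information and answer shown in Fig. 5.
Figs. 4 and 5 illustrate logical relationships, but the relationships in the prediction model are not limited to logical ones and may also be stylistic. For example, for the same question keywords “想” and “考试”, different answers may correspond under different styles, as shown in Fig. 6.
Further, the prediction model not only represents the correspondence among questions, context information and answers, but is also used to learn collocations, so that the keywords of an answer can be supplemented and assembled into a sentence. For example, the answer obtained from the question and the context information includes the keywords “在、想、作弊”; from a humorous dialogue corpus, the common collocations of that style, “在想、怎么作弊”, can then be learned. In a romance novel, the keywords “人家、想” can likewise be extracted, and the corresponding affectionate collocations “亲爱的、想你” can be learned, so that this affectionate way of collocating is acquired.
S24: Analyze the grammatical structure of the human dialogue corpus and generate the grammar model.
The grammar model is in essence a language model, which learns the grammatical structures common in human speech from the dialogues in the corpus. Its main principle is to learn the habitual expressions of human dialogue, including the addition and supplementation of conjunctions and particles, from the part-of-speech tags and sequence order obtained by pre-processing the dialogue pairs in the corpus. For example, after “在想、怎么作弊” has been learned as above, the grammar model learns the grammatical structure for building an answer from these two phrases, and thus learns to add a connective such as “那就是” (so that means). As another example, for the romance-novel dialogue above, after the terms “亲爱的、人家、想、你”, generated in correspondence with the preceding text, are extracted from the reply, the grammar model learns how these terms are expressed as the final reply “亲爱的,可是人家想你了呢”, thereby learning the use of words such as “可是” and the modal particle “呢”, and also learning this affectionate mode of expression. In the training phase, the grammar model mainly learns the structural order and the manner of expression of the language in the corpus; learning structural order ensures that sentences are basically fluent, while what is learned about manner of expression varies with the style of the corpus.
As shown above, in this embodiment the mapping relationship, the prediction model and the grammar model can be generated; these models are subsequently used in the dialogue phase.
In this embodiment, by collecting a human dialogue corpus, the model can be trained on it, so that the machine learns the human dialogue style. Once the model is applied to the dialogue process, the machine can converse with humans in a human dialogue style, so that human-computer interaction has the effect of a real conversation between humans.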
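Combining the dialogue process and the training process described above, Fig. 7 shows an overall architecture.
The entire flow, including the training process and the dialogue process, is described below with reference to the architecture shown in Fig. 7.
Fig. 8 is a schematic flowchart of a human-computer interaction method based on artificial intelligence according to another embodiment of the present application.
As shown in Fig. 8, this embodiment includes:
S801: Collect a human dialogue corpus.
S802: Pre-process the human dialogue corpus.
Pre-processing may include: segmenting the questions and answers in the corpus into words, selecting keywords, and determining the identifier (id) corresponding to each keyword, thereby converting a word sequence into an id sequence.
A dictionary containing the correspondence between words and identifiers may be obtained, and the word sequence may be converted into the corresponding id sequence according to this dictionary.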
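The word-to-id conversion can be illustrated with a short sketch. It assumes word segmentation and keyword selection have already been done, since the application does not name a segmenter; the helper names `build_vocab` and `to_ids`, and the use of -1 for words missing from the dictionary, are hypothetical choices for this example.

```python
def build_vocab(token_lists):
    """Assign consecutive integer ids to words in order of first appearance."""
    vocab = {}
    for tokens in token_lists:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab


def to_ids(tokens, vocab, unknown_id=-1):
    """Convert a word sequence to an id sequence using the dictionary."""
    return [vocab.get(token, unknown_id) for token in tokens]


vocab = build_vocab([["忙", "下班"], ["辛苦", "回家"]])
ids = to_ids(["下班", "忙", "考试"], vocab)  # "考试" is not in this toy dictionary
```

With ids in place of words, the mapping relationship learned later can be stored as a mapping between ids, as the text notes for step S803.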
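This step may be performed by the pre-processing module shown in Fig. 7.
S803: Generate, from the pre-processed human dialogue corpus, the mapping relationship between keywords in questions and keywords in answers, and store the mapping relationship.
This step may be performed by the mapping learning and storage module shown in Fig. 7.
For the specific flow of generating the mapping relationship, see the previous embodiment; it is not described in detail here.
In addition, it will be understood that, because the above pre-processing is performed in the training phase, the mapping relationship may be a mapping between ids.
S804: Generate the prediction model from the pre-processed human dialogue corpus.
This step may be performed by the prediction module shown in Fig. 7.
For the specific flow of generating the prediction model, see the previous embodiment; it is not described in detail here.
S805: Generate the grammar model from the pre-processed human dialogue corpus.
This step may be performed by the grammar learning and control module shown in Fig. 7.
For the specific flow of generating the grammar model, see the previous embodiment; it is not described in detail here.
S801 to S805 may be performed in the training phase.
In addition, the interaction among the modules may be coordinated by the master control system shown in Fig. 7.
S806: Receive a question input by a user.
S807: Pre-process the question input by the user.
The pre-processing may be performed by the pre-processing module. For the specific pre-processing flow, see the corresponding flow of the training phase above.
S808: Determine, according to the mapping relationship, the answer keywords corresponding to the keywords in the question input by the user.
The master control system may pass the pre-processed question to the mapping learning and storage module, which determines the answer keywords corresponding to the pre-processed question according to the mapping relationship it stores.
S809: Select, according to the prediction model, the optimal group of keywords among the determined keywords, and generate collocation words from the selected group of keywords.
The master control system may obtain multiple groups of keywords from the mapping learning and storage module and pass them to the prediction module, while the current context information is obtained from the context memory module. The prediction module can then select one group among the multiple groups of keywords according to the generated prediction model and the current context information.
For example, multiple groups of keywords may be determined from the mapping relationship. When the keywords in the question include “忙” and “下班”, the keywords determined from the mapping relationship may include “休息、工作、完成” and “辛苦、回家”. In this step, the optimal group is selected from these groups according to the prediction model and the current context information. For example, if the current context information is “工作多,领导催” (heavy workload, the boss is pressing), the selected group is “休息、工作、完成”; if the current context information is “时间晚、想回家” (it is late, wants to go home), the selected group is “辛苦、回家”.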
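The context-based selection in this step can be sketched as below. The application does not specify the form of the prediction model, so this example swaps in a simple co-occurrence count between context cues and answer-keyword groups as a stand-in; the class `ContextPredictor` and its methods are hypothetical names, and context information is assumed to be available as a list of short cue strings.

```python
from collections import Counter


class ContextPredictor:
    """Toy stand-in for the prediction model: counts which context cues
    were seen together with which group of answer keywords."""

    def __init__(self):
        self.cue_counts = {}  # tuple of answer keywords -> Counter of context cues

    def observe(self, context_cues, answer_keywords):
        """Record one training example (training phase)."""
        key = tuple(answer_keywords)
        self.cue_counts.setdefault(key, Counter()).update(context_cues)

    def select(self, candidate_groups, context_cues):
        """Return the candidate group whose recorded cues best match the context."""
        def score(group):
            seen = self.cue_counts.get(tuple(group), Counter())
            return sum(seen[cue] for cue in context_cues)
        return max(candidate_groups, key=score)


predictor = ContextPredictor()
predictor.observe(["时间晚", "想回家"], ["辛苦", "回家"])
predictor.observe(["工作多", "领导催"], ["休息", "工作", "完成"])

candidates = [["休息", "工作", "完成"], ["辛苦", "回家"]]
chosen = predictor.select(candidates, ["工作多", "领导催"])
```

A trained statistical or neural model would replace the counting logic in practice; the sketch only shows where the context information enters the choice among candidate groups.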
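In addition, the prediction model may determine the current style from the context information and then determine the corresponding collocation words according to the style. For example, when the selected group of keywords is “在、想”, if the current style is humorous and witty, collocation words such as “在想、怎么作弊” may be determined; if the current style is affectionate, collocation words such as “亲爱的、想你” may be determined.
S810: Adjust the grammatical structure of the selected group of keywords and the generated collocation words according to the grammar model, to obtain a sentence that satisfies grammatical structure.
The master control system may obtain the keywords and collocation words from the prediction module and pass them to the grammar learning and control module, which adjusts the order of the words according to the grammar model to generate a sentence that satisfies grammatical structure. The grammar model used by the grammar learning and control module may be one generated from the human dialogue corpus in the training phase, or one obtained from a third party through the open interface.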
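The word-order adjustment can be illustrated with a small bigram language model. This is a sketch under assumptions rather than the grammar model of the application: it only reorders terms and does not add connectives or particles, the function names are hypothetical, and trying every permutation is practical only for a handful of terms.

```python
from collections import Counter
from itertools import permutations


def train_bigrams(sentences):
    """Count adjacent term pairs, with sentence start and end markers."""
    counts = Counter()
    for tokens in sentences:
        padded = ["<s>"] + list(tokens) + ["</s>"]
        counts.update(zip(padded, padded[1:]))
    return counts


def best_order(terms, bigrams):
    """Return the ordering of terms with the highest total bigram count."""
    def score(order):
        padded = ("<s>",) + order + ("</s>",)
        return sum(bigrams[pair] for pair in zip(padded, padded[1:]))
    return list(max(permutations(terms), key=score))


# Toy training sentences, already segmented into terms.
bigrams = train_bigrams([
    ["亲爱的", "可是", "人家", "想", "你"],
    ["人家", "想", "你"],
])
ordered = best_order(["想", "你", "人家"], bigrams)
```

A production language model would use smoothed probabilities and a search strategy such as beam search instead of exhaustive permutation.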
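S811: Feed the sentence that satisfies grammatical structure back to the user as the answer.
For example, the master control system obtains the sentence that satisfies grammatical structure from the grammar learning and control module, performs speech synthesis on it, and plays it to the user through the output interface.
Further, the method may also include:
S812: Perform online learning according to the interactive dialogue with the user.
While conversing with the user, the system produces dialogue data in real time that reflects the current user's habits of expression and style; the chat records with the user over a certain period can therefore be used as a corpus for learning those habits. The online learning module periodically collects the dialogue records as a corpus and retrains the modules of the system in real time. When the chat records are used, each user input also stands in a query-answer relation to the machine's preceding reply: the reply produced by the machine in the previous step is taken as the query and the user's input as the answer, and the resulting pairs are used for retraining, so that the system learns the user's dialogue style in the course of conversing with the user. This module is pluggable: when it is attached, it learns continuously from the logged dialogue between the user and the machine; when it is removed, the system as a whole still runs normally.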
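The way chat records are turned into retraining pairs can be sketched as follows. The log format, a list of (speaker, text) tuples with speakers labelled "machine" and "user", and the function name `pairs_from_log` are assumptions made for this example; the application does not describe the log layout.

```python
def pairs_from_log(turns):
    """Build (query, answer) pairs where the machine's reply is the query
    and the user's next input is the answer."""
    pairs = []
    for (prev_speaker, prev_text), (speaker, text) in zip(turns, turns[1:]):
        if prev_speaker == "machine" and speaker == "user":
            pairs.append((prev_text, text))
    return pairs


chat_log = [
    ("user", "忙了一天,终于下班了。"),
    ("machine", "辛苦了,现在回家吗?"),
    ("user", "是啊,马上就走。"),
    ("machine", "路上小心。"),
    ("user", "谢谢!"),
]
retraining_pairs = pairs_from_log(chat_log)
```

The resulting pairs have the same query-answer shape as the offline corpus, so in principle they can be fed through the same pre-processing and training steps.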
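Further, the method may also include:
S813: Call other systems, or be called by other systems, through the open interface.
As shown in Fig. 7, the system may also provide open interfaces, namely externally exposed calling interfaces and extension interfaces. A calling interface allows other systems to call this system directly, while an extension interface allows other related models or systems to be connected to strengthen its functions; for example, the grammar learning module may call other mature language models to strengthen the grammar learning and adjustment functions of the system.
In this embodiment, by collecting a human dialogue corpus, the model can be trained on it, so that the machine learns the human dialogue style. Once the model is applied to the dialogue process, the machine can converse with humans in a human dialogue style, so that human-computer interaction has the effect of a real conversation between humans. Further, through online learning, new data can be learned in real time to improve the human-computer interaction. Further, through the open interface, the system can be called by other systems or call other systems, to better provide human-computer interaction services.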
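Fig. 9 is a schematic structural diagram of a human-computer interaction apparatus based on artificial intelligence according to one embodiment of the present application.
As shown in Fig. 9, the apparatus 90 includes: a receiving module 91, an acquiring module 92 and a feedback module 93.
The receiving module 91 is configured to receive a question input by a user;
the acquiring module 92 is configured to process the question according to a pre-generated model to obtain an answer that corresponds to the question and has a human dialogue style, the model being generated from a human dialogue corpus;
the feedback module 93 is configured to feed the answer back to the user.
In some embodiments, the model includes a mapping relationship, a prediction model and a grammar model. The mapping relationship indicates the mapping between keywords in questions and keywords in answers. The prediction model is used to determine, according to context information, the optimal mapping relationship among multiple mapping relationships, and to generate collocation words matching the keywords in the determined mapping relationship. The grammar model is used to adjust the order of terms and to generate, from the adjusted terms, a sentence that conforms to grammatical structure.
In some embodiments, referring to Fig. 10, the acquiring module 92 includes:
a mapping submodule 921, configured to determine, according to the mapping relationship, the answer keywords corresponding to the keywords in the question input by the user;
a prediction submodule 922, configured to select, according to the prediction model, the optimal group of keywords among the determined keywords, and to generate collocation words from the selected group of keywords;
a grammar analysis submodule 923, configured to adjust, according to the grammar model, the grammatical structure of the selected group of keywords and the generated collocation words, to obtain a sentence that satisfies grammatical structure as the answer with a human dialogue style.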
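How the three submodules chain together inside the acquiring module can be sketched as below. The class name `AcquiringModule`, the callable-based wiring and the toy stand-ins are all hypothetical; they only illustrate the order mapping, then prediction, then grammar adjustment, not the actual submodule logic.

```python
class AcquiringModule:
    """Chains three submodules: mapping -> prediction -> grammar analysis."""

    def __init__(self, mapping_fn, prediction_fn, grammar_fn):
        self.mapping_fn = mapping_fn        # question keywords -> candidate keyword groups
        self.prediction_fn = prediction_fn  # (candidate groups, context) -> chosen terms
        self.grammar_fn = grammar_fn        # chosen terms -> sentence

    def answer(self, question_keywords, context):
        candidate_groups = self.mapping_fn(question_keywords)
        chosen_terms = self.prediction_fn(candidate_groups, context)
        return self.grammar_fn(chosen_terms)


# Toy stand-ins for the three submodules, for illustration only.
module = AcquiringModule(
    mapping_fn=lambda keywords: [["辛苦", "回家"], ["休息", "工作", "完成"]],
    prediction_fn=lambda groups, context: groups[0] if "时间晚" in context else groups[1],
    grammar_fn=lambda terms: " ".join(terms),
)
reply = module.answer(["忙", "下班"], ["时间晚", "想回家"])
```

Passing the submodules in as callables keeps each one replaceable, which matches the option of obtaining the grammar model from another system through the open interface.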
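In some embodiments, the mapping submodule is further configured to: extract the keywords in the questions of the human dialogue corpus and the keywords in the corresponding answers, and generate the mapping relationship from the extracted keywords; or,
in some embodiments, the prediction submodule is further configured to: extract the keywords in the questions of the human dialogue corpus and the keywords in the corresponding answers, extract the corresponding context information, and generate the prediction model from the extracted keywords and context information; or,
in some embodiments, the grammar analysis submodule is further configured to: generate the grammar model from the human dialogue corpus, or obtain the grammar model from another system through the open interface.
In some embodiments, referring to Fig. 10, the apparatus 90 further includes:
a pre-processing module 94, configured to pre-process the question so as to trigger the acquiring module to process the pre-processed question according to the pre-generated model.
In some embodiments, referring to Fig. 10, the apparatus 90 further includes:
an online learning module 95, configured to perform online learning according to the interactive dialogue with the user.
In some embodiments, referring to Fig. 10, the apparatus 90 further includes:
an open interface 96, configured to provide an interface for calling other systems or being called by other systems.
It will be understood that the apparatus of this embodiment corresponds to the method embodiments above; for details, see the related description of the method embodiments, which is not repeated here.
In this embodiment, by collecting a human dialogue corpus, the model can be trained on it, so that the machine learns the human dialogue style. Once the model is applied to the dialogue process, the machine can converse with humans in a human dialogue style, so that human-computer interaction has the effect of a real conversation between humans. Further, through online learning, new data can be learned in real time to improve the human-computer interaction. Further, through the open interface, the system can be called by other systems or call other systems, to better provide human-computer interaction services.
It will be understood that the same or similar parts of the above embodiments may be cross-referenced, and content not described in detail in some embodiments may be found in the same or similar content of other embodiments.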
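Embodiments of the present application provide a device, including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to perform: receiving a question input by a user; processing the question according to a pre-generated model to obtain an answer that corresponds to the question and has a human dialogue style, the model being generated from a human dialogue corpus; and feeding the answer back to the user.
Embodiments of the present application provide a non-transitory computer-readable storage medium; when the instructions in the storage medium are executed by a processor, the processor is enabled to perform: receiving a question input by a user; processing the question according to a pre-generated model to obtain an answer that corresponds to the question and has a human dialogue style, the model being generated from a human dialogue corpus; and feeding the answer back to the user.
Embodiments of the present application provide a computer program product; when the instructions in the computer program product are executed by a processor, the processor is enabled to perform: receiving a question input by a user; processing the question according to a pre-generated model to obtain an answer that corresponds to the question and has a human dialogue style, the model being generated from a human dialogue corpus; and feeding the answer back to the user.
It should be noted that, in the description of the present application, the terms "first", "second" and the like are used for descriptive purposes only and shall not be understood as indicating or implying relative importance. In addition, in the description of the present application, unless otherwise stated, "multiple" means at least two.
Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present application includes additional implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in the reverse order depending on the functions involved, as will be understood by those skilled in the art to which the embodiments of the present application belong.
It should be understood that portions of the present application may be implemented in hardware, software, firmware or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following techniques well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and so on.
Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, includes one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as a stand-alone product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk or the like.
In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", "some examples" or the like means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.
Although embodiments of the present application have been shown and described above, it will be understood that the above embodiments are exemplary and shall not be construed as limiting the present application; those of ordinary skill in the art may make changes, modifications, substitutions and variations to the above embodiments within the scope of the present application.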

Claims (19)

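  1. A human-computer interaction method based on artificial intelligence, characterized by comprising:
    receiving a question input by a user;
    processing the question according to a pre-generated model to obtain an answer that corresponds to the question and has a human dialogue style, the model being generated from a human dialogue corpus;
    feeding the answer back to the user.
  2. The method according to claim 1, characterized in that the model comprises a mapping relationship, a prediction model and a grammar model, wherein the mapping relationship indicates the mapping between keywords in questions and keywords in answers, the prediction model is used to determine, according to context information, the optimal mapping relationship among multiple mapping relationships and to generate collocation words matching the keywords in the determined mapping relationship, and the grammar model is used to adjust the order of terms and to generate, from the adjusted terms, a sentence that conforms to grammatical structure.
  3. The method according to claim 2, characterized in that processing the question according to the pre-generated model to obtain the answer that corresponds to the question and has a human dialogue style comprises:
    determining, according to the mapping relationship, the answer keywords corresponding to the keywords in the question input by the user;
    selecting, according to the prediction model, the optimal group of keywords among the determined keywords, and generating collocation words from the selected group of keywords;
    adjusting, according to the grammar model, the grammatical structure of the selected group of keywords and the generated collocation words, to obtain a sentence that satisfies grammatical structure as the answer with a human dialogue style.
  4. The method according to any one of claims 2 to 3, characterized by further comprising:
    extracting the keywords in the questions of the human dialogue corpus and the keywords in the corresponding answers, and generating the mapping relationship from the extracted keywords.
  5. The method according to any one of claims 2 to 4, characterized by further comprising:
    extracting the keywords in the questions of the human dialogue corpus and the keywords in the corresponding answers, extracting the corresponding context information, and generating the prediction model from the extracted keywords and context information.
  6. The method according to any one of claims 2 to 5, characterized by further comprising:
    generating the grammar model from the human dialogue corpus, or obtaining the grammar model from another system through an open interface.
  7. The method according to any one of claims 1 to 6, characterized by further comprising:
    pre-processing the question, so that the pre-processed question is processed according to the pre-generated model.
  8. The method according to any one of claims 1 to 7, characterized by further comprising:
    performing online learning according to the interactive dialogue with the user.
  9. The method according to any one of claims 1 to 8, characterized by further comprising:
    calling other systems, or being called by other systems, through an open interface.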
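  10. A human-computer interaction apparatus based on artificial intelligence, characterized by comprising:
    a receiving module, configured to receive a question input by a user;
    an acquiring module, configured to process the question according to a pre-generated model to obtain an answer that corresponds to the question and has a human dialogue style, the model being generated from a human dialogue corpus;
    a feedback module, configured to feed the answer back to the user.
  11. The apparatus according to claim 10, characterized in that the model comprises a mapping relationship, a prediction model and a grammar model, wherein the mapping relationship indicates the mapping between keywords in questions and keywords in answers, the prediction model is used to determine, according to context information, the optimal mapping relationship among multiple mapping relationships and to generate collocation words matching the keywords in the determined mapping relationship, and the grammar model is used to adjust the order of terms and to generate, from the adjusted terms, a sentence that conforms to grammatical structure.
  12. The apparatus according to claim 11, characterized in that the acquiring module comprises:
    a mapping submodule, configured to determine, according to the mapping relationship, the answer keywords corresponding to the keywords in the question input by the user;
    a prediction submodule, configured to select, according to the prediction model, the optimal group of keywords among the determined keywords, and to generate collocation words from the selected group of keywords;
    a grammar analysis submodule, configured to adjust, according to the grammar model, the grammatical structure of the selected group of keywords and the generated collocation words, to obtain a sentence that satisfies grammatical structure as the answer with a human dialogue style.
  13. The apparatus according to claim 12, characterized in that
    the mapping submodule is further configured to: extract the keywords in the questions of the human dialogue corpus and the keywords in the corresponding answers, and generate the mapping relationship from the extracted keywords; or,
    the prediction submodule is further configured to: extract the keywords in the questions of the human dialogue corpus and the keywords in the corresponding answers, extract the corresponding context information, and generate the prediction model from the extracted keywords and context information; or,
    the grammar analysis submodule is further configured to: generate the grammar model from the human dialogue corpus, or obtain the grammar model from another system through an open interface.
  14. The apparatus according to any one of claims 10 to 13, characterized by further comprising:
    a pre-processing module, configured to pre-process the question so as to trigger the acquiring module to process the pre-processed question according to the pre-generated model.
  15. The apparatus according to any one of claims 10 to 14, characterized by further comprising:
    an online learning module, configured to perform online learning according to the interactive dialogue with the user.
  16. The apparatus according to any one of claims 10 to 15, characterized by further comprising:
    an open interface, configured to provide an interface for calling other systems or being called by other systems.
  17. A device, characterized by comprising:
    a processor;
    a memory for storing processor-executable instructions; wherein the processor is configured to perform:
    the method according to any one of claims 1 to 9.
  18. A non-transitory computer-readable storage medium, characterized in that, when the instructions in the storage medium are executed by a processor, the processor is enabled to perform:
    the method according to any one of claims 1 to 9.
  19. A computer program product, characterized in that, when the instructions in the computer program product are executed by a processor, the processor is enabled to perform:
    the method according to any one of claims 1 to 9.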
PCT/CN2017/072267 2016-09-05 2017-01-23 基于人工智能的人机交互方法和装置 WO2018040501A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
US16/317,526 US11645547B2 (en) 2016-09-05 2017-01-23 Human-machine interactive method and device based on artificial intelligence
JP2019501993A JP6726800B2 (ja) 2016-09-05 2017-01-23 人工知能に基づくヒューマンマシンインタラクション方法及び装置
EP17844812.2A EP3508991A4 (en) 2016-09-05 2017-01-23 HUMAN-MACHINE INTERACTION METHOD AND DEVICE ON THE BASIS OF ARTIFICIAL INTELLIGENCE
KR1020197004771A KR102170563B1 (ko) 2016-09-05 2017-01-23 인공 지능에 기반한 휴먼 머신 인터랙티브 방법 및 장치

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610803645.8 2016-09-05
CN201610803645.8A CN106469212B (zh) 2016-09-05 2016-09-05 基于人工智能的人机交互方法和装置

Publications (1)

Publication Number Publication Date
WO2018040501A1 true WO2018040501A1 (zh) 2018-03-08

Family

ID=58230458

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/072267 WO2018040501A1 (zh) 2016-09-05 2017-01-23 基于人工智能的人机交互方法和装置

Country Status (6)

Country Link
US (1) US11645547B2 (zh)
EP (1) EP3508991A4 (zh)
JP (1) JP6726800B2 (zh)
KR (1) KR102170563B1 (zh)
CN (1) CN106469212B (zh)
WO (1) WO2018040501A1 (zh)

Families Citing this family (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106649778B (zh) * 2016-12-27 2020-03-03 北京百度网讯科技有限公司 基于深度问答的交互方法和装置
CN107168990A (zh) * 2017-03-28 2017-09-15 厦门快商通科技股份有限公司 基于用户性格的智能客服系统及对话方法
US10839017B2 (en) 2017-04-06 2020-11-17 AIBrain Corporation Adaptive, interactive, and cognitive reasoner of an autonomous robotic system utilizing an advanced memory graph structure
US10929759B2 (en) 2017-04-06 2021-02-23 AIBrain Corporation Intelligent robot software platform
US11151992B2 (en) 2017-04-06 2021-10-19 AIBrain Corporation Context aware interactive robot
US10810371B2 (en) 2017-04-06 2020-10-20 AIBrain Corporation Adaptive, interactive, and cognitive reasoner of an autonomous robotic system
US10963493B1 (en) * 2017-04-06 2021-03-30 AIBrain Corporation Interactive game with robot system
CN108733722B (zh) * 2017-04-24 2020-07-31 北京京东尚科信息技术有限公司 一种对话机器人自动生成方法及装置
US10628754B2 (en) 2017-06-06 2020-04-21 At&T Intellectual Property I, L.P. Personal assistant for facilitating interaction routines
CN108304436B (zh) 2017-09-12 2019-11-05 深圳市腾讯计算机系统有限公司 风格语句的生成方法、模型的训练方法、装置及设备
CN110019702B (zh) * 2017-09-18 2023-04-07 阿里巴巴集团控股有限公司 数据挖掘方法、装置和设备
CN107818787B (zh) * 2017-10-31 2021-02-05 努比亚技术有限公司 一种语音信息的处理方法、终端及计算机可读存储介质
CN108010531B (zh) * 2017-12-14 2021-07-27 南京美桥信息科技有限公司 一种可视智能问询方法及系统
CN108153875B (zh) * 2017-12-26 2022-03-11 北京金山安全软件有限公司 语料处理方法、装置、智能音箱和存储介质
CN108038230B (zh) * 2017-12-26 2022-05-20 北京百度网讯科技有限公司 基于人工智能的信息生成方法和装置
CN108711423A (zh) * 2018-03-30 2018-10-26 百度在线网络技术(北京)有限公司 智能语音交互实现方法、装置、计算机设备及存储介质
CN110471538B (zh) * 2018-05-10 2023-11-03 北京搜狗科技发展有限公司 一种输入预测方法及装置
WO2020060151A1 (en) 2018-09-19 2020-03-26 Samsung Electronics Co., Ltd. System and method for providing voice assistant service
CN109684453A (zh) * 2018-12-26 2019-04-26 联想(北京)有限公司 一种信息处理方法及电子设备
CN110069707A (zh) * 2019-03-28 2019-07-30 广州创梦空间人工智能科技有限公司 一种人工智能自适应互动教学系统
CN110046242A (zh) * 2019-04-22 2019-07-23 北京六行君通信息科技股份有限公司 一种自动应答装置及方法
CN110223697B (zh) * 2019-06-13 2022-04-22 思必驰科技股份有限公司 人机对话方法及系统
CN110689078A (zh) * 2019-09-29 2020-01-14 浙江连信科技有限公司 基于人格分类模型的人机交互方法、装置及计算机设备
KR102380397B1 (ko) * 2019-10-08 2022-03-31 채명진 IoT센서 및 인공지능을 이용한 스마트 빌딩 관리방법
KR102385198B1 (ko) * 2020-06-25 2022-04-12 (주)아크릴 인공지능 간의 대화를 위한 대화생성 시스템 및 방법
CN114519101B (zh) 2020-11-18 2023-06-06 易保网络技术(上海)有限公司 数据聚类方法和系统、数据存储方法和系统以及存储介质
CN112667796B (zh) * 2021-01-05 2023-08-11 网易(杭州)网络有限公司 一种对话回复方法、装置、电子设备及可读存储介质
US11610581B2 (en) * 2021-02-05 2023-03-21 International Business Machines Corporation Multi-step linear interpolation of language models
CN113032540B (zh) * 2021-03-19 2023-06-23 北京百度网讯科技有限公司 人机交互方法、装置、设备和存储介质
CN113488030A (zh) * 2021-07-06 2021-10-08 思必驰科技股份有限公司 语音点餐方法、装置及系统
CN113378583A (zh) * 2021-07-15 2021-09-10 北京小米移动软件有限公司 对话回复方法及装置、对话模型训练方法及装置、存储介质
CN116561286B (zh) * 2023-07-06 2023-10-27 杭州华鲤智能科技有限公司 一种对话方法及装置

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015023031A1 (ko) * 2013-08-14 2015-02-19 숭실대학교산학협력단 전문분야 검색 지원 방법 및 그 장치
CN105068661A (zh) * 2015-09-07 2015-11-18 百度在线网络技术(北京)有限公司 基于人工智能的人机交互方法和系统

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9318108B2 (en) * 2010-01-18 2016-04-19 Apple Inc. Intelligent automated assistant
CN101075435B (zh) * 2007-04-19 2011-05-18 深圳先进技术研究院 一种智能聊天系统及其实现方法
US8543565B2 (en) 2007-09-07 2013-09-24 At&T Intellectual Property Ii, L.P. System and method using a discriminative learning approach for question answering
CN101763212B (zh) * 2009-04-30 2012-08-15 广东国笔科技股份有限公司 人机交互系统及其相关系统、设备和方法
EP2839391A4 (en) * 2012-04-20 2016-01-27 Maluuba Inc CONVERSATION AGENT
US8577671B1 (en) 2012-07-20 2013-11-05 Veveo, Inc. Method of and system for using conversation state information in a conversational interaction system
KR102175539B1 (ko) 2013-10-18 2020-11-06 에스케이텔레콤 주식회사 사용자 발화 스타일에 따른 대화형 서비스 장치 및 방법
CN104615646A (zh) * 2014-12-25 2015-05-13 上海科阅信息技术有限公司 智能聊天机器人系统
CN105095444A (zh) * 2015-07-24 2015-11-25 百度在线网络技术(北京)有限公司 信息获取方法和装置

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3508991A4 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109033428A (zh) * 2018-08-10 2018-12-18 深圳市磐创网络科技有限公司 一种智能客服方法及系统
CN109840255A (zh) * 2019-01-09 2019-06-04 平安科技(深圳)有限公司 答复文本生成方法、装置、设备及存储介质
CN109840255B (zh) * 2019-01-09 2023-09-19 平安科技(深圳)有限公司 答复文本生成方法、装置、设备及存储介质
CN111833854A (zh) * 2020-01-08 2020-10-27 北京嘀嘀无限科技发展有限公司 一种人机交互方法与终端、计算机可读存储介质

Also Published As

Publication number Publication date
CN106469212B (zh) 2019-10-15
KR20190028793A (ko) 2019-03-19
CN106469212A (zh) 2017-03-01
JP2019528512A (ja) 2019-10-10
KR102170563B1 (ko) 2020-10-27
US11645547B2 (en) 2023-05-09
EP3508991A4 (en) 2020-02-12
US20190286996A1 (en) 2019-09-19
EP3508991A1 (en) 2019-07-10
JP6726800B2 (ja) 2020-07-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17844812

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2019501993

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 20197004771

Country of ref document: KR

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2017844812

Country of ref document: EP