WO2020125457A1 - Semantic understanding method and device for multi-round interaction, and computer storage medium - Google Patents

Semantic understanding method and device for multi-round interaction, and computer storage medium Download PDF

Info

Publication number
WO2020125457A1
Authority
WO
WIPO (PCT)
Prior art keywords
association
voice information
round
preset
current round
Prior art date
Application number
PCT/CN2019/123807
Other languages
English (en)
French (fr)
Inventor
徐小峰
张晨
田原
王一舒
Original Assignee
广东美的白色家电技术创新中心有限公司
美的集团股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广东美的白色家电技术创新中心有限公司 and 美的集团股份有限公司
Publication of WO2020125457A1 publication Critical patent/WO2020125457A1/zh

Links

Images

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/08: Speech classification or search
    • G10L15/18: Speech classification or search using natural language modelling
    • G10L15/1822: Parsing for meaning understanding
    • G10L15/183: Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01: Input arrangements or combined input and output arrangements for interaction between user and computer

Definitions

  • the present application relates to the field of semantic understanding, in particular to a multi-round interactive semantic understanding method, semantic understanding device and computer storage medium.
  • the application provides a semantic understanding method, a semantic understanding device and a computer storage medium for multi-round interactions to solve the problem that the prior art cannot accurately understand voices in multi-round voice interactions.
  • To solve the above technical problem, this application provides a semantic understanding method for multi-round interaction, which includes: obtaining current-round voice information; analyzing the current-round voice information according to at least two preset rules to determine the association between the current-round voice information and historical-round voice information; judging whether the association meets a preset condition; and, in response to a judgment result that the association meets the preset condition, analyzing the current-round voice information according to the semantic understanding data of the historical-round voice information to obtain semantic understanding data of the current-round voice information.
  • the present application provides a multi-round interactive semantic understanding device, including a processor and a memory, a computer program is stored in the memory, and the processor is used to execute the computer program to implement the above semantic understanding method.
  • the present application provides a computer storage medium in which a computer program is stored, and when the computer program is executed, the above semantic understanding method is implemented.
  • In the semantic understanding method for multi-round interaction of this application, semantic understanding is performed for each round.
  • When performing semantic understanding of the current-round voice information, the present application first analyzes it through at least two preset rules, which makes it possible to determine the association between the current-round voice information and the historical-round voice information more accurately; then, when the association meets the preset condition, the current-round voice information is analyzed according to the semantic understanding data of the historical-round voice information, so that the semantic understanding data of the current-round voice information can be obtained accurately.
  • FIG. 1 is a schematic flowchart of an embodiment of a semantic understanding method for multiple rounds of interaction in this application;
  • FIG. 2 is a schematic flowchart of another embodiment of a semantic understanding method for multiple rounds of interaction in this application;
  • FIG. 4 is a schematic structural diagram of an embodiment of a multi-round interactive system of the present application.
  • FIG. 5 is a schematic structural diagram of an embodiment of a multi-round interactive semantic understanding device of the present application.
  • FIG. 6 is a schematic structural diagram of an embodiment of a computer storage medium of the present application.
  • The semantic understanding method for multi-round interaction of this application belongs to the field of natural language processing, which studies communication between humans and computers through natural language. This application is concerned with task-driven multi-round interaction, in which the intent of the user's speech is clarified through multiple rounds of interaction so that the user's intent can be responded to or executed. During multi-round interaction the user may continue the previous topic or change the topic, and with smart devices for daily life the user interacts in colloquial speech; for this situation an accurate judgment mechanism is required to determine the correlation between the user's current-round speech and historical-round speech, so that semantic understanding can be realized more intelligently.
  • FIG. 1 is a schematic flowchart of an embodiment of a multi-round interactive semantic understanding method of the present application.
  • the semantic understanding method of this embodiment includes the following steps.
  • A computer performing semantic understanding can obtain the current-round voice information through its own voice sensor, such as a microphone, or through a communication connection with other devices, using the voice sensors of those devices.
  • The computer generally obtains only the current voice information without determining round information; however, to facilitate describing the relationship between the current round and historical rounds in multi-round interaction, the concept of the current round is introduced in this application.
  • Since subsequent steps analyze the relationship between the current-round and historical-round voice information, the current round here is at least the second round of the multi-round interaction; the first round of voice information can be understood semantically directly.
  • S102 Analyze the current round of voice information according to at least two preset rules to determine the association between the current round of voice information and the historical round of voice information.
  • this embodiment uses at least two preset rules to analyze the current round of voice information to determine the association between the current round of voice information and the historical round of voice information.
  • Using at least two preset rules for the analysis, that is, using more dimensions to analyze the current-round voice information, makes this semantic understanding method better suited to human voice interaction with multiple dialogue characteristics. It can therefore more accurately judge whether the current speech and the previous speech are related, and in turn whether they belong to the same topic and constitute multiple rounds of interaction on the same topic.
  • The preset rules relate to features of human dialogue, so a preset rule may be a rule related to demonstrative pronouns, to information completeness, to grammatical accuracy, or to interval time.
  • The demonstrative-pronoun rule is: whether demonstrative pronouns such as "this" and "that" appear. The corresponding analysis of the current-round voice information may include: when a demonstrative pronoun appears in the current-round voice information, this indicates that it is related to the historical-round voice information.
  • The information-completeness rule is: whether the semantic slots in semantic understanding can be filled completely. The corresponding analysis may include: when the information in the current-round voice information is incomplete, this indicates that it is related to the historical-round voice information.
  • The grammatical-accuracy rule is: whether the grammar is accurate, or its degree of accuracy. The corresponding analysis may include: when the grammar of the current-round voice information is inaccurate, this indicates that it is related to the historical-round voice information.
  • The interval-time rule is: whether the time interval between the current-round voice information and the previous-round voice information exceeds a threshold. The corresponding analysis may include: when the time interval does not exceed the threshold, this indicates that it is related to the historical-round voice information.
  • Besides those mentioned above, preset rules may also be set according to dialogue features and are not limited here. Further, in this embodiment the at least two preset rules used to analyze the current-round voice information are all associated with the dialogue features of a preset application domain; that is, the relevant preset rules can be set according to the dialogue features of the preset application domain, which is the domain in which the semantic understanding method of this embodiment is applied.
  • For example, when applied to the daily-life domain, the dialogue features of this domain are more colloquial and often use demonstrative pronouns or omit information, so the demonstrative-pronoun rule and the information-completeness rule are preferentially used for analysis; when applied to the work domain, the dialogue features are more rigorous, the grammar is usually more accurate, and the dialogue is more real-time, so the grammatical-accuracy rule and the interval-time rule are preferentially used for analysis.
  • In step S103 it is determined whether the association meets the preset condition. If the association meets the preset condition, go to step S104; if it does not, go to step S105.
  • The association meeting the preset condition means that the current-round voice information and the historical-round voice information are related or highly correlated, so understanding of the current-round voice information can continue from the understanding of the historical-round voice information. If the association does not meet the preset condition, the current-round and historical-round voice information are unrelated or weakly correlated, so the current-round voice information is understood separately.
  • For example, with "Q3" as the current-round voice information, it is determined through at least two preset rules that "Q3" is not related to "Q2", so "Q3" is understood anew.
  • S104 Analyze the current round of speech information according to the semantic understanding data of the historical round of speech information to obtain the semantic understanding data of the current round of speech information.
  • After determining that the current-round voice information and the historical-round voice information are related, the current-round voice information can be analyzed according to the semantic understanding data of the historical-round voice information to obtain the semantic understanding data of the current-round voice information.
  • Semantic understanding data is the data generated when understanding speech information.
  • When performing semantic understanding, domain classification is generally done first, and the domain may be defined as, for example, a general domain or a cooking domain; intent analysis is then performed, using the intent trees of the different domains to determine the intent; after the intent is determined, the semantic slots corresponding to that intent are determined.
  • Correspondingly, semantic understanding data generally includes domain data, intent data, and semantic slot data.
  • the semantic understanding data of the historical round of speech information on which it is based may be the semantic understanding data of the previous round of speech information, or the semantic understanding data of the speech information of the previous rounds.
  • S105 Clear the semantic understanding data of the historical round of speech information, and analyze the current round of speech information to obtain the semantic understanding data of the current round of speech information.
  • After it is determined that the current-round voice information and the historical-round voice information are not related, i.e. the current-round voice information belongs to a different dialogue domain and requires new semantic understanding, the semantic understanding data of the historical-round voice information is cleared and the current-round voice information is analyzed; that is, the domain information of the current-round voice information is re-determined, and semantic understanding of the current-round voice information, including intent understanding and semantic slot filling, is performed based on that domain information.
  • In this embodiment, at least two preset rules are used to analyze the current-round voice information, so the association between the current-round voice information and the historical-round voice information is accurately judged from multiple dimensions; for different associations it is then determined whether the semantic understanding data of the historical-round voice information is used to analyze the current-round voice information, so that the current-round voice information is understood accurately, making the whole multi-round interaction process more intelligent and better suited to natural human dialogue.
  • FIG. 2 is a schematic flowchart of another embodiment of a multi-round interactive semantic understanding method of the present application.
  • the semantic understanding method of this embodiment includes the following steps.
  • This step S201 is similar to the above step S101, and the details are not repeated here.
  • In this embodiment, analyzing the current-round voice information according to at least two preset rules is mainly a process of sequential analysis and judgment: the preset rule with the highest priority is used first; if it determines that the association does not meet the preset condition, the preset rule with the second priority is used; if it is judged that the association meets the preset condition, the judgment ends. The analysis and judgment proceed in order of priority of the at least two preset rules from high to low, until this sequential process ends.
  • S202 Analyze the current round of voice information by using preset rules to obtain the association of the corresponding preset rules.
  • In this embodiment, step S202 is repeated multiple times, and each execution uses only a single preset rule to analyze the current-round voice information, so as to obtain the association corresponding to that preset rule. The setting of the preset rules and the analysis of the voice information, i.e. the determination of the association, are similar to step S102 and are not repeated here.
  • the priority of at least two preset rules in this embodiment may also be set according to the dialog characteristics of the applied field.
  • S203 Determine whether the associated condition meets the preset condition.
  • After a single preset rule is used in step S202 to analyze the current-round voice information and the association is determined, it is judged whether that association meets the preset condition. If it does, step S206 is performed, i.e. semantic understanding is carried out directly, and the subsequent preset rules need not be used for analysis and judgment; if the association does not meet the preset condition, step S204 is performed.
  • In step S202, the current-round voice information is analyzed with a single preset rule to determine the association; generally, it is determined whether the current-round voice information and the historical-round voice information are mutually related.
  • In step S203, judging whether the association meets the preset condition means judging whether the association is one of mutual relatedness; an association of mutual relatedness corresponds to the association meeting the preset condition, and an association of non-relatedness corresponds to the association not meeting the preset condition.
  • S204 Determine whether the preset rule is the preset rule with the lowest priority.
  • When the preset rule used in step S202 determines that the association does not meet the preset condition, it is judged whether that preset rule is the one with the lowest priority. If it is not, step S202 is re-executed with the preset rule of the next priority; if it is the lowest-priority preset rule, the association is determined not to meet the preset condition, and step S205 is performed.
  • S205 Clear the semantic understanding data of the historical round of speech information, and analyze the current round of speech information to obtain the semantic understanding data of the current round of speech information.
  • S206 Analyze the current round of speech information according to the semantic understanding data of the historical round of speech information to obtain the semantic understanding data of the current round of speech information.
  • In this embodiment, the preset rules are used in order of priority from high to low to analyze the current-round voice information, thereby determining the association. Further, the priorities of the preset rules may also be determined according to the dialogue features of the application domain; that is, the preset rule most relevant to the dialogue features is preferentially used to analyze the current-round voice information, so that semantic understanding of dialogue in that application domain can be performed more accurately.
  • FIG. 3 is a schematic flowchart of another embodiment of the semantic understanding method for multiple rounds of interaction in the present application.
  • the semantic understanding method in this embodiment includes the following steps.
  • This step S301 is similar to the above step S101, and the details are not repeated here.
  • At least two preset rules are used to comprehensively analyze the current round of voice information, so as to obtain the correlation between the current round of voice information and the historical round of voice information. Specific steps are as follows.
  • S302 Use at least two preset rules to analyze the current round of voice information to obtain at least two associated scores corresponding to each preset rule.
  • a preset rule is used to analyze the current round of voice information to obtain an associated score.
  • the preset rule is no longer a simple judgment, but involves analysis of metrics. For example, in the interval time correlation rule, it can be analyzed to which interval time period the interval time belongs, and different interval time periods correspond to different correlation scores.
  • S303 Combine at least two association scores and the weight of each association score to calculate the association degree between the current round of voice information and the historical round of voice information.
  • the correlation scores are comprehensively solved to obtain a correlation degree, and the solution uses a combination of correlation scores and weights.
  • the weight of the correlation score is positively related to the priority of the preset rule corresponding to the correlation score, and the priority of the preset rule can also be set according to the dialogue characteristics of the applied field.
  • The more relevant a rule is to the dialogue features, the higher its priority; the higher the priority, the larger the weight of the corresponding association score; thus the calculated degree of association can more accurately reflect natural human dialogue in the application domain.
  • S304 Determine whether the degree of association exceeds the threshold of the degree of association.
  • After the degree of association is obtained, it is judged against the association-degree threshold. If the degree of association exceeds the threshold, corresponding to the association meeting the preset condition, step S305 is performed; if it does not exceed the threshold, corresponding to the association not meeting the preset condition, step S306 is performed.
  • S305 Analyze the current round of speech information according to the semantic understanding data of the historical round of speech information to obtain the semantic understanding data of the current round of speech information.
  • S306 Clear the semantic understanding data of the historical round of speech information, and analyze the current round of speech information to obtain the semantic understanding data of the current round of speech information
  • In this embodiment, each preset rule is used for analysis to obtain a plurality of corresponding association scores, and the degree of association is calculated by combining the association scores with their weights for judgment; the weight of each association score is positively correlated with the priority of its corresponding preset rule, and the priority is determined by the dialogue features of the application domain, so that accurate understanding of dialogue in that domain can be achieved.
  • FIG. 4 is a schematic structural diagram of an embodiment of a multi-round interaction system of the present application.
  • The multi-round interaction system 100 of this embodiment includes a speech recognition module 11, a semantic understanding module 12, a dialogue management module 13, a language generation module 14, a voice broadcast module 15, and a command execution module 16.
  • The speech recognition module (ASR) 11 converts voice information into text information, and the text information is transmitted to the semantic understanding module (NLU) 12 for understanding; when multiple rounds of dialogue occur, the dialogue management module (DM) 13 is used to determine the association between the current-round voice information and the historical-round voice information.
  • ASR: speech recognition module; NLU: semantic understanding module; DM: dialogue management module.
  • Here the dialogue management module 13 uses the above method to determine the association relationship; the semantic understanding module (NLU) 12 then performs understanding to determine the semantic understanding data of the current-round voice information, and this semantic understanding data is used to determine the reply content or the execution instruction. For reply content, the language generation module (NLG) 14 and the voice broadcast module (TTS) 15 realize the spoken reply; execution instructions are executed by the command execution module 16.
  • the multi-round interaction system of this embodiment can accurately understand the user's language and achieve a high fluency of human-machine voice interaction.
  • FIG. 5 is a schematic structural diagram of an embodiment of a multi-round interactive semantic understanding device of the present application.
  • the semantic understanding device 200 of this embodiment includes a processor 21 and a memory 22. Among them, a computer program is stored in the memory 22, and the processor 21 is used to execute the computer program to implement the semantic understanding method of the above-mentioned multiple rounds of interaction.
  • the processor 21 is used to obtain the current round of voice information; analyze the current round of voice information according to at least two preset rules preset in the memory 22 to determine the association between the current round of voice information and the historical round of voice information; Judging whether the association condition meets the preset condition; in response to the judgment result of the association condition meeting the preset condition, analyzing the current round of speech information according to the semantic understanding data of the historical round of speech information in the memory 22 to obtain the current round of speech information Semantic understanding data.
  • the processor 21 may be an integrated circuit chip, which has signal processing capability.
  • The processor 21 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • DSP: digital signal processor; ASIC: application-specific integrated circuit; FPGA: field-programmable gate array.
  • the general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the semantic understanding device 200 may be a smart home appliance that implements smart dialogue in home life.
  • the preset rules in the corresponding home appliance are determined according to the dialogue characteristics of the home domain to which it is applied.
  • the semantic understanding device 200 may also be a server, and the smart home appliance is connected to the server, and combines the functions of the server to realize multiple rounds of voice interaction.
  • FIG. 6 is a schematic structural diagram of an embodiment of the computer storage medium of this application. The computer storage medium 300 of this embodiment includes a computer program 31, which can be executed to implement the method in the above embodiments.
  • The computer storage medium 300 may be a medium that can store program instructions, such as a USB flash drive, a portable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk. Alternatively, it may be a server that stores the program instructions; the server may send the stored program instructions to other devices for execution, or may execute them itself.
  • the disclosed method and device may be implemented in other ways.
  • The device embodiments described above are merely illustrative.
  • The division of modules or units is only a division of logical functions; in actual implementation there may be other divisions, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical, or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place or may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or software function unit.
  • the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of the present application, in essence, the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods of the embodiments of the present application.
  • The aforementioned storage media include: a USB flash drive, a portable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

A semantic understanding method for multi-round interaction, a semantic understanding device, and a computer storage medium. The semantic understanding method includes: obtaining current-round voice information (S101); analyzing the current-round voice information according to at least two preset rules to determine the association between the current-round voice information and historical-round voice information (S102); judging whether the association meets a preset condition (S103); and, in response to a judgment result that the association meets the preset condition, analyzing the current-round voice information according to the semantic understanding data of the historical-round voice information to obtain semantic understanding data of the current-round voice information (S104). The method can accurately judge whether successive rounds in a multi-round interaction are related, so as to achieve accurate understanding of speech.

Description

Semantic understanding method and device for multi-round interaction, and computer storage medium
This application claims priority to Chinese patent application No. 201811572179.2, filed on December 21, 2018 and entitled "Semantic understanding method and device for multi-round interaction, and computer storage medium", which is incorporated herein by reference in its entirety.
[Technical Field]
The present application relates to the field of semantic understanding, and in particular to a semantic understanding method for multi-round interaction, a semantic understanding device, and a computer storage medium.
[Background]
Nowadays, with the intelligent development of facilities for daily life, such facilities can, like intelligent robots, interact with humans, making daily activities smarter. Human-machine interaction involves voice dialogue, and in the course of a dialogue semantic understanding becomes a very important link.
Especially in the case of multi-round interaction, how to achieve accurate understanding in more colloquial dialogue has become a current research difficulty.
[Summary]
The present application provides a semantic understanding method for multi-round interaction, a semantic understanding device, and a computer storage medium, to solve the problem that the prior art cannot accurately understand speech in multi-round voice interaction.
To solve the above technical problem, the present application provides a semantic understanding method for multi-round interaction, which includes: obtaining current-round voice information; analyzing the current-round voice information according to at least two preset rules to determine the association between the current-round voice information and historical-round voice information; judging whether the association meets a preset condition; and, in response to a judgment result that the association meets the preset condition, analyzing the current-round voice information according to the semantic understanding data of the historical-round voice information to obtain semantic understanding data of the current-round voice information.
To solve the above technical problem, the present application provides a semantic understanding device for multi-round interaction, which includes a processor and a memory; a computer program is stored in the memory, and the processor is configured to execute the computer program to implement the above semantic understanding method.
To solve the above technical problem, the present application provides a computer storage medium in which a computer program is stored; when the computer program is executed, the above semantic understanding method is implemented.
In the semantic understanding method for multi-round interaction of the present application, semantic understanding is performed for each round. When performing semantic understanding of the current-round voice information, the present application first analyzes the current-round voice information through at least two preset rules, which makes it possible to determine the association between the current-round voice information and historical-round voice information more accurately; then, when the association meets the preset condition, the current-round voice information is analyzed according to the semantic understanding data of the historical-round voice information, so that the semantic understanding data of the current-round voice information can be obtained accurately.
[Brief Description of the Drawings]
FIG. 1 is a schematic flowchart of an embodiment of the semantic understanding method for multi-round interaction of the present application;
FIG. 2 is a schematic flowchart of another embodiment of the semantic understanding method for multi-round interaction of the present application;
FIG. 3 is a schematic flowchart of yet another embodiment of the semantic understanding method for multi-round interaction of the present application;
FIG. 4 is a schematic structural diagram of an embodiment of the multi-round interaction system of the present application;
FIG. 5 is a schematic structural diagram of an embodiment of the semantic understanding device for multi-round interaction of the present application;
FIG. 6 is a schematic structural diagram of an embodiment of the computer storage medium of the present application.
[Detailed Description]
To enable those skilled in the art to better understand the technical solution of the present invention, the semantic understanding method for multi-round interaction, the semantic understanding device, and the computer storage medium provided by the present application are described in further detail below with reference to the accompanying drawings and specific embodiments.
The semantic understanding method for multi-round interaction of the present application belongs to the field of natural language processing, which studies communication between humans and computers through natural language. This application studies task-driven multi-round interaction, in which the intent of the user's speech is clarified through multiple rounds of interaction so that the user's intent can be responded to or executed. During multi-round interaction the user may continue the previous topic or change the topic, and with smart devices for daily life the user interacts in colloquial speech; for this situation an accurate judgment mechanism is needed to accurately determine the correlation between the user's current-round speech and historical-round speech, so that semantic understanding can be realized more intelligently.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of an embodiment of the semantic understanding method for multi-round interaction of the present application. The semantic understanding method of this embodiment includes the following steps.
S101: Obtain current-round voice information.
A computer performing semantic understanding can obtain the current-round voice information through its own voice sensor, such as a microphone, or it can communicate with other devices and obtain the current-round voice information through their voice sensors. In addition, the computer generally obtains only the current voice information without determining round information; however, to facilitate describing the relationship between the current round and historical rounds in multi-round interaction, the concept of the current round is introduced in this application.
Since subsequent steps analyze the relationship between the current-round voice information and historical-round voice information, the current round here is at least the second round of the multi-round interaction; the first round of voice information can be understood semantically directly.
S102: Analyze the current-round voice information according to at least two preset rules to determine the association between the current-round voice information and historical-round voice information.
In multi-round interaction, this embodiment uses at least two preset rules to analyze the current-round voice information, thereby determining the association between the current-round voice information and historical-round voice information. Using at least two preset rules, that is, using more dimensions to analyze the current-round voice information, makes this semantic understanding method better suited to human voice interaction with multiple dialogue characteristics, so it can more accurately judge whether the current speech and the previous speech are related, and in turn whether they belong to the same topic and constitute multiple rounds of interaction on the same topic.
The preset rules relate to features of human dialogue, and may therefore be rules related to demonstrative pronouns, information completeness, grammatical accuracy, or interval time.
The demonstrative-pronoun rule is: whether demonstrative pronouns such as "this" and "that" appear. The corresponding analysis of the current-round voice information may include: when a demonstrative pronoun appears in the current-round voice information, it indicates relatedness to the historical-round voice information.
The information-completeness rule is: whether the semantic slots in semantic understanding can be filled completely. The corresponding analysis may include: when the information in the current-round voice information is incomplete, it indicates relatedness to the historical-round voice information.
The grammatical-accuracy rule is: whether the grammar is accurate, or its degree of accuracy. The corresponding analysis may include: when the grammar of the current-round voice information is inaccurate, it indicates relatedness to the historical-round voice information.
The interval-time rule is: whether the time interval between the current-round voice information and the previous-round voice information exceeds a threshold. The corresponding analysis may include: when the time interval does not exceed the threshold, it indicates relatedness to the historical-round voice information.
Besides those mentioned above, preset rules may also be set according to dialogue features and are not limited here. Further, in this embodiment, the at least two preset rules used to analyze the current-round voice information are all associated with the dialogue features of a preset application domain; that is, the relevant preset rules can be set according to the dialogue features of the preset application domain. The preset application domain is the domain in which the semantic understanding method of this embodiment is applied. For example, when applied to the daily-life domain, the dialogue features of this domain are more colloquial and often use demonstrative pronouns or omit information, so the demonstrative-pronoun rule and the information-completeness rule are preferentially used for analysis; when applied to the work domain, the dialogue features are more rigorous, the grammar is usually more accurate, and the dialogue is more real-time, so the grammatical-accuracy rule and the interval-time rule are preferentially used for analysis.
This embodiment not only uses at least two preset rules for analysis, achieving accurate multi-dimensional analysis, but also sets different preset rules for the smart device's application domain, so as to better fit dialogue in that domain and achieve more accurate semantic analysis.
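For illustration, the four preset rules described above can be read as simple boolean checks on the current-round utterance. The following is a minimal Python sketch; the word list, the slot requirements, the grammar-score threshold, and the 30-second interval threshold are all illustrative assumptions rather than values from the patent.

```python
import re

# Illustrative resources; a real system would use domain-specific data.
DEMONSTRATIVE_PRONOUNS = {"this", "that", "these", "those"}
REQUIRED_SLOTS = {"weather": {"city", "date"}, "recipe": {"dish"}}
INTERVAL_THRESHOLD_SECONDS = 30.0


def demonstrative_pronoun_rule(utterance: str) -> bool:
    """Related to history if a demonstrative pronoun appears in the utterance."""
    tokens = re.findall(r"[a-z]+", utterance.lower())
    return any(tok in DEMONSTRATIVE_PRONOUNS for tok in tokens)


def information_completeness_rule(filled_slots: set, domain: str) -> bool:
    """Related to history if the utterance alone cannot fill every required slot."""
    required = REQUIRED_SLOTS.get(domain, set())
    return not required.issubset(filled_slots)


def grammatical_accuracy_rule(grammar_score: float, threshold: float = 0.6) -> bool:
    """Related to history if the utterance is grammatically fragmentary."""
    return grammar_score < threshold


def interval_time_rule(seconds_since_last_turn: float) -> bool:
    """Related to history if the pause since the previous round is short enough."""
    return seconds_since_last_turn <= INTERVAL_THRESHOLD_SECONDS


if __name__ == "__main__":
    print(demonstrative_pronoun_rule("How do I make this?"))    # True
    print(information_completeness_rule({"city"}, "weather"))   # True, the date is missing
    print(interval_time_rule(5.0))                               # True
```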
S103: Judge whether the association meets a preset condition.
After the association between the current-round and historical-round voice information is determined in the above steps, different semantic understanding approaches can be applied to different associations. Therefore, in step S103, it is judged whether the association meets the preset condition; if it does, step S104 is performed; if it does not, step S105 is performed.
The association meeting the preset condition means that the current-round voice information and the historical-round voice information are related or highly correlated, so understanding of the current-round voice information can continue from the understanding of the historical-round voice information. If the association does not meet the preset condition, the current-round and historical-round voice information are unrelated or weakly correlated, so the current-round voice information is understood separately anew.
The relation between current-round and historical-round voice information can be understood from the following example dialogue.
Q1: What is the weather like in Shenzhen today?
A1: The weather is fine.
Q2: What about Harbin?
A2: Heavy snow.
Q3: Recommend me a Sichuan dish?
A3: Yuxiang shredded pork.
Q4: How do I make this?
A4: ... (provides the recipe for yuxiang shredded pork)
In the above dialogue, with "Q2" as the current-round voice information, the information-completeness rule can determine that "Q2" is related to "Q1"; therefore, based on the semantic understanding of "Q1", "Q2" can be understood as asking about the weather, so "A2" also answers with weather information.
With "Q3" as the current-round voice information, all of the at least two preset rules determine that "Q3" is not related to "Q2", so "Q3" is understood anew.
With "Q4" as the current-round voice information, the demonstrative-pronoun rule can determine that "Q4" is related to "Q3"; therefore, based on the semantic understanding of "Q3", "Q4" can be understood as asking how to make yuxiang shredded pork, so "A4" provides the recipe for yuxiang shredded pork.
S104: Analyze the current-round voice information according to the semantic understanding data of the historical-round voice information to obtain semantic understanding data of the current-round voice information.
After determining that the current-round voice information and the historical-round voice information are related, the current-round voice information can be analyzed according to the semantic understanding data of the historical-round voice information to obtain the semantic understanding data of the current-round voice information.
Semantic understanding data is the data generated when understanding voice information. When performing semantic understanding, domain classification is generally done first, and the domain may be defined as, for example, a general domain or a cooking domain; intent analysis is then performed, using the intent trees of the different domains to determine the intent; after the intent is determined, the semantic slots corresponding to that intent are determined. Correspondingly, semantic understanding data generally includes domain data, intent data, and semantic slot data. When understanding the current-round voice information, missing information can be filled directly with the semantic understanding data of historical rounds, without having to poll the user again to confirm it, making the conversation flow more smoothly and the user experience better.
The semantic understanding data of the historical-round voice information relied upon may be the semantic understanding data of the previous round, or that of several previous rounds.
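To illustrate how semantic understanding data (domain, intent, and semantic slots) from a historical round could fill in what the current round omits, here is a small sketch; the SemanticFrame structure and the merge policy (current-round values override historical ones) are assumptions for illustration, not the patent's specification.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SemanticFrame:
    """Semantic understanding data: domain, intent, and semantic slots."""
    domain: Optional[str] = None
    intent: Optional[str] = None
    slots: Dict[str, str] = field(default_factory=dict)


def merge_with_history(current: SemanticFrame, history: SemanticFrame) -> SemanticFrame:
    """Fill whatever the current round left unsaid from the historical frame."""
    merged_slots = dict(history.slots)
    merged_slots.update(current.slots)  # the current round overrides history
    return SemanticFrame(
        domain=current.domain or history.domain,
        intent=current.intent or history.intent,
        slots=merged_slots,
    )


# Mirrors the example dialogue above: "What about Harbin?" only supplies a city,
# so the domain, intent, and date are inherited from the previous weather query.
q1 = SemanticFrame(domain="weather", intent="query_weather",
                   slots={"city": "Shenzhen", "date": "today"})
q2 = SemanticFrame(slots={"city": "Harbin"})
print(merge_with_history(q2, q1))
```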
S105: Clear the semantic understanding data of the historical-round voice information, and analyze the current-round voice information to obtain semantic understanding data of the current-round voice information.
After it is determined that the current-round voice information and the historical-round voice information are not related, i.e. the current-round voice information belongs to a different dialogue domain, is unrelated to the historical rounds, and requires new semantic understanding, the semantic understanding data of the historical-round voice information is cleared and the current-round voice information is analyzed; that is, the domain information of the current-round voice information is re-determined, and semantic understanding of the current-round voice information, including intent understanding and semantic slot filling, is performed based on that domain information.
In this step, the semantic understanding data of the historical-round voice information is cleared so that it does not affect the understanding of the current-round voice information.
In this embodiment, at least two preset rules are used to analyze the current-round voice information, so that the association between the current-round and historical-round voice information is accurately judged from multiple dimensions; for different associations it is then determined whether to use the semantic understanding data of the historical-round voice information to analyze the current-round voice information, so as to understand the current-round voice information accurately, making the whole multi-round interaction process more intelligent and better suited to natural human dialogue.
Based on the embodiment shown in FIG. 1, the at least two preset rules can be applied in a number of ways, for example as in the following embodiments.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of another embodiment of the semantic understanding method for multi-round interaction of the present application. The semantic understanding method of this embodiment includes the following steps.
S201: Obtain current-round voice information.
This step S201 is similar to step S101 above and is not described in detail again.
In this embodiment, analyzing the current-round voice information according to at least two preset rules is mainly a process of sequential analysis and judgment: the preset rule with the first priority is used first; if it determines that the association does not meet the preset condition, the preset rule with the second priority is used; if it is judged that the association meets the preset condition, the judgment ends. The analysis and judgment proceed in order of priority of the at least two preset rules from high to low, until this sequential process ends. The specific steps are as follows.
S202: Analyze the current-round voice information using a preset rule to obtain the association corresponding to that preset rule.
In this embodiment, step S202 is repeated multiple times, and each execution uses only a single preset rule to analyze the current-round voice information, so as to obtain the association corresponding to that preset rule. The setting of the preset rules and the analysis of the voice information, i.e. the determination of the association, are similar to step S102 and are not described in detail again. It should be noted that, as in step S102, the priorities of the at least two preset rules in this embodiment may also be set according to the dialogue features of the application domain.
S203: Judge whether the association meets the preset condition.
After a single preset rule is used in step S202 to analyze the current-round voice information and the association is determined, it is judged whether that association meets the preset condition. If it does, step S206 is performed, i.e. semantic understanding is carried out directly and the subsequent preset rules need not be used for analysis and judgment; if it does not, step S204 is performed.
In step S202, the current-round voice information is analyzed with a single preset rule to determine the association; generally, it is determined whether the current-round voice information and the historical-round voice information are mutually related. In step S203, judging whether the association meets the preset condition means judging whether the association is one of mutual relatedness; mutual relatedness corresponds to the association meeting the preset condition, and non-relatedness corresponds to the association not meeting the preset condition.
S204: Judge whether the preset rule is the preset rule with the lowest priority.
When the preset rule used determines that the association does not meet the preset condition, it is judged whether that preset rule is the one with the lowest priority. If it is not, step S202 is re-executed with the preset rule of the next priority; if it is the lowest-priority preset rule, the association is determined not to meet the preset condition, and step S205 is performed.
S205: Clear the semantic understanding data of the historical-round voice information, and analyze the current-round voice information to obtain semantic understanding data of the current-round voice information.
S206: Analyze the current-round voice information according to the semantic understanding data of the historical-round voice information to obtain semantic understanding data of the current-round voice information.
Steps S205-S206 above are similar to steps S104-S105 and are not described in detail again.
In this embodiment, the preset rules are used in order of priority from high to low to analyze the current-round voice information, thereby determining the association. Further, the priorities of the preset rules may also be determined according to the dialogue features of the application domain; that is, the preset rule most relevant to the dialogue features is preferentially used to analyze the current-round voice information, so that semantic understanding of dialogue in that application domain can be performed more accurately.
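The sequential, priority-ordered evaluation of steps S202-S204 could be sketched as follows; the rule list, its priority order, and the toy heuristics inside the lambdas are illustrative assumptions for a daily-life domain, not the patent's required configuration.

```python
from typing import Callable, Dict, List, Tuple

# (rule name, predicate) pairs, ordered from highest to lowest priority.
PrioritisedRule = Tuple[str, Callable[[Dict], bool]]


def related_by_sequential_rules(turn: Dict, rules: List[PrioritisedRule]) -> bool:
    """Apply rules from highest to lowest priority (steps S202-S204).

    Stop as soon as one rule reports relatedness (S203 -> S206); if even the
    lowest-priority rule fails, treat the association as not meeting the
    preset condition (S204 -> S205).
    """
    for name, rule in rules:
        if rule(turn):
            return True
    return False


# Illustrative usage with toy rules for a colloquial, daily-life domain.
rules: List[PrioritisedRule] = [
    ("demonstrative_pronoun", lambda t: bool({"this", "that"} & set(t["text"].lower().split()))),
    ("information_completeness", lambda t: not t.get("slots_complete", True)),
    ("interval_time", lambda t: t.get("gap_seconds", 0.0) <= 30.0),
]
print(related_by_sequential_rules({"text": "how do i make that", "gap_seconds": 5.0}, rules))  # True
```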
Referring to FIG. 3, FIG. 3 is a schematic flowchart of yet another embodiment of the semantic understanding method for multi-round interaction of the present application. The semantic understanding method of this embodiment includes the following steps.
S301: Obtain current-round voice information.
This step S301 is similar to step S101 above and is not described in detail again.
In this embodiment, at least two preset rules are used to comprehensively analyze the current-round voice information, so as to obtain the degree of association between the current-round voice information and historical-round voice information. The specific steps are as follows.
S302: Analyze the current-round voice information using the at least two preset rules respectively, to obtain at least two association scores corresponding to the respective preset rules.
In step S302, analyzing the current-round voice information with a preset rule yields an association score; the preset rule is no longer a simple yes/no judgment but involves a measured analysis. For example, for the interval-time rule, it can be analyzed which interval-time range the interval falls into, and different ranges correspond to different association scores.
S303: Combine the at least two association scores and the weight of each association score to calculate the degree of association between the current-round voice information and historical-round voice information.
After the at least two association scores corresponding to the respective preset rules are obtained, the scores are combined to obtain a degree of association, and the calculation combines the association scores with their weights. The weight of an association score is positively correlated with the priority of the preset rule corresponding to that score, and the priority of a preset rule may also be set according to the dialogue features of the application domain.
The more relevant a rule is to the dialogue features, the higher its priority; the higher the priority, the larger the weight of the corresponding association score; thus the calculated degree of association more accurately reflects natural human dialogue in the application domain.
S304: Judge whether the degree of association exceeds an association-degree threshold.
After the degree of association is obtained, it is judged against the association-degree threshold. If the degree of association exceeds the threshold, corresponding to the association meeting the preset condition, step S305 is performed; if it does not exceed the threshold, corresponding to the association not meeting the preset condition, step S306 is performed.
S305: Analyze the current-round voice information according to the semantic understanding data of the historical-round voice information to obtain semantic understanding data of the current-round voice information.
S306: Clear the semantic understanding data of the historical-round voice information, and analyze the current-round voice information to obtain semantic understanding data of the current-round voice information.
Steps S305-S306 above are similar to steps S104-S105 and are not described in detail again.
In this embodiment, each preset rule is used for analysis to obtain a plurality of corresponding association scores, and the degree of association is calculated by combining the association scores with their weights for judgment; the weight of each association score is positively correlated with the priority of its corresponding preset rule, and the priority is determined by the dialogue features of the application domain, so that accurate understanding of dialogue in that domain can be achieved.
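One plausible way to combine per-rule association scores with priority-derived weights (steps S302-S304) is a normalized weighted average compared against a threshold, as sketched below; the averaging formula, the example weights, and the 0.6 threshold are assumptions, since the patent only states that scores and weights are combined and the result is compared with an association-degree threshold.

```python
from typing import Dict


def association_degree(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted combination of per-rule association scores (step S303)."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        return 0.0
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Illustrative values: larger weights mirror higher rule priority in a daily-life domain.
scores = {"demonstrative_pronoun": 1.0, "information_completeness": 0.8, "interval_time": 0.4}
weights = {"demonstrative_pronoun": 0.4, "information_completeness": 0.3, "interval_time": 0.3}

degree = association_degree(scores, weights)
ASSOCIATION_THRESHOLD = 0.6  # assumed threshold for step S304
print(degree, degree > ASSOCIATION_THRESHOLD)  # 0.76 True -> continue with history (S305)
```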
Based on the semantic understanding method for multi-round interaction of the present application, a multi-round interaction system can be constructed. Referring to FIG. 4, FIG. 4 is a schematic structural diagram of an embodiment of the multi-round interaction system of the present application. The multi-round interaction system 100 of this embodiment includes a speech recognition module 11, a semantic understanding module 12, a dialogue management module 13, a language generation module 14, a voice broadcast module 15, and a command execution module 16.
The speech recognition module (ASR) 11 converts voice information into text information, which is transmitted to the semantic understanding module (NLU) 12 for understanding. When multiple rounds of dialogue occur, the dialogue management module (DM) 13 determines the association between the current-round voice information and historical-round voice information, using the above method. The semantic understanding module (NLU) 12 then performs understanding to determine the semantic understanding data of the current-round voice information; the dialogue management module (DM) 13 uses this semantic understanding data to determine the reply content or the execution instruction. For reply content, the language generation module (NLG) 14 and the voice broadcast module (TTS) 15 realize the spoken reply; execution instructions are executed by the command execution module 16.
The multi-round interaction system of this embodiment can accurately understand the user's language and achieve highly fluent human-machine voice interaction.
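A toy end-to-end skeleton of the FIG. 4 pipeline (ASR, NLU, DM, then NLG/TTS or command execution) might look as follows; every module here is a stub, and the single-substring relatedness check merely stands in for the multi-rule judgment described above.

```python
class MultiRoundSystem:
    """Stub pipeline: ASR -> NLU -> DM -> reply generation or command execution."""

    def __init__(self):
        self.history_frame = None  # semantic understanding data of earlier rounds

    def asr(self, audio: str) -> str:           # speech recognition module 11
        return audio                             # stub: treat the input as already transcribed

    def nlu(self, text: str, history) -> dict:  # semantic understanding module 12
        return {"text": text, "inherits_history": history is not None}

    def dm_related(self, text: str) -> bool:    # dialogue management module 13
        return "this" in text.lower()            # stand-in for the multi-rule check

    def respond(self, audio: str) -> str:
        text = self.asr(audio)
        history = self.history_frame if self.dm_related(text) else None
        frame = self.nlu(text, history)          # understand, optionally using history
        self.history_frame = frame
        # The DM would decide between a spoken reply (NLG 14 + TTS 15) and a
        # device command (command execution module 16); here we always reply in text.
        return f"reply for: {frame['text']} (uses history: {frame['inherits_history']})"


system = MultiRoundSystem()
print(system.respond("Recommend a Sichuan dish"))
print(system.respond("How do I make this?"))
```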
The above method, applied in a hardware device, can realize multi-round voice interaction. Referring to FIG. 5, FIG. 5 is a schematic structural diagram of an embodiment of the semantic understanding device for multi-round interaction of the present application. The semantic understanding device 200 of this embodiment includes a processor 21 and a memory 22; a computer program is stored in the memory 22, and the processor 21 is configured to execute the computer program to implement the above semantic understanding method for multi-round interaction.
The processor 21 is configured to obtain current-round voice information; analyze the current-round voice information according to at least two preset rules preset in the memory 22 to determine the association between the current-round voice information and historical-round voice information; judge whether the association meets a preset condition; and, in response to a judgment result that the association meets the preset condition, analyze the current-round voice information according to the semantic understanding data of historical-round voice information in the memory 22 to obtain semantic understanding data of the current-round voice information.
The processor 21 may be an integrated circuit chip with signal processing capability. The processor 21 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The semantic understanding device 200 may be a smart home appliance that realizes intelligent dialogue in home life; the rules preset in such an appliance are determined according to the dialogue features of the home domain in which it is applied. The semantic understanding device 200 may also be a server to which the smart home appliance is connected, realizing multi-round voice interaction in combination with the functions of the server.
The methods of the embodiments shown in FIGS. 1-3 can be presented in the form of a computer program, and the present application proposes a computer storage medium carrying the computer program. Referring to FIG. 6, FIG. 6 is a schematic structural diagram of an embodiment of the computer storage medium of the present application. The computer storage medium 300 of this embodiment includes a computer program 31, which can be executed to implement the methods in the above embodiments.
The computer storage medium 300 of this embodiment may be a medium that can store program instructions, such as a USB flash drive, a portable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk; or it may be a server storing the program instructions, and the server may send the stored program instructions to other devices for execution, or may execute the stored program instructions itself.
In the several embodiments provided in the present application, it should be understood that the disclosed method and device may be implemented in other ways. For example, the device embodiments described above are merely illustrative; for example, the division of modules or units is only a division of logical functions, and in actual implementation there may be other divisions, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, i.e. they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage media include: a USB flash drive, a portable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.
The above is only an embodiment of the present application and does not thereby limit the patent scope of the present application; any equivalent structure or equivalent process transformation made using the content of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included within the patent protection scope of the present application.

Claims (10)

  1. A semantic understanding method for multi-round interaction, characterized in that the method includes:
    obtaining current-round voice information;
    analyzing the current-round voice information according to at least two preset rules to determine an association between the current-round voice information and historical-round voice information;
    judging whether the association meets a preset condition;
    in response to a judgment result that the association meets the preset condition, analyzing the current-round voice information according to semantic understanding data of the historical-round voice information to obtain semantic understanding data of the current-round voice information.
  2. The method according to claim 1, characterized in that each of the at least two preset rules has a priority;
    the analyzing the current-round voice information according to at least two preset rules to determine the association between the current-round voice information and historical-round voice information, and the judging whether the association meets a preset condition, include:
    in order of the priorities from high to low, using each of the preset rules in turn to analyze the current-round voice information to obtain the association, wherein the association corresponds to the preset rule used, and judging whether the association meets the preset condition;
    until it is determined that the association meets the preset condition, or until the current-round voice information has been analyzed using the preset rule with the lowest priority.
  3. The method according to claim 2, characterized in that the judging whether the association meets a preset condition includes:
    judging whether the association is one of mutual relatedness, wherein an association of mutual relatedness corresponds to the association meeting the preset condition, and an association of non-relatedness corresponds to the association not meeting the preset condition.
  4. The method according to claim 1, characterized in that the analyzing the current-round voice information according to at least two preset rules to determine the association between the current-round voice information and historical-round voice information includes:
    analyzing the current-round voice information according to the at least two preset rules to calculate a degree of association between the current-round voice information and the historical-round voice information;
    the judging whether the association meets a preset condition includes:
    judging whether the degree of association exceeds an association-degree threshold, wherein the degree of association exceeding the threshold corresponds to the association meeting the preset condition, and the degree of association not exceeding the threshold corresponds to the association not meeting the preset condition.
  5. The method according to claim 4, characterized in that the analyzing the current-round voice information according to at least two preset rules to calculate the degree of association between the current-round voice information and the historical-round voice information includes:
    analyzing the current-round voice information using the at least two preset rules respectively, to obtain at least two association scores between the current round and the historical rounds corresponding to the respective preset rules;
    calculating the degree of association by combining the at least two association scores and the weight of each association score, wherein the weight of an association score is positively correlated with the priority of the preset rule corresponding to that association score.
  6. The method according to claim 1, characterized in that the at least two preset rules are all associated with dialogue features of a preset application domain.
  7. The method according to claim 1, characterized in that the at least two preset rules include at least two of: a demonstrative-pronoun rule, an information-completeness rule, a grammatical-accuracy rule, and an interval-time rule.
  8. The method according to claim 7, characterized in that the priority of the demonstrative-pronoun rule is higher than the priority of the information-completeness rule, the priority of the information-completeness rule is higher than the priority of the grammatical-accuracy rule, and the priority of the grammatical-accuracy rule is higher than the priority of the interval-time rule.
  9. A semantic understanding device for multi-round interaction, characterized in that the semantic understanding device includes a processor and a memory; a computer program is stored in the memory, and the processor is configured to execute the computer program to implement the steps of the method according to any one of claims 1-8.
  10. A computer storage medium, characterized in that the computer storage medium stores a computer program, and when the computer program is executed, the steps of the method according to any one of claims 1-8 are implemented.
PCT/CN2019/123807 2018-12-21 2019-12-06 多轮交互的语义理解方法、装置及计算机存储介质 WO2020125457A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811572179.2A CN111429895B (zh) 2018-12-21 2018-12-21 多轮交互的语义理解方法、装置及计算机存储介质
CN201811572179.2 2018-12-21

Publications (1)

Publication Number Publication Date
WO2020125457A1 true WO2020125457A1 (zh) 2020-06-25

Family

ID=71101987

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/123807 WO2020125457A1 (zh) 2018-12-21 2019-12-06 多轮交互的语义理解方法、装置及计算机存储介质

Country Status (2)

Country Link
CN (1) CN111429895B (zh)
WO (1) WO2020125457A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112632251A (zh) * 2020-12-24 2021-04-09 北京百度网讯科技有限公司 回复内容的生成方法、装置、设备和存储介质
CN113656562A (zh) * 2020-11-27 2021-11-16 话媒(广州)科技有限公司 一种多轮次人机心理交互方法及装置
CN115022733A (zh) * 2022-06-17 2022-09-06 中国平安人寿保险股份有限公司 摘要视频生成方法、装置、计算机设备及存储介质

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112084768A (zh) * 2020-08-06 2020-12-15 珠海格力电器股份有限公司 一种多轮交互方法、装置及存储介质
CN112256229B (zh) * 2020-09-11 2024-05-14 北京三快在线科技有限公司 人机语音交互方法、装置、电子设备及存储介质
CN112164401B (zh) * 2020-09-18 2022-03-18 广州小鹏汽车科技有限公司 语音交互方法、服务器和计算机可读存储介质
CN112417107A (zh) * 2020-10-22 2021-02-26 联想(北京)有限公司 一种信息处理方法及装置
CN112992137B (zh) * 2021-01-29 2022-12-06 青岛海尔科技有限公司 语音交互方法和装置、存储介质及电子装置
CN113035180A (zh) * 2021-03-22 2021-06-25 建信金融科技有限责任公司 语音输入完整性判断方法、装置、电子设备和存储介质
CN113380241B (zh) * 2021-05-21 2024-03-08 珠海格力电器股份有限公司 语义交互的调整方法、装置、语音设备及存储介质
CN113435196B (zh) * 2021-06-22 2022-07-29 平安科技(深圳)有限公司 意图识别方法、装置、设备及存储介质
CN113555018B (zh) * 2021-07-20 2024-05-28 海信视像科技股份有限公司 语音交互方法及装置
CN113673253B (zh) * 2021-08-13 2023-07-07 珠海格力电器股份有限公司 语义交互方法、装置及电子设备

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105589844A (zh) * 2015-12-18 2016-05-18 北京中科汇联科技股份有限公司 一种用于多轮问答系统中缺失语义补充的方法
CN105704013A (zh) * 2016-03-18 2016-06-22 北京光年无限科技有限公司 基于上下文的话题更新数据处理方法及装置
CN106503156A (zh) * 2016-10-24 2017-03-15 北京百度网讯科技有限公司 基于人工智能的人机交互方法及装置
CN106777013A (zh) * 2016-12-07 2017-05-31 科大讯飞股份有限公司 对话管理方法和装置
CN107785018A (zh) * 2016-08-31 2018-03-09 科大讯飞股份有限公司 多轮交互语义理解方法和装置
CN107799116A (zh) * 2016-08-31 2018-03-13 科大讯飞股份有限公司 多轮交互并行语义理解方法和装置
CN108959412A (zh) * 2018-06-07 2018-12-07 出门问问信息科技有限公司 标注数据的生成方法、装置、设备及存储介质

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013167806A (ja) * 2012-02-16 2013-08-29 Toshiba Corp 情報通知支援装置、情報通知支援方法、および、プログラム
SG10201503834PA (en) * 2014-05-20 2015-12-30 Hootsuite Media Inc Method and system for managing voice calls in association with social media content
CN107665706B (zh) * 2016-07-29 2021-05-04 科大讯飞股份有限公司 快速语音交互方法及系统
CN108037905B (zh) * 2017-11-21 2021-12-21 北京光年无限科技有限公司 一种用于智能机器人的交互输出方法及智能机器人
CN108388638B (zh) * 2018-02-26 2020-09-18 出门问问信息科技有限公司 语义解析方法、装置、设备及存储介质
CN108962261A (zh) * 2018-08-08 2018-12-07 联想(北京)有限公司 信息处理方法、信息处理装置和蓝牙耳机

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105589844A (zh) * 2015-12-18 2016-05-18 北京中科汇联科技股份有限公司 一种用于多轮问答系统中缺失语义补充的方法
CN105704013A (zh) * 2016-03-18 2016-06-22 北京光年无限科技有限公司 基于上下文的话题更新数据处理方法及装置
CN107785018A (zh) * 2016-08-31 2018-03-09 科大讯飞股份有限公司 多轮交互语义理解方法和装置
CN107799116A (zh) * 2016-08-31 2018-03-13 科大讯飞股份有限公司 多轮交互并行语义理解方法和装置
CN106503156A (zh) * 2016-10-24 2017-03-15 北京百度网讯科技有限公司 基于人工智能的人机交互方法及装置
CN106777013A (zh) * 2016-12-07 2017-05-31 科大讯飞股份有限公司 对话管理方法和装置
CN108959412A (zh) * 2018-06-07 2018-12-07 出门问问信息科技有限公司 标注数据的生成方法、装置、设备及存储介质

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113656562A (zh) * 2020-11-27 2021-11-16 话媒(广州)科技有限公司 一种多轮次人机心理交互方法及装置
CN113656562B (zh) * 2020-11-27 2024-07-02 话媒(广州)科技有限公司 一种多轮次人机心理交互方法及装置
CN112632251A (zh) * 2020-12-24 2021-04-09 北京百度网讯科技有限公司 回复内容的生成方法、装置、设备和存储介质
CN112632251B (zh) * 2020-12-24 2023-12-29 北京百度网讯科技有限公司 回复内容的生成方法、装置、设备和存储介质
CN115022733A (zh) * 2022-06-17 2022-09-06 中国平安人寿保险股份有限公司 摘要视频生成方法、装置、计算机设备及存储介质
CN115022733B (zh) * 2022-06-17 2023-09-15 中国平安人寿保险股份有限公司 摘要视频生成方法、装置、计算机设备及存储介质

Also Published As

Publication number Publication date
CN111429895A (zh) 2020-07-17
CN111429895B (zh) 2023-05-05

Similar Documents

Publication Publication Date Title
WO2020125457A1 (zh) 多轮交互的语义理解方法、装置及计算机存储介质
WO2021022992A1 (zh) 对话生成模型的训练方法、对话生成方法、装置及介质
WO2021169615A1 (zh) 基于人工智能的语音响应处理方法、装置、设备及介质
US9583102B2 (en) Method of controlling interactive system, method of controlling server, server, and interactive device
KR20190075800A (ko) 지능형 개인 보조 인터페이스 시스템
CN114207710A (zh) 检测和/或登记热命令以由自动助理触发响应动作
WO2018040501A1 (zh) 基于人工智能的人机交互方法和装置
US20200342854A1 (en) Method and apparatus for voice interaction, intelligent robot and computer readable storage medium
WO2017084334A1 (zh) 一种语种识别方法、装置、设备及计算机存储介质
CN112634897B (zh) 设备唤醒方法、装置和存储介质及电子装置
CN113674746B (zh) 人机交互方法、装置、设备以及存储介质
CN113674742B (zh) 人机交互方法、装置、设备以及存储介质
WO2020135067A1 (zh) 语音交互方法、装置、机器人及计算机可读存储介质
CN108897517B (zh) 一种信息处理方法及电子设备
CN107742516B (zh) 智能识别方法、机器人及计算机可读存储介质
CN117253478A (zh) 一种语音交互方法和相关装置
CN116013257A (zh) 语音识别、语音识别模型训练方法、装置、介质及设备
US12062361B2 (en) Wake word method to prolong the conversational state between human and a machine in edge devices
CN107632813A (zh) 一种关闭闹钟功能的方法及装置
CN115731915A (zh) 对话机器人的主动对话方法、装置、电子设备及存储介质
EP4093005A1 (en) System method and apparatus for combining words and behaviors
CN109002498A (zh) 人机对话方法、装置、设备及存储介质
CN111414760B (zh) 自然语言处理方法及相关设备、系统和存储装置
CN114399992A (zh) 语音指令响应方法、装置及存储介质
TWI639997B (zh) 基於機率規則之對話理解方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19901185

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19901185

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 17.11.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19901185

Country of ref document: EP

Kind code of ref document: A1