CN112148848A - A question and answer processing method and device

Info

Publication number: CN112148848A
Application number: CN202010885459.XA
Authority: CN
Inventors: 李喜莲; 牛嘉斌; 雷欣; 李志飞
Original assignee: Go Out And Ask Suzhou Information Technology Co ltd
Current assignee: Volkswagen China Investment Co Ltd; Mobvoi Innovation Technology Co Ltd
Priority date: 2020-08-28
Filing date: 2020-08-28
Publication date: 2020-12-29
Anticipated expiration: 2040-08-28
Also published as: CN112148848B

Abstract

The invention discloses a question and answer processing method and device, and relates to the technical field of artificial intelligence. One embodiment of the method comprises: acquiring a voice request; processing the voice request to obtain an intent classification corresponding to the voice request; and performing, on the voice request, an answer search operation corresponding to the intent classification, and feeding back the search answer. Voice requests with multiple intents or incomplete intents can thereby be recognized, an active inquiry function is provided, and the intelligence of the dialogue system is improved.

Description

A question and answer processing method and device

Technical Field

The present invention relates to the technical field of artificial intelligence, and in particular to a question and answer processing method and device.

Background

Existing dialogue systems include a speech recognition module, a natural language understanding module, a dialogue management module, a language generation module and a speech generation module. Existing language recognition modules can recognize only single-intent voice requests and cannot recognize multi-intent or incomplete-intent voice requests. For example, when the user's voice request is "I want to listen to", an existing dialogue system answers "Preparing to play 'Friends'", "Playing music for you", or "Could not find this song". For a multi-intent voice request, for example "Play Harry Potter", an existing dialogue system answers "Preparing to play the Harry Potter song", "Playing music for you", or "Could not find this song".

For voice requests with multiple intents or incomplete intents, existing dialogue systems take a simple and crude approach: they assign the multi-intent or incomplete-intent request directly to a single intent with a higher probability, and then return an answer to the voice request according to that single intent. Although this completes one round of dialogue, it does not make full use of the dialogue system's initiative, and it therefore reduces the accuracy of the dialogue system's responses.

Summary of the Invention

In view of this, the embodiments of the present invention provide a question and answer processing method and device that can recognize voice requests with multiple intents or incomplete intents and provide an active inquiry function, which improves the intelligence of the dialogue system.

To achieve the above object, according to a first aspect of the embodiments of the present invention, a question and answer processing method is provided. The method includes: acquiring a voice request; processing the voice request to obtain an intent classification corresponding to the voice request; and performing, on the voice request, an answer search operation corresponding to the intent classification, and feeding back the search answer.

Optionally, processing the voice request to obtain the intent classification corresponding to the voice request includes: performing semantic parsing on the voice request; performing intent recognition on the voice request using a model; determining a voice request whose semantic parsing result is "no scene" and whose intent recognition result is "unclear intent" to be an incomplete intent; and determining a voice request whose semantic parsing result is "has a scene" and whose intent recognition result is "clear intent" to be a complete intent.

Optionally, performing, on the voice request, the answer search operation corresponding to the intent classification includes: if the intent classification is a complete intent, searching for the answer to the voice request in the scene indicated by the semantic parsing result; if the intent classification is an incomplete intent, predicting multiple preselected scenes corresponding to the voice request, and selecting, from the multiple preselected scenes, the preselected scenes that satisfy a first preset condition as candidate scenes; and sending an inquiry request corresponding to the candidate scenes, and searching for the answer to the voice request in the candidate scene indicated by the result of the inquiry request.

Optionally, searching for the answer to the voice request in the scene indicated by the semantic parsing result includes: if the semantic parsing result contains an entity value, searching, in the scene indicated by the semantic parsing result, for answers corresponding to the entity value, and selecting the answer to the voice request from the search answers; if the semantic parsing result contains no entity value, searching for answers corresponding to the scene indicated by the semantic parsing result, and selecting the answer to the voice request from the search answers.

Optionally, if the semantic parsing result contains an entity value, searching, in the scene indicated by the semantic parsing result, for answers corresponding to the entity value includes: if there is one entity value, searching, in the scene indicated by the semantic parsing result, for answers corresponding to that entity value; if there are multiple entity values, determining whether the multiple entity values all satisfy the same semantic slot; if so, searching, in the scene indicated by the semantic parsing result, for answers for each entity value separately, selecting any one answer from the answers corresponding to each entity value, and concatenating the answers selected for the multiple entity values to obtain the search answer corresponding to the entity values; if not, searching, in the scene indicated by the semantic parsing result, for answers that satisfy the multiple entity values simultaneously.

Optionally, the first preset condition refers to preselected scenes whose confidence ranks in the top two and which have search answers.

To achieve the above object, according to a second aspect of the embodiments of the present invention, a question and answer processing device is also provided. The device includes: an acquisition module, configured to acquire a voice request; a processing module, configured to process the voice request to obtain an intent classification corresponding to the voice request; and a search module, configured to perform, on the voice request, an answer search operation corresponding to the intent classification, and to feed back the search answer.

Optionally, the processing module includes: a semantic parsing unit, configured to perform semantic parsing on the voice request; an intent recognition unit, configured to perform intent recognition on the voice request using a model; and a determining unit, configured to determine a voice request whose semantic parsing result is "no scene" and whose intent recognition result is "unclear intent" to be an incomplete intent, and to determine a voice request whose semantic parsing result is "has a scene" and whose intent recognition result is "clear intent" to be a complete intent.

Optionally, the search module includes: a first search unit, configured to search, if the intent classification is a complete intent, for the answer to the voice request in the scene indicated by the semantic parsing result; and a second search unit, configured to, if the intent classification is an incomplete intent, predict multiple preselected scenes corresponding to the voice request, select, from the multiple preselected scenes, the preselected scenes that satisfy the first preset condition as candidate scenes, send an inquiry request corresponding to the candidate scenes, and search for the answer to the voice request in the candidate scene indicated by the result of the inquiry request.

Optionally, the first search unit includes: an entity value unit, configured to, if the semantic parsing result contains an entity value, search, in the scene indicated by the semantic parsing result, for answers corresponding to the entity value, and select the answer to the voice request from the search answers; and a scene unit, configured to, if the semantic parsing result contains no entity value, search for answers corresponding to the scene indicated by the semantic parsing result, and select the answer to the voice request from the search answers.

To achieve the above object, according to a third aspect of the embodiments of the present invention, a computer-readable medium is also provided, on which a computer program is stored. When the program is executed by a processor, the question and answer processing method described in the first aspect is implemented.

In the embodiments of the present invention, the acquired voice request is processed to obtain an intent classification corresponding to the voice request; an answer search operation corresponding to the intent classification is then performed on the voice request, and the search answer is fed back. Voice requests with multiple intents or incomplete intents can thereby be recognized and an active inquiry function is provided, which improves the intelligence of the dialogue system.

Further effects of the above non-conventional optional implementations are described below in conjunction with specific embodiments.

Description of the Drawings

The accompanying drawings are provided for a better understanding of the present invention and do not constitute an improper limitation of it. In the drawings, the same or corresponding reference numerals denote the same or corresponding parts.

FIG. 1 is a flowchart of a question and answer processing method according to an embodiment of the present invention;

FIG. 2 is a flowchart of a question and answer processing method for an incomplete-intent voice request according to an embodiment of the present invention;

FIG. 3 is a flowchart of a question and answer processing method for a complete-intent voice request according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a question and answer processing device according to an embodiment of the present invention;

FIG. 5 is a diagram of an exemplary system architecture to which an embodiment of the present invention may be applied;

FIG. 6 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server according to an embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present invention are described below with reference to the accompanying drawings. Various details of the embodiments are included to facilitate understanding and should be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the invention. Descriptions of well-known functions and structures are likewise omitted from the following description for clarity and conciseness.

FIG. 1 is a flowchart of a question and answer processing method according to an embodiment of the present invention. The method includes at least the following operations:

S101: Acquire a voice request.

For example, the voice request is acquired by means of text input or voice input.

S102: Process the voice request to obtain an intent classification corresponding to the voice request.

For example, semantic parsing is performed on the voice request, and intent recognition is performed on the voice request using a model. A voice request whose semantic parsing result is "no scene" and whose intent recognition result is "unclear intent" is determined to be an incomplete intent, and a voice request whose semantic parsing result is "has a scene" and whose intent recognition result is "clear intent" is determined to be a complete intent. The remaining two outcomes are both abnormal cases, and encountering one usually means that the semantic parsing module or the intent recognition module needs to be optimized: the semantic parsing result is "no scene" while the intent recognition result is "clear intent", or the semantic parsing result is "has a scene" while the intent recognition result is "unclear intent".
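As a non-limiting illustration, the four combinations described above can be sketched as a small decision function. The names `IntentClass` and `classify_intent`, and the convention that a missing scene is represented by `None`, are assumptions made for this sketch and do not appear in the embodiment.

```python
from enum import Enum
from typing import Optional


class IntentClass(Enum):
    COMPLETE = "complete"      # has a scene and the intent is clear
    INCOMPLETE = "incomplete"  # no scene and the intent is unclear
    ABNORMAL = "abnormal"      # the two modules disagree


def classify_intent(scene: Optional[str], intent_is_clear: bool) -> IntentClass:
    """Combine the semantic parsing result (scene or None) with the
    intent recognition result (clear or unclear)."""
    has_scene = scene is not None
    if has_scene and intent_is_clear:
        return IntentClass.COMPLETE
    if not has_scene and not intent_is_clear:
        return IntentClass.INCOMPLETE
    # Mixed results: the text treats these as abnormal cases that call
    # for tuning the parsing or recognition module.
    return IntentClass.ABNORMAL
```

In this sketch the abnormal cases are returned as a separate value so that a caller could log them for later model optimization; how they are actually handled is not specified in the text.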

S103: Perform, on the voice request, an answer search operation corresponding to the intent classification, and feed back the search answer.

For example, the intent classifications include complete intent and incomplete intent. If the intent classification of the voice request is an incomplete intent, multiple candidate answers corresponding to the voice request are searched for, and an inquiry request regarding the candidate answers is sent to determine the answer to the voice request.

If the intent classification of the voice request is a complete intent, the answer to the voice request is searched for in the scene indicated by the semantic parsing result. Further, if the semantic parsing result contains an entity value, answers corresponding to the entity value are searched for in the scene indicated by the semantic parsing result, and the answer to the voice request is selected from the search answers; if the semantic parsing result contains no entity value, answers corresponding to the scene indicated by the semantic parsing result are searched for, and the answer to the voice request is selected from the search answers.

The embodiment of the present invention thus determines the intent classification of the voice request from the semantic parsing result and the intent recognition result, and performs, on the voice request, the answer search operation corresponding to that intent classification, which improves the intelligence of the dialogue system.

FIG. 2 is a flowchart of a question and answer processing method for an incomplete-intent voice request according to an embodiment of the present invention. The method includes at least the following operations:

S201: Acquire a voice request.

S202: Perform semantic parsing and intent recognition on the voice request separately, and obtain an incomplete intent as the intent classification corresponding to the voice request.

For example, the semantic parsing result of the voice request is "no scene", and the intent recognition model's result for the voice request is "unclear intent"; the intent classification corresponding to the voice request is therefore determined to be an incomplete intent.

S203: Predict multiple preselected scenes corresponding to the voice request.

S204: Select, from the multiple preselected scenes, the preselected scenes that satisfy the first preset condition as candidate scenes.

For example, the first preset condition refers to preselected scenes whose confidence ranks in the top two and which have search answers. When the semantic parsing module predicts the multiple preselected scenes corresponding to the voice request, it outputs the confidence of each preselected scene.

Here, the first preset condition is set manually according to the actual application scenario.

S205: Send an inquiry request corresponding to the candidate scenes.

S206: Search for the answer to the voice request in the candidate scene indicated by the result of the inquiry request.

Specifically, suppose the voice request is "I want to listen to". Semantic parsing is performed on the voice request, and the semantic parsing result indicates that the voice request has no scene. Intent recognition is performed on the voice request using the intent recognition model, and the intent recognition result indicates that the voice request has an unclear intent. The voice request is therefore determined to be an incomplete intent.

The semantic parsing module predicts multiple preselected scenes corresponding to the voice request and outputs the confidence of each preselected scene, and the preselected scenes whose confidence ranks in the top two and which have search answers are selected. For example, the preselected scenes sorted in descending order of confidence are the music scene, the poetry scene, the audiobook scene, the crosstalk scene and the opera scene, so the top two preselected scenes are the music scene and the poetry scene. It is then determined whether the voice request has an answer in the music scene or in the poetry scene. If there are search answers only in the music scene, for example "Friends", "Same Song" and "Lover", the music scene is used as the candidate scene for the voice request. If there are search answers in both the music scene and the poetry scene, for example the search answer in the poetry scene is "Quatrain", an inquiry request is sent: "Do you want to listen to music or poetry?" If the user replies "I want to listen to music", one of the search answers in the music scene is selected as the answer to the voice request; for example, "Friends" is selected from the three search answers "Friends", "Same Song" and "Lover" as the answer to "I want to listen to".

Here, the search answer ranked first by popularity may be selected as the answer to the voice request, or a search answer may be selected at random as the answer to the voice request.
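As a non-limiting illustration, the candidate-scene selection of S204 and the final answer selection can be sketched as follows. This sketch reads the first preset condition as "take the two highest-confidence preselected scenes, then keep those that have at least one search answer", which matches the worked example above. The function names, the `top_k` parameter, the dictionary layout and the popularity scores are assumptions made for illustration only.

```python
import random
from typing import Dict, List, Optional, Tuple


def select_candidate_scenes(
    preselected: List[Tuple[str, float]],
    answers_by_scene: Dict[str, List[str]],
    top_k: int = 2,
) -> List[str]:
    """Keep the top_k scenes by confidence that also have search answers."""
    ranked = sorted(preselected, key=lambda item: item[1], reverse=True)
    top_scenes = [scene for scene, _ in ranked[:top_k]]
    return [scene for scene in top_scenes if answers_by_scene.get(scene)]


def pick_answer(
    answers: List[str], popularity: Optional[Dict[str, float]] = None
) -> str:
    """Pick the most popular answer if scores are given, else pick at random."""
    if popularity:
        return max(answers, key=lambda answer: popularity.get(answer, 0.0))
    return random.choice(answers)
```

If `select_candidate_scenes` returns a single scene, no inquiry is needed and the search proceeds in that scene; if it returns two, the dialogue system would ask the user to choose between them before calling `pick_answer` on the chosen scene's answers.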

The embodiment of the present invention can recognize an incomplete intent and provide an active inquiry function for it, which makes the dialogue system proactive and intelligent.

FIG. 3 is a flowchart of a question and answer processing method for a complete-intent voice request according to an embodiment of the present invention. The method includes at least the following operations:

S301: Acquire a voice request.

S302: Perform semantic parsing and intent recognition on the voice request separately, and obtain a complete intent as the intent classification corresponding to the voice request. If the semantic parsing result contains one entity value, perform S303; if it contains multiple entity values, perform S304; if it contains no entity value, perform S307.

S303: Search, in the scene indicated by the semantic parsing result, for answers corresponding to the entity value; then perform S308.

For example, there may be one search answer or multiple search answers.

S304: Determine whether the multiple entity values all satisfy the same semantic slot; if so, perform S305; if not, perform S306.

S305: Search, in the scene indicated by the semantic parsing result, for answers for each entity value separately, and select any one answer from the answers corresponding to each entity value; concatenate the answers selected for the multiple entity values to obtain the search answer corresponding to the entity values; then perform S308.

S306: Search, in the scene indicated by the semantic parsing result, for answers that satisfy the multiple entity values simultaneously; then perform S308.

S307: Search for answers corresponding to the scene indicated by the semantic parsing result; then perform S308.

S308: Select the answer to the voice request from the search answers.

For example, if there is one search answer, that search answer is used as the answer to the voice request. If there are multiple search answers, one search answer that satisfies a preset condition is selected from them as the answer to the voice request.
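As a non-limiting illustration, the branching among S303, S305, S306 and S307 can be sketched as a function of the parsed entities. Representing each entity as a `(slot, value)` pair, the function name `choose_search_step`, and the slot name `"song name"` used in the usage note are assumptions made for this sketch. When more than two entity values are present, the sketch takes the S305 branch only if all of them share one slot.

```python
from typing import List, Tuple

# Each entity is a (semantic slot, entity value) pair; this layout is
# an assumption made for illustration.
Entity = Tuple[str, str]


def choose_search_step(entities: List[Entity]) -> str:
    """Return the label of the search step that S302 or S304 leads to."""
    if not entities:
        return "S307"  # no entity value: search by scene only
    if len(entities) == 1:
        return "S303"  # one entity value: search by that value
    slots = {slot for slot, _ in entities}
    # Multiple entity values: same slot means per-entity search and
    # concatenation (S305); different slots means a joint search (S306).
    return "S305" if len(slots) == 1 else "S306"
```

For instance, a request parsed into `[("song name", "Friends")]` would be routed to S303 in this sketch, and every branch then converges on S308 to select the final answer.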

Specifically, suppose the voice request is "I want to listen to the song 'Friends'". The semantic parsing result indicates that the scene of the voice request is "song" and the entity value is "Friends", and the intent recognition result is a clear intent, so the intent classification of the voice request is determined to be a complete intent. Because the voice request contains one entity value, answers corresponding to "Friends" are searched for in the "song" scene, yielding three search answers: for example, "Friends" sung by Jay Chou, "Friends" sung by Na Ying, and "Friends" sung by Gao Feng. The version of "Friends" ranked first by popularity among the different singers is selected as the answer to the voice request, or a search answer is selected at random as the answer to the voice request.

Suppose the voice request is "I want to listen to music". The semantic parsing result indicates that the scene of the voice request is "music" and that there is no entity value, and the intent recognition result is a clear intent, so the intent classification of the voice request is determined to be a complete intent. Because the voice request contains no entity value, answers are searched for in the "music" scene, yielding three search answers: "Friends", "Same Song" and "Lover". The search answer with the highest playback popularity is selected from the three as the answer to the voice request; for example, "Friends" is selected as the answer to "I want to listen to music".

Suppose the voice request is "Check the weather in Beijing and Shanghai for me". The semantic parsing result indicates that the scene of the voice request is "weather" and that the entity values are "Beijing" and "Shanghai", and the intent recognition result is a clear intent, so the intent classification of the voice request is determined to be a complete intent. The voice request contains two entity values, and the semantic slot corresponding to both is "weather location", so the two entity values satisfy the same semantic slot. Answers are searched for in the "weather" scene for the entity value "Beijing" and the entity value "Shanghai" separately. The answers corresponding to "Beijing" are "sunny, 16-30℃" and "light rain, 25℃", and the answer corresponding to "Shanghai" is "thunderstorms, 21℃". Any one answer is selected from the answers corresponding to each entity value, and the answers selected for the two entity values are concatenated, which yields two groups of answers corresponding to the entity values of the voice request: "Beijing: sunny today, 16-30℃; Shanghai: thunderstorms today, 21℃" and "Beijing: light rain tomorrow, 25℃; Shanghai: thunderstorms today, 21℃". An inquiry request may then be sent regarding the two groups of answers, for example "Which day's weather do you want to know?" If the user replies "I want to know today's weather", "Beijing: sunny today, 16-30℃; Shanghai: thunderstorms today, 21℃" is used as the answer to the voice request and fed back to the user.
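As a non-limiting illustration, the concatenation in S305 can be sketched by choosing one answer per entity value and joining the choices. The sketch enumerates every such combination with `itertools.product`, which is one way to arrive at the two groups of answers in the example above; the function name `combine_answers`, the dictionary layout and the `"entity: answer"` output format are assumptions made for illustration.

```python
from itertools import product
from typing import Dict, List


def combine_answers(answers_by_entity: Dict[str, List[str]]) -> List[str]:
    """Build every concatenation that takes one answer per entity value."""
    per_entity = [
        [f"{entity}: {answer}" for answer in answers]
        for entity, answers in answers_by_entity.items()
    ]
    # product() picks one labelled answer from each entity's list.
    return ["; ".join(combo) for combo in product(*per_entity)]


weather_answers = {
    "Beijing": ["sunny today, 16-30℃", "light rain tomorrow, 25℃"],
    "Shanghai": ["thunderstorms today, 21℃"],
}
groups = combine_answers(weather_answers)
```

With two answers for "Beijing" and one for "Shanghai", `groups` holds two concatenated candidates, and a follow-up inquiry such as the one described above could be used to choose between them.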

Suppose the voice request is "Check the weather in Beijing on February 1 for me". The semantic parsing result indicates that the scene of the voice request is "weather" and that the entity values are "Beijing" and "February 1", and the intent recognition result is a clear intent, so the intent classification of the voice request is determined to be a complete intent. The voice request contains two entity values: the semantic slot corresponding to the entity value "Beijing" is "weather location", and the semantic slot corresponding to the entity value "February 1" is "weather time". Because the two entity values do not satisfy the same semantic slot, answers that satisfy both "weather location" and "weather time" are searched for in the "weather" scene, yielding two answers corresponding to the entity values: "Beijing, February 1: sunny, 16-30℃" and "Beijing, February 1: thunderstorms, 16-30℃". The answer with the highest confidence is selected from the two search answers as the answer to the voice request; for example, the selected answer is "Beijing, February 1: sunny, 16-30℃".
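As a non-limiting illustration, the joint search of S306 followed by the confidence-based selection can be sketched as a filter over candidate records. The record layout, the `confidence` field, its values, and the function name `search_joint` are all assumptions made for this sketch; the text does not specify how answers or their confidences are stored.

```python
from typing import Any, Dict, List, Optional

Record = Dict[str, Any]


def search_joint(records: List[Record], constraints: Dict[str, str]) -> Optional[Record]:
    """Return the highest-confidence record matching every slot constraint."""
    matches = [
        record for record in records
        if all(record.get(slot) == value for slot, value in constraints.items())
    ]
    # default=None covers the case where no record satisfies all slots.
    return max(matches, key=lambda record: record["confidence"], default=None)


# Hypothetical answer records for the "weather" scene.
weather_records = [
    {"weather location": "Beijing", "weather time": "February 1",
     "text": "Beijing, February 1: sunny, 16-30℃", "confidence": 0.8},
    {"weather location": "Beijing", "weather time": "February 1",
     "text": "Beijing, February 1: thunderstorms, 16-30℃", "confidence": 0.2},
    {"weather location": "Shanghai", "weather time": "February 1",
     "text": "Shanghai, February 1: cloudy", "confidence": 0.9},
]
best = search_joint(
    weather_records,
    {"weather location": "Beijing", "weather time": "February 1"},
)
```

The Shanghai record is excluded even though its hypothetical confidence is higher, because it fails the "weather location" constraint; only records satisfying both slots compete on confidence.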

The embodiments of the present invention can thus recognize multi-intent voice requests, which improves the functionality and the intelligence of the dialogue system.

It should be understood that, in the various embodiments of the present invention, the sequence numbers of the above processes do not imply an order of execution. The execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.

FIG. 4 is a schematic diagram of a question and answer processing device according to an embodiment of the present invention. The device 400 includes: an acquisition module 401, configured to acquire a voice request; a processing module 402, configured to process the voice request to obtain an intent classification corresponding to the voice request; and a search module 403, configured to perform, on the voice request, an answer search operation corresponding to the intent classification, and to feed back the search answer.

In an optional embodiment, the processing module 402 includes: a semantic parsing unit, configured to perform semantic parsing on the voice request; an intent recognition unit, configured to perform intent recognition on the voice request using a model; and a determining unit, configured to determine a voice request whose semantic parsing result is "no scene" and whose intent recognition result is "unclear intent" to be an incomplete intent, and to determine a voice request whose semantic parsing result is "has a scene" and whose intent recognition result is "clear intent" to be a complete intent.

In an optional embodiment, the search module 403 includes: a first search unit, configured to, if the intent classification is a complete intent, search for the answer to the voice request in the scene indicated by the semantic parsing result; and a second search unit, configured to, if the intent classification is an incomplete intent, predict a plurality of preselected scenes corresponding to the voice request, select from them the preselected scenes that satisfy a first preset condition as candidate scenes, send a query request corresponding to the candidate scenes, and search for the answer to the voice request in the candidate scene indicated by the result of the query request.

In an optional embodiment, the first search unit includes: an entity value unit, configured to, if the semantic parsing result contains an entity value, search for answers corresponding to the entity value in the scene indicated by the semantic parsing result, and select the answer to the voice request from the search answers; and a scene unit, configured to, if the semantic parsing result contains no entity value, search for answers corresponding to the scene indicated by the semantic parsing result, and select the answer to the voice request from the search answers.

In an optional embodiment, the entity value unit includes: a first entity value unit, configured to, if there is one entity value, search for answers corresponding to that entity value in the scene indicated by the semantic parsing result; and a second entity value unit, configured to, if there are multiple entity values, determine whether the multiple entity values all fill the same semantic slot. If so, the second entity value unit searches for answers for each entity value separately in the scene indicated by the semantic parsing result, selects any one answer from the answers corresponding to each entity value, and splices the selected answers together to obtain the search answer corresponding to the entity values. If not, it searches, in the scene indicated by the semantic parsing result, for answers that satisfy all of the entity values at the same time.
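
The branching performed by the first and second entity value units can be sketched in Python. The function name, the `(slot, value)` tuple representation, and the two search callbacks are hypothetical and are supplied by the caller; "selecting any one answer" is implemented here by taking the first answer, and "splicing" is implemented as collecting the selected answers into one list. Both are assumptions.

```python
from typing import Callable, List, Tuple

Entity = Tuple[str, str]  # (semantic slot, entity value), e.g. ("singer", "Alice")


def search_by_entities(
    scene: str,
    entities: List[Entity],
    search_one: Callable[[str, str], List[str]],        # answers for a single entity value
    search_all: Callable[[str, List[str]], List[str]],  # answers matching every entity value
) -> List[str]:
    if not entities:
        return []
    if len(entities) == 1:
        # One entity value: search directly in the indicated scene.
        return search_one(scene, entities[0][1])
    slots = {slot for slot, _ in entities}
    if len(slots) == 1:
        # All values fill the same slot: search each value separately,
        # pick one answer per value, and splice the picks together.
        picked: List[str] = []
        for _, value in entities:
            answers = search_one(scene, value)
            if answers:
                picked.append(answers[0])
        return picked
    # Values fill different slots: search for answers satisfying all of them.
    return search_all(scene, [value for _, value in entities])
```

With two singers in the same slot, for example, the result contains one item per singer; with a singer and a style in different slots, only items matching both are returned.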

In an optional embodiment, the first preset condition is that the preselected scene ranks in the top two by confidence and has a search answer.
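
The first preset condition can be illustrated with a minimal Python sketch. The `PreselectedScene` type, the `has_answer` callback, and the `top_k` parameter are hypothetical. The sketch assumes one possible reading of the condition, in which the two highest-confidence scenes are taken first and any of them without a search answer is then dropped; the text could also be read as taking the top two among the scenes that have answers.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PreselectedScene:
    name: str
    confidence: float


def select_candidate_scenes(
    scenes: List[PreselectedScene],
    has_answer: Callable[[str], bool],  # True if a search in this scene returns an answer
    top_k: int = 2,
) -> List[PreselectedScene]:
    # Rank the predicted scenes by confidence, highest first.
    ranked = sorted(scenes, key=lambda s: s.confidence, reverse=True)
    # Keep only the top-ranked scenes that actually have a search answer.
    return [s for s in ranked[:top_k] if has_answer(s.name)]
```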

The above apparatus can execute the question and answer processing method provided by the embodiments of the present invention, and has the functional modules and beneficial effects corresponding to that method. For technical details not described in detail in this embodiment, reference may be made to the question and answer processing method provided by the embodiments of the present invention.

FIG. 5 shows an exemplary system architecture to which the embodiments of the present invention may be applied. The system architecture 500 may include terminal devices 501, 502, and 503, a network 504, and a server 505. The network 504 is the medium that provides communication links between the terminal devices 501, 502, 503 and the server 505. The network 504 may include various connection types, such as wired links, wireless communication links, or fiber optic cables.

A user can use the terminal devices 501, 502, 503 to interact with the server 505 through the network 504, for example to receive or send messages. Various communication client applications may be installed on the terminal devices 501, 502, and 503, such as shopping applications, web browser applications, search applications, instant messaging tools, email clients, and social platform software (examples only).

The terminal devices 501, 502, 503 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers.

The server 505 may be a server that provides various services, such as a background management server that supports the click events generated by users on the terminal devices 501, 502, and 503 (example only). The background management server can analyze and otherwise process received data such as click data and text content, and feed the processing results (for example, targeted push information or product information; examples only) back to the terminal devices.

It should be noted that the question and answer processing method provided by the embodiments of the present application is generally executed by the server 505, and accordingly, the question and answer processing apparatus is generally provided in the server 505.

It should be understood that the numbers of terminal devices, networks, and servers in FIG. 5 are merely illustrative. There may be any number of terminal devices, networks, and servers, depending on implementation needs.

Reference is now made to FIG. 6, which shows a schematic structural diagram of a computer system suitable for implementing the terminal device or server of the embodiments. The terminal device shown in FIG. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.

As shown in FIG. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604. The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), or the like, as well as a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.

In particular, according to the disclosed embodiments of the present invention, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, the disclosed embodiments of the present invention include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 609, and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above functions defined in the system of the present invention are performed.

It should be noted that the computer-readable medium described in the present invention may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present invention, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in combination with an instruction execution system, apparatus, or device. In the present invention, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take a variety of forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wire, optical cable, RF, or any suitable combination of the above.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams or flowcharts, and combinations of blocks in the block diagrams or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The modules described in the embodiments of the present invention may be implemented in software or in hardware. The described modules may also be provided in a processor; for example, a processor may be described as including a sending module, an obtaining module, a determining module, and a first processing module. The names of these modules do not, in some cases, constitute a limitation on the modules themselves. For example, the sending module may also be described as "a module that sends a picture acquisition request to the connected server".

In another aspect, the present invention also provides a computer-readable medium, which may be included in the device described in the above embodiments, or may exist separately without being assembled into the device. The computer-readable medium carries one or more programs, and when the one or more programs are executed by the device, they cause the device to perform: S101, acquiring a voice request; S102, processing the voice request to obtain an intent classification corresponding to the voice request; and S103, performing, on the voice request, an answer search operation corresponding to the intent classification, and feeding back the search answer.

The system of the embodiments of the present invention can handle not only single-intent dialogues but also incomplete-intent and multi-intent dialogues, thereby making the dialogue system more intelligent.

In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that the specific features, structures, materials, or characteristics described in connection with that embodiment or example are included in at least one embodiment or example of the present invention. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification, as well as the features of those embodiments or examples.

In addition, the terms "first" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance, or as implicitly specifying the number of the technical features referred to. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality of" means two or more, unless otherwise expressly and specifically defined.

The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited to them. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be defined by the protection scope of the claims.

Claims

1. A question-answer processing method, characterized by comprising:

acquiring a voice request;

processing the voice request to obtain an intention classification corresponding to the voice request;

and executing, on the voice request, an answer search operation corresponding to the intention classification, and feeding back a search answer.

2. The method of claim 1, wherein the processing the voice request to obtain an intent classification corresponding to the voice request comprises:

performing semantic analysis on the voice request;

performing intention recognition on the voice request by utilizing a model;

determining the voice request with the semantic parsing result being scene-free and the intention recognition result being ambiguous intention as incomplete intention; and determining the voice request with the semantic parsing result of having scenes and the intention recognition result of being clear intention as a complete intention.

3. The method of claim 2, wherein performing an answer search operation on the voice request corresponding to the intent classification comprises:

if the intention classification is a complete intention, searching answers of the voice request under the scene indicated by the semantic parsing result;

if the intention classification is an incomplete intention, predicting a plurality of preselected scenes corresponding to the voice request, and selecting a preselected scene meeting a first preset condition from the preselected scenes as a candidate scene; and sending a query request corresponding to the candidate scene, and searching for an answer of the voice request under the candidate scene indicated by the query request result.

4. The method according to claim 3, wherein the searching for the answer of the voice request in the scene indicated by the semantic parsing result comprises:

if the semantic analysis result has an entity value, searching answers corresponding to the entity value in a scene indicated by the semantic analysis result, and selecting answers of the voice request from the searched answers;

and if the semantic analysis result does not have an entity value, searching answers corresponding to the scene indicated by the semantic analysis result, and selecting answers of the voice request from the searched answers.

5. The method according to claim 4, wherein if an entity value exists in the semantic analysis result, searching for an answer corresponding to the entity value in a scene indicated by the semantic analysis result comprises:

if an entity value exists, searching an answer corresponding to the entity value under the scene indicated by the semantic analysis result;

if a plurality of entity values exist, determining whether the entity values simultaneously satisfy the same semantic slot; if yes, respectively searching answers aiming at each entity value under the scene indicated by the semantic analysis result, and selecting any one answer from answers corresponding to each entity value; aiming at a plurality of entity values, splicing a plurality of selected answers to obtain search answers corresponding to the entity values; if not, searching answers meeting the entity values simultaneously under the scene indicated by the semantic analysis result.

6. The method according to claim 3, wherein the first preset condition is that the preselected scene ranks in the top two by confidence and has a search answer.

7. A question-answering processing apparatus characterized by comprising:

the acquisition module is used for acquiring a voice request;

the processing module is used for processing the voice request to obtain an intention classification corresponding to the voice request;

and the searching module is used for executing answer searching operation corresponding to the intention classification on the voice request and feeding back searching answers.

8. The apparatus of claim 7, wherein the processing module comprises:

the semantic analysis unit is used for carrying out semantic analysis on the voice request;

an intention recognition unit for performing intention recognition on the voice request by using a model;

a determining unit configured to determine a voice request, the semantic parsing result of which is scene-free and the intention recognition result of which is an ambiguous intention, as an incomplete intention; and determining the voice request with the semantic parsing result of having scenes and the intention recognition result of being clear intention as a complete intention.

9. The apparatus of claim 8, wherein the search module comprises:

the first searching unit is used for searching answers of the voice requests under the scene indicated by the semantic parsing result if the intention classification is a complete intention;

and the second searching unit is used for predicting a plurality of preselected scenes corresponding to the voice request if the intention classification is an incomplete intention, selecting the preselected scenes meeting a first preset condition from the preselected scenes as candidate scenes, sending an inquiry request corresponding to the candidate scenes, and searching answers of the voice request under the candidate scenes indicated by the inquiry request result.

10. The apparatus of claim 9, wherein the first search unit comprises:

an entity value unit, configured to search, if an entity value exists in the semantic analysis result, an answer corresponding to the entity value in a scene indicated by the semantic analysis result, and select an answer to the voice request from the search answers;

and the scene unit is used for searching answers corresponding to the scenes indicated by the semantic analysis result if the semantic analysis result does not have an entity value, and selecting the answer of the voice request from the searched answers.