JP2021135412A

JP2021135412A - Information processing device, information processing method, and program

Info

Publication number: JP2021135412A
Application number: JP2020032288A
Authority: JP
Inventors: 真里斎藤; Mari Saito
Original assignee: Sony Group Corp
Current assignee: Sony Group Corp
Priority date: 2020-02-27
Filing date: 2020-02-27
Publication date: 2021-09-13
Also published as: WO2021171820A1

Abstract

To provide a device and method that enable one user utterance to which an information processing device responds to be selected from among a plurality of user utterance candidates. SOLUTION: An information processing device includes a data processing unit that, from among a plurality of user utterance candidates generated as a voice analysis result of a user utterance, selects one user utterance to which the information processing device responds. The data processing unit determines a context type, which is the type of the context that is current situation information, selects a context type having high similarity to or correlation with the determined context type, and selects, as the one user utterance to which the information processing device responds, a user utterance having a slot that is the same as or similar to a slot registered in a database in association with the selected context type. SELECTED DRAWING: Figure 2

Description

The present disclosure relates to an information processing device, an information processing method, and a program. More specifically, it relates to an information processing device, an information processing method, and a program that execute processing and responses according to user utterances.

In recent years, the use of voice dialogue systems that perform voice recognition of user utterances and carry out various processes and responses based on the recognition results has been increasing.
Such a voice dialogue system analyzes a user utterance input via a microphone and performs processing according to the analysis result.

For example, when the user says "Tell me tomorrow's weather", the system acquires weather information from a weather information providing server, generates a system response based on the acquired information, and outputs the generated response from a speaker. Specifically, for example, the system outputs the following system utterance.
System utterance = "Tomorrow's weather will be sunny. However, there may be thunderstorms in the evening."

An information processing device that interacts with a user in this way is called an agent device or a smart speaker.

Such agent devices and smart speakers recognize and understand user utterances input via a microphone and perform processing accordingly.
However, the system may be unable to determine its action uniquely, for example when the reliability of the voice recognition result or the intent estimation result for a user utterance is low, or when the recognized result can be interpreted in more than one way.
That is, it may be difficult to perform a response or processing that accurately reflects the user's utterance intent.

As a conventional technique that discloses a method for solving such a problem, there is, for example, Patent Document 1 (Japanese Unexamined Patent Application Publication No. 2007-108407).
Patent Document 1 discloses a configuration in which, when a user utterance is long, a high-priority utterance portion is selected from the long user utterance and response processing is performed for the selected portion.

However, this document only discloses a configuration for analyzing the user intent presumed to have high priority from a long user utterance, and does not disclose how to handle the case where the reliability of the voice recognition result or the intent estimation result is low.

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2007-108407 (JP-A-2007-108407)

The present disclosure has been made in view of, for example, the above problem, and an object thereof is to provide an information processing device, an information processing method, and a program that make it possible to execute a response or processing for a user utterance with high accuracy even when the reliability of the voice recognition result or the intent estimation result for the user utterance is low.

A further object is to provide an information processing device, an information processing method, and a program that analyze the context (situation information) using, for example, information observed via a camera, and use the analysis result to estimate a user utterance or to predict the next process.

The first aspect of the present disclosure is an information processing device
having a data processing unit that selects, from a plurality of user utterance candidates generated as a voice analysis result of a user utterance, one user utterance to which the information processing device responds,
wherein the data processing unit
analyzes a context that is current situation information, and executes user utterance selection processing that uses the context analysis result to select, from the plurality of user utterance candidates, the one user utterance to which the information processing device responds.

Further, the second aspect of the present disclosure is an information processing device
having a data processing unit that determines a next-process candidate, which is a candidate for the process that the information processing device executes next,
wherein the data processing unit
analyzes a context that is current situation information, and determines the next-process candidate using the context analysis result.

Further, the third aspect of the present disclosure is an information processing method executed in an information processing device,
the information processing device having a data processing unit that selects, from a plurality of user utterance candidates generated as a voice analysis result of a user utterance, one user utterance to which the information processing device responds,
wherein the data processing unit
analyzes a context that is current situation information, and executes user utterance selection processing that uses the context analysis result to select, from the plurality of user utterance candidates, the one user utterance to which the information processing device responds.

Further, the fourth aspect of the present disclosure is an information processing method executed in an information processing device,
the information processing device having a data processing unit that determines a next-process candidate, which is a candidate for the process that the information processing device executes next,
wherein the data processing unit
analyzes a context that is current situation information, and determines the next-process candidate using the context analysis result.

Further, the fifth aspect of the present disclosure is a program that causes an information processing device to execute information processing,
the information processing device having a data processing unit that selects, from a plurality of user utterance candidates generated as a voice analysis result of a user utterance, one user utterance to which the information processing device responds,
wherein the program causes the data processing unit to
analyze a context that is current situation information, and execute user utterance selection processing that uses the context analysis result to select, from the plurality of user utterance candidates, the one user utterance to which the information processing device responds.

Further, the sixth aspect of the present disclosure is a program that causes an information processing device to execute information processing,
the information processing device having a data processing unit that determines a next-process candidate, which is a candidate for the process that the information processing device executes next,
wherein the program causes the data processing unit to
analyze a context that is current situation information, and determine the next-process candidate using the context analysis result.

The program of the present disclosure is, for example, a program that can be provided via a storage medium or a communication medium that supplies it in a computer-readable format to an information processing device or a computer system capable of executing various program codes. By providing such a program in a computer-readable format, processing according to the program is realized on the information processing device or the computer system.

Still other objects, features, and advantages of the present disclosure will become apparent from the more detailed description based on the examples of the present disclosure described later and the accompanying drawings. In this specification, a system is a logical set of a plurality of devices, and the constituent devices are not limited to those housed in the same enclosure.

According to the configuration of one embodiment of the present disclosure, a device and a method are realized that make it possible to select, from a plurality of user utterance candidates, one user utterance to which the information processing device responds.
Specifically, for example, the device has a data processing unit that selects, from a plurality of user utterance candidates generated as a voice analysis result of a user utterance, one user utterance to which the information processing device responds. The data processing unit determines the context type, which is the type of the context that is current situation information, selects a context type having high similarity to or relevance with the determined context type, and selects, as the one user utterance to which the information processing device responds, a user utterance having a slot that is the same as or similar to a slot registered in a database in association with the selected context type.
With this configuration, a device and a method are realized that make it possible to select, from a plurality of user utterance candidates, one user utterance to which the information processing device responds.
The effects described in this specification are merely examples and are not limiting, and there may be additional effects.
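The selection procedure summarized above can be illustrated with a minimal Python sketch. It is not the implementation of the disclosure: the names `Candidate`, `select_utterance`, `CONTEXT_SLOTS`, and `similarity`, as well as the registered slots and the similarity values, are hypothetical and chosen only for illustration. The sketch matches identical slots only and does not attempt the "similar slot" case.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Optional, Set


@dataclass(frozen=True)
class Candidate:
    """One user utterance candidate and the slots found in it."""
    text: str
    slots: FrozenSet[str]


def select_utterance(
    current_type: str,
    candidates: List[Candidate],
    context_slots: Dict[str, Set[str]],
    similarity: Callable[[str, str], float],
) -> Optional[Candidate]:
    """Pick the candidate whose slots best match the slots registered for
    the context type most similar to the current context type."""
    if not candidates or not context_slots:
        return None
    # 1. Select the registered context type most similar to the current one.
    selected_type = max(context_slots, key=lambda t: similarity(current_type, t))
    registered = context_slots[selected_type]
    # 2. Select the candidate sharing the most slots with that context type.
    best = max(candidates, key=lambda c: len(c.slots & registered))
    return best if best.slots & registered else None


# Hypothetical database content (assumption, not taken from the disclosure).
CONTEXT_SLOTS = {
    "weather information provision": {"weather", "tomorrow"},
    "smart lock release detected": {"electricity", "air conditioner"},
}


def similarity(a: str, b: str) -> float:
    # Placeholder measure: identical types are fully similar, others not at all.
    return 1.0 if a == b else 0.0


candidates = [
    Candidate("Weather, please", frozenset({"weather"})),
    Candidate("Electricity, please", frozenset({"electricity"})),
]
chosen = select_utterance(
    "smart lock release detected", candidates, CONTEXT_SLOTS, similarity
)
print(chosen.text if chosen else "no candidate selected")
```

Returning `None` when no candidate shares a slot with the selected context type leaves room for a fallback such as asking the user to repeat the utterance; that fallback is a design choice of this sketch, not something stated in the text above.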

FIG. 1 is a diagram illustrating the configuration of an information processing device that performs responses and processing based on user utterances, and a processing example.
FIG. 2 is a diagram illustrating an example of processing executed by the information processing device of the present disclosure.
FIG. 3 is a diagram illustrating an example of data stored in the context information database.
FIG. 4 is a diagram illustrating an example of data stored in the inter-context distance table.
FIG. 5 is a diagram illustrating an example of processing executed by the information processing device of the present disclosure.
FIG. 6 is a diagram illustrating an example of processing executed by the information processing device of the present disclosure.
FIG. 7 is a diagram illustrating an example of processing executed by the information processing device of the present disclosure.
FIG. 8 is a diagram illustrating an example of processing executed by the information processing device of the present disclosure.
FIG. 9 is a diagram illustrating an example of processing executed by the information processing device of the present disclosure.
FIG. 10 is a diagram illustrating an example of processing executed by the information processing device of the present disclosure.
FIG. 11 is a diagram showing a flowchart illustrating a processing sequence executed by the information processing device.
FIG. 12 is a diagram illustrating an example of processing executed by the information processing device of the present disclosure.
FIG. 13 is a diagram illustrating an example of data stored in the inter-context distance table.
FIG. 14 is a diagram illustrating an example of processing executed by the information processing device of the present disclosure.
FIG. 15 is a diagram illustrating an example of processing executed by the information processing device of the present disclosure.
FIG. 16 is a diagram showing a flowchart illustrating a processing sequence executed by the information processing device.
FIG. 17 is a diagram illustrating a configuration example of the information processing device of the present disclosure.
FIG. 18 is a diagram illustrating a network configuration example including the information processing device of the present disclosure and a server.
FIG. 19 is a diagram illustrating a configuration example of the information processing device of the present disclosure and a server.
FIG. 20 is a diagram illustrating a configuration example of the information processing device of the present disclosure and a server.
FIG. 21 is a diagram illustrating a hardware configuration example of the information processing device of the present disclosure.

Hereinafter, the details of the information processing device, the information processing method, and the program of the present disclosure will be described with reference to the drawings. The description proceeds in the following order.
1. Overview of an information processing device that performs processing according to user utterances
2. (Example 1) Processing executed by the information processing device of the present disclosure when the reliability of the voice recognition result or the intent estimation result for a user utterance is low
3. Specific example of user utterance selection processing based on context analysis
4. Specific example of user utterance selection processing based on context analysis when there are many user utterance candidates
5. Sequence of processing executed by the information processing device of the present disclosure
6. (Example 2) Example of performing processing control using the inter-context distance
7. Processing sequence of Example 2 executed by the information processing device
8. Other examples and modifications
9. Configuration example of the information processing device
10. Hardware configuration example of the information processing device
11. Summary of the configuration of the present disclosure

[1. Overview of an information processing device that performs processing according to user utterances]
First, with reference to FIG. 1 and subsequent figures, an overview of the information processing device of the present disclosure, that is, an information processing device that performs processing according to user utterances, will be described.
As described above, an information processing device that interacts with a user is called, for example, an agent device or a smart speaker.

FIG. 1 is a diagram showing an example of processing by an information processing device 10 that recognizes a user utterance made by a user 1 and responds to it.
The information processing device 10 executes voice recognition processing on a user utterance such as the following.
User utterance = "Tell me the weather in Osaka tomorrow afternoon"

Further, the information processing device 10 executes processing based on the voice recognition result of the user utterance.
In the example shown in FIG. 1, the device acquires data for responding to the user utterance = "Tell me the weather in Osaka tomorrow afternoon", generates a response based on the acquired data, and outputs the generated response via the speaker 14.
In the example shown in FIG. 1, the information processing device 10 makes the following system response.
System response = "Tomorrow afternoon in Osaka the weather will be sunny, but there may be showers in the evening."
The information processing device 10 executes voice synthesis processing (TTS: Text To Speech) to generate and output the above system response.

The information processing device 10 generates and outputs a response using knowledge data acquired from a storage unit in the device or knowledge data acquired via a network.
The information processing device 10 shown in FIG. 1 has a camera 11, a microphone 12, a display unit 13, a speaker 14, and a sensor 15, and is configured to allow audio input/output and image input/output. The sensor 15 consists of various sensors such as a distance sensor, a temperature sensor, and GPS.

The voice recognition processing and semantic analysis processing for user utterances may be performed in the information processing device 10, or may be executed on a server on the cloud side.

In addition to recognizing the utterances of the user 1 and responding to them, the information processing device 10 also executes various processes according to user utterances, for example control of external devices in the home such as a television or an air conditioner.
For example, when the user utterance is a request such as "Change the TV channel to 1" or "Set the air conditioner temperature to 20 degrees", the information processing device 10 outputs a control signal (Wi-Fi, infrared light, etc.) to the external device based on the voice recognition result of the user utterance, and executes control according to the user utterance.

[2. (Example 1) Processing executed by the information processing device of the present disclosure when the reliability of the voice recognition result or the intent estimation result for a user utterance is low]
Next, as Example 1, the processing executed by the information processing device of the present disclosure when the reliability of the voice recognition result or the intent estimation result for a user utterance is low will be described.

As described with reference to FIG. 1, the information processing device 10 of the present disclosure recognizes and understands a user utterance input via the microphone and performs processing accordingly.
However, when the reliability of the voice recognition result or the intent estimation result for the user utterance is low, or when the recognized result can be interpreted in more than one way, the information processing device 10 may be unable to determine uniquely the response or processing to perform for the user utterance.
That is, it may be difficult to perform a response or processing that accurately reflects the user's utterance intent.

The information processing device of the present disclosure solves this problem, and has a configuration that makes it possible to execute, with high accuracy, a response or processing that reflects the intent of the user utterance even when the reliability of the voice recognition result or the intent estimation result for the user utterance is low.

The processing sequence executed by the information processing device 10 of the present disclosure will be described with reference to FIG. 2.
As shown in FIG. 2, assume first that the user 1 makes the following user utterance.
User utterance = "**nki, onegai" ("**nki, please")
In the above user utterance, "**" indicates a portion of the utterance that cannot be heard clearly.

The information processing device 10 that has received this user utterance sequentially executes the processing of steps S01 to S04 shown in FIG. 2. The processing of each step is described below.

(Step S01)
First, in step S01, the information processing device 10 executes voice recognition processing on the input user utterance.

The voice analysis unit of the information processing device 10 inputs the user utterance voice received from the microphone 12, which is the voice input unit, to a voice recognition unit having an automatic speech recognition (ASR: Automatic Speech Recognition) function, and converts the voice data into text data.

The text data generated here is the user utterance voice recognition result 21 shown in FIG. 2, for example the following text data.
Text data = "XXenki onegai"
In the above text data, "XX" indicates a voice section determined to be unconvertible to text data, or an utterance section whose converted text has a reliability equal to or lower than a predetermined threshold.
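How such a recognition result with an "XX" marker might be assembled can be sketched as follows. This is only an illustration under assumptions: the per-segment confidence scores, the threshold value of 0.5, and the function name `render_asr_result` are hypothetical and are not specified in the disclosure.

```python
from typing import List, Optional, Tuple

UNRECOGNIZED = "XX"  # marker used in the text data for unreliable sections

# One ASR segment: (converted text or None if unconvertible, confidence score).
Segment = Tuple[Optional[str], float]


def render_asr_result(segments: List[Segment], threshold: float = 0.5) -> str:
    """Join ASR segments into text data, replacing sections that could not be
    converted, or whose confidence is at or below the threshold, with 'XX'."""
    parts = []
    for text, confidence in segments:
        if text is None or confidence <= threshold:
            parts.append(UNRECOGNIZED)
        else:
            parts.append(text)
    return "".join(parts)


# Hypothetical segments for the utterance "**nki, onegai".
segments = [(None, 0.0), ("enki", 0.92), (" onegai", 0.95)]
text_data = render_asr_result(segments)
print(text_data)
```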

(Step S02)
Next, in step S02, the voice analysis unit of the information processing device 10 further executes utterance semantic analysis processing based on the text data = "XXenki onegai" generated in step S01.

The voice analysis unit of the information processing device 10 analyzes the text data = "XXenki onegai".
The voice analysis unit has an utterance semantic analysis function. For example, it has a natural language understanding function such as NLU (Natural Language Understanding), and estimates from the text data the intent of the user utterance (Intent) and the meaningful significant elements contained in the utterance (Slot).
Specifically, for example, it analyzes the user intent based on the text data using a corpus or the like in which various example utterances are recorded together with syntactic analysis data.
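The relationship between an intent and its slots can be made concrete with a toy Python sketch. Note that it substitutes simple keyword matching for the corpus-based analysis described above, and that the intent names, the slot vocabulary, and the names `NluResult` and `analyze` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class NluResult:
    intent: str            # estimated intent of the utterance
    slots: Dict[str, str]  # significant elements (slot name -> value)


# Hypothetical keyword tables; a real system would use a trained NLU model.
INTENT_KEYWORDS = {
    "check_weather": ("weather",),
    "set_temperature": ("temperature",),
}
SLOT_VOCAB = {
    "place": ("Osaka", "Tokyo"),
    "date": ("today", "tomorrow"),
    "time": ("morning", "afternoon", "evening"),
}


def analyze(text: str) -> NluResult:
    """Estimate the intent and slots of an utterance by keyword matching."""
    lowered = text.lower()
    intent = next(
        (name for name, keywords in INTENT_KEYWORDS.items()
         if any(k in lowered for k in keywords)),
        "unknown",
    )
    slots = {
        slot: value
        for slot, values in SLOT_VOCAB.items()
        for value in values
        if value.lower() in lowered
    }
    return NluResult(intent, slots)


result = analyze("Tell me the weather in Osaka tomorrow afternoon")
print(result.intent, result.slots)
```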

If the intent (entity) and the significant elements (slots) can be accurately estimated and acquired from the user utterance, the information processing device 10 can interpret the user utterance accurately and perform processing such as a response based on the interpretation result.

However, the text data = "XXenki onegai" generated in step S01 contains a voice section determined to be unconvertible to text data, or an utterance section whose converted text has a reliability equal to or lower than the predetermined threshold.
In such a case, the voice analysis unit of the information processing device 10 generates a plurality of user utterance candidates based on the text data = "XXenki onegai".

Specifically, these are the user utterance candidates a and b shown in FIG. 2, namely:
User utterance candidate a = "Tenki, onegai" ("Weather, please")
User utterance candidate b = "Denki, onegai" ("Electricity, please")
The voice analysis unit of the information processing device 10 generates these two user utterance candidates.
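One way to picture the candidate generation is to treat the "XX" marker as a wildcard and match it against a word list. The sketch below does this with a regular expression; the lexicon contents and the function names `candidate_words` and `generate_candidates` are hypothetical, and the disclosure does not state how the candidates are actually derived.

```python
import itertools
import re
from typing import Iterable, List

MASK = "XX"  # marker for an unrecognized or low-confidence section


def candidate_words(token: str, lexicon: Iterable[str]) -> List[str]:
    """Return lexicon words that fit a token, treating 'XX' as one or more
    unknown characters. Tokens without a mask are returned unchanged."""
    if MASK not in token:
        return [token]
    pattern = ".+".join(re.escape(part) for part in token.split(MASK))
    return sorted(word for word in lexicon if re.fullmatch(pattern, word))


def generate_candidates(masked_text: str, lexicon: Iterable[str]) -> List[str]:
    """Expand masked text data into all utterance candidates the lexicon allows."""
    lexicon = list(lexicon)
    options = [candidate_words(token, lexicon) for token in masked_text.split()]
    return [" ".join(words) for words in itertools.product(*options)]


# Hypothetical romanized lexicon: tenki (weather), denki (electricity), ongaku (music).
lexicon = ["tenki", "denki", "ongaku"]
candidates = generate_candidates("XXenki onegai", lexicon)
print(candidates)
```

If no lexicon word fits a masked token, the function returns an empty list, which a caller could use as a signal to ask the user to repeat the utterance.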

(Step S03)
Next, in step S03, the information processing device 10 selects, based on the context (situation information), the one user utterance candidate to respond to from the plurality of user utterance candidates.

This processing is executed by the context analysis unit of the information processing device 10.
The context analysis unit of the information processing device 10 first acquires and analyzes the observation information obtained from the camera 11, the microphone 12, and the sensor 15 of the information processing device 10, as well as the information being displayed on the display unit 13 and information obtained from an external server or the like via the communication unit, that is, the context (situation information) 23 shown in FIG. 2, and determines the type of the current context.

Context types are, for example, the following.
(a) The user 1 is executing a schedule search.
(b) The information processing device 10 is providing weather information to the user 1.
(c) It has been detected that the user has returned home and the smart lock of the house has been released.
These are examples of the various types of context (situation information).

Various context types are registered in advance in the context information database 18, and the context analysis unit of the information processing device 10 determines which of the context types registered in the context information database 18 corresponds to the context (situation information) 23 acquired via the camera 11 and the like.

An example of the data stored in the context information database 18 will be described with reference to FIG. 3.
As shown in FIG. 3, the following data items are recorded in association with each other in the context information database 18.
(a) Context ID
(b) Context type
(c) Context-corresponding main slots (significant elements)

(a) The context ID is an identifier of the context type.
(b) The context type is the type of context. For example, as shown in FIG. 3, various types of context (situation information) such as "schedule search", "map information provision", "schedule change processing", and "smart lock control detection" are registered.

Based on the types of context (situation information) registered in the context information database 18, the context analysis unit of the information processing device 10 determines which context type the real-time observation information obtained from the camera 11 or the like shown in FIG. 2, that is, the context (situation information) 23, corresponds to.

(c) The context-corresponding main slots (significant elements) field records the main slots corresponding to the context type registered in (b), specifically, the main slots in a user utterance that the information processing device 10 refers to when performing response processing for that user utterance.
As described above, a slot is a meaningful, significant element included in a user utterance.

In the example shown in FIG. 3, for example,
(b) when the context type = "schedule search",
(c) the slots (user utterance words) "search, month 〇, day ×, morning, afternoon, ..."
are registered as the context-corresponding main slots (significant elements).

For example, when the information processing device 10 receives a user utterance while the user is performing a "schedule search" and executes response processing for that utterance, the main slots (significant elements) in the user utterance that are referred to in order to determine the processing content are the slots "search, month 〇, day ×, morning, afternoon, ..." registered as the context-corresponding main slots (significant elements).

Further, in the example shown in FIG. 3, for example,
(b) when the context type = "map information provision",
(c) the slots (user utterance words) "nearby, 〇 m, station, directions, ..."
are registered as the context-corresponding main slots (significant elements).

For example, when the information processing device 10 receives a user utterance while it is performing "map information provision" for user 1 and executes response processing for that utterance, the main slots (significant elements) in the user utterance that are referred to in order to determine the processing content are the slots "nearby, 〇 m, station, directions, ..." registered as the context-corresponding main slots (significant elements).

In this way, various context types, each assigned an ID (identifier), are registered in the context information database 18, together with the "(c) context-corresponding main slots (significant elements)" for each context type.
The (c) context-corresponding main slots (significant elements) field records the main slots corresponding to the (b) context type, specifically, the main slots in a user utterance that the information processing device 10 refers to when performing response processing for that user utterance.
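As a rough illustration of the record layout described above, the following Python sketch models the context information database 18 as a dictionary keyed by context ID. The names `ContextEntry`, `CONTEXT_DB`, and `main_slots_for` are hypothetical, the slot words are English glosses of entries quoted from FIG. 3, and only slots named in the text are filled in. It is a sketch under these assumptions, not the disclosed implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextEntry:
    """One row of the context information database (hypothetical layout)."""
    context_id: str        # (a) context ID
    context_type: str      # (b) context type
    main_slots: frozenset  # (c) context-corresponding main slots


# Only slots quoted in the text are listed; FIG. 3 contains further entries.
CONTEXT_DB = {
    "001": ContextEntry("001", "schedule search",
                        frozenset({"search", "morning", "afternoon"})),
    "002": ContextEntry("002", "map information provision",
                        frozenset({"nearby", "station", "directions"})),
}


def main_slots_for(context_id: str) -> frozenset:
    """Return the main slots registered for a context ID."""
    return CONTEXT_DB[context_id].main_slots
```

Keying the table by context ID keeps the lookup of main slots for a determined context type a single dictionary access.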

In step S03, the context analysis unit of the information processing device 10 first determines which context type the context (situation information) 23 corresponds to, that is, the real-time observation information consisting of the information acquired by the camera 11, microphone 12, and sensor 15 of the information processing device 10, the information displayed on the display unit 13, and the information obtained from an external server or the like via the communication unit.
The context information database 18 described with reference to FIG. 3 is used for this context type determination.

Further, as the latter half of the processing of step S03, the context analysis unit of the information processing device 10 acquires the distance values between the context type of the context (situation information) 23 and the various other context types.

As described with reference to FIG. 3, various context types are recorded together with their IDs in the context information database 18.
These context types include both context types that are highly similar or related to each other and context types that have low mutual similarity or relevance.

The distance value between context types with high mutual similarity or relevance is small, and the distance value between context types with low mutual similarity or relevance is large.
The table in which the distance values between these context types are registered is the inter-context distance table 19.

An example of the data recorded in the inter-context distance table 19 will be described with reference to FIG. 4.
As shown in FIG. 4, the inter-context distance table 19 is a table that records the distance value between each pair of context types.

For example, the distance values between context ID = 001 (schedule search) and each context (ID = 001, 002, 003, ...) are as follows.
Distance value between 001 (schedule search) and 001 (schedule search) = 0
Distance value between 001 (schedule search) and 002 (map information provision) = 4
Distance value between 001 (schedule search) and 003 (schedule change processing) = 2
Distance value between 001 (schedule search) and 004 (weather information provision) = 3
...

In this way, the inter-context distance table 19 records the distance value between two context types (IDs).
The smaller the distance value between two context types (IDs), the higher the similarity or relevance of those two context types.
Conversely, the larger the distance value between two context types (IDs), the lower the similarity or relevance of those two context types.
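The lookup described above can be sketched as a small table keyed by a pair of context IDs. The values are those quoted for context ID 001. The names `DISTANCE_TABLE`, `context_distance`, and `closest_first` are hypothetical, and treating the table as symmetric is an assumption that the text does not state explicitly.

```python
# Distance values quoted in the text for context ID 001 (schedule search).
DISTANCE_TABLE = {
    ("001", "001"): 0,
    ("001", "002"): 4,  # map information provision
    ("001", "003"): 2,  # schedule change processing
    ("001", "004"): 3,  # weather information provision
}


def context_distance(id_a: str, id_b: str) -> int:
    """Look up the distance between two context types.

    Assumption: the table is symmetric, so (a, b) and (b, a) share a value.
    """
    if (id_a, id_b) in DISTANCE_TABLE:
        return DISTANCE_TABLE[(id_a, id_b)]
    return DISTANCE_TABLE[(id_b, id_a)]


def closest_first(current_id: str, other_ids: list) -> list:
    """Sort context IDs from most to least related to the current one."""
    return sorted(other_ids, key=lambda cid: context_distance(current_id, cid))
```

With these values, sorting 002, 003, and 004 by distance from 001 places 003 (distance 2) first, then 004 (distance 3), then 002 (distance 4).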

In step S03 shown in FIG. 2, the context analysis unit of the information processing device 10 first acquires various contexts (situation information) 23, such as the information acquired from the camera 11, microphone 12, and sensor 15 of the information processing device, the information displayed on the display unit 13, and information obtained from an external server or the like via the communication unit.

Next, the context type of the acquired context (situation information) 23 is determined based on the context information database 18 described with reference to FIG. 3.
Next, the distance values between the context type of the acquired context (situation information) 23 and the other context types are acquired from the inter-context distance table 19 described with reference to FIG. 4.

Next, the context analysis unit of the information processing device 10 selects the context types whose context-corresponding main slots include a slot that is identical or highly similar to a slot (significant element) included in the user utterance candidates.
This processing is executed with reference to the context information database 18 described with reference to FIG. 3.

Next, from among the context types whose context-corresponding main slots include a slot identical or highly similar to a slot (significant element) included in the user utterance candidates, the context type with the smallest distance value is selected.

That is, from among the context types whose context-corresponding main slots include a slot (significant element) contained in the user utterance candidates, the one context type with the smallest distance value from the context type of the context (situation information) 23 acquired via the camera 11 or the like is selected.

The user utterance candidate that contains a slot (significant element) identical or highly similar to a context-corresponding main slot of this finally selected context type is selected as the one user utterance for which the information processing device 10 executes response processing.
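The selection flow described in the preceding paragraphs can be summarized in a short Python sketch. The function name `select_utterance` and the plain-dictionary inputs are hypothetical. For brevity the sketch matches slots by exact equality only, whereas the text also allows highly similar slots, so it should be read as a simplified illustration rather than the disclosed method.

```python
def select_utterance(current_id, candidates, context_slots, distance):
    """Pick one utterance candidate using the current context type.

    candidates:    {candidate text: set of slots in that candidate}
    context_slots: {context ID: set of main slots registered for it}
    distance:      {(current ID, context ID): distance value}
    """
    all_slots = set().union(*candidates.values())
    # Context types whose main slots share a slot with any candidate.
    matching = [cid for cid, slots in context_slots.items() if slots & all_slots]
    if not matching:
        return None
    # Among those, the context type closest to the current context type.
    nearest = min(matching, key=lambda cid: distance[(current_id, cid)])
    # The candidate that shares a slot with the chosen context type.
    for text, slots in candidates.items():
        if slots & context_slots[nearest]:
            return text
    return None
```

Returning `None` when no context type matches is a choice made for this sketch; the text does not say how that case is handled.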

A specific example of user utterance candidate selection based on an acquired context will be described later with reference to FIG. 5 and subsequent figures.

When the information processing device 10 has completed, in step S03 shown in FIG. 2, the processing of selecting the one user utterance candidate to be the target of response processing from the plurality of user utterance candidates, based on the context (situation information) 23 acquired from the observation information of the camera 11 or the like, the processing proceeds to step S04.

(Step S04)
Finally, in step S04, the information processing device 10 executes a response or processing corresponding to the one user utterance candidate selected by the context analysis unit of the information processing device 10 in step S03.

In this example, one of the following two user utterance candidates is selected based on the context, and the processing corresponding to the selected user utterance candidate is executed.
User utterance candidate a = "weather, please"
User utterance candidate b = "electricity, please"

For example, when user utterance candidate a = "weather, please" is selected, the information processing device 10 acquires weather information from an external server or the like and notifies user 1 of it. Specifically, it outputs the weather information as voice information via the microphone 12 and performs processing such as showing a weather map on the display unit 13.

On the other hand, when user utterance candidate b = "electricity, please" is selected, for example, the information processing device 10 performs processing such as ON/OFF control of the light switch in the room.

[3. Specific examples of user utterance selection processing based on context analysis]
Next, specific examples of the user utterance selection processing based on context analysis executed by the information processing device 10 of the present disclosure will be described.

Hereinafter, with reference to FIGS. 5 to 8, specific examples of user utterance selection processing based on context analysis will be described for the same user utterance as described above with reference to FIG. 2, that is,
User utterance = "**nki, please"
The information processing device 10 receives a user utterance that includes an unclear portion (**) as above, and the following two user utterance candidates are generated as a result of voice analysis by the voice analysis unit. (In Japanese, "weather" is "tenki" and "electricity" is "denki", so both are consistent with the audible portion "nki".)
User utterance candidate a = "weather, please"
User utterance candidate b = "electricity, please"

The following two processing examples will be described.
(Processing example 1) The processing sequence when user utterance candidate a = "weather, please" is selected as a result of user utterance selection based on context analysis

(Processing example 2) The processing sequence when user utterance candidate b = "electricity, please" is selected as a result of user utterance selection based on context analysis

(3-1 (Processing example 1) Processing sequence when user utterance candidate a = "weather, please" is selected based on context analysis)
First, with reference to FIGS. 5 and 6, the processing sequence when user utterance candidate a = "weather, please" is selected based on context analysis will be described.

FIG. 5 is a diagram illustrating a specific example of the processing sequence when user utterance candidate a = "weather, please" is selected as a result of user utterance selection based on context analysis.

The processing of (step S03) and (step S04) shown in FIG. 5 corresponds to the processing of (step S03) and (step S04) described with reference to FIG. 2.
The specific processing of these steps will be described.

(Step S03)
In step S03, the information processing device 10 selects, based on the context (situation information), the one user utterance candidate to be the target of response processing from the plurality of user utterance candidates.

The example shown in FIG. 5 is a processing example in which user utterance candidate a = "weather, please" is selected.

In step S03, the context analysis unit of the information processing device 10 acquires various contexts (situation information) 23, such as the observation information acquired from the camera 11, microphone 12, and sensor 15 of the information processing device, the information displayed on the display unit 13, and information obtained from an external server or the like via the communication unit, and executes the processing of steps S03a to S03d shown in FIG. 5.
These processing steps will be described.

(Step S03a)
First, in step S03a, the context analysis unit of the information processing device 10 determines the context type of the context (situation information) 23 acquired from the observation information of the camera 11 or the like.

This context type determination processing is executed with reference to the context information database 18 described with reference to FIG. 3.

(Step S03b)
Next, in step S03b, the context analysis unit of the information processing device 10 acquires the distance values between the context type of the acquired context (situation information) 23 and the various other context types.

This inter-context distance value acquisition processing is executed with reference to the inter-context distance table 19 described with reference to FIG. 4.

(Step S03c)
Next, in step S03c, the context analysis unit of the information processing device 10 selects, from among the context types that include as a main slot a slot identical or similar to a slot in the user utterance candidates, the context type with the smallest distance value.

(Step S03d)
Next, in step S03d, the context analysis unit of the information processing device 10 selects, as the user utterance to be the target of response processing, the user utterance candidate that includes a slot identical or similar to a main slot of the context type selected in step S03c.
In this example, user utterance candidate a, "weather, please", is selected as the user utterance to be the target of response processing.

Specific examples of the processing of steps S03a to S03d will be described with reference to FIG. 6.
The processing of (steps S01 to S02) shown in FIG. 6 corresponds to the processing of (steps S01 to S02) described above with reference to FIG. 2, and is the voice analysis processing of the user utterance.
It shows that the information processing device 10 received the user utterance and that the following two user utterance candidates were generated as a result of voice analysis by the voice analysis unit.
User utterance candidate a = "weather, please"
User utterance candidate b = "electricity, please"

FIG. 6 shows the details of the processing of steps S03a to S03d described with reference to FIG. 5.
The processing example shown in FIG. 6 is one in which user utterance candidate a, "weather, please", is selected by the context analysis processing as the user utterance to be the target of response processing.
Hereinafter, specific examples of the processing of steps S03a to S03d described with reference to FIG. 5 will be described with reference to FIG. 6.

(Step S03a)
In step S03a, the context analysis unit of the information processing device 10 acquires various contexts (situation information) 23, such as the observation information acquired from the camera 11, microphone 12, and sensor 15 of the information processing device, the information displayed on the display unit 13, and information obtained from an external server or the like via the communication unit, and determines the context type of the acquired context.

That is, it determines which of the (b) context types registered in the context information database 18 described with reference to FIG. 3 the context (situation information) 23 obtained from the camera 11 or the like of the information processing device corresponds to.

As a result of this analysis, in this processing example, as shown at the tip of the arrow in FIG. 6 (S03a), the context type of the acquired context (situation information) 23 is determined to be
Context type = 001 (schedule search).

(Step S03b)
Next, in step S03b, the context analysis unit of the information processing device 10 acquires the distance values between the context type (ID = 001) of the acquired context (situation information) 23 and the various other context types.

FIG. 6 shows the distance values between the context type of the acquired context (situation information) 23, that is, context type = 001 (schedule search), and the plurality of context types (001, 002, 003, 004, 012) registered in the context information database 18 shown in FIG. 3.

Distance value between 001 (schedule search) and 001 (schedule search) = 0
Distance value between 001 (schedule search) and 002 (map information provision) = 4
Distance value between 001 (schedule search) and 003 (schedule change processing) = 2
Distance value between 001 (schedule search) and 004 (weather information provision) = 3
Distance value between 001 (schedule search) and 012 (smart lock control detection) = 12
These distance values are obtained from the inter-context distance table 19 shown in FIG. 4.

The context type selection that follows (the processing of step S03c) uses the registered data of the context information database 18 described above with reference to FIG. 3 and the registered data of the inter-context distance table 19 described with reference to FIG. 4.

First, the context analysis unit of the information processing device 10 acquires the slots (significant elements) included in the user utterance candidates.
The user utterance candidates are the following two.
User utterance candidate a = "weather, please"
User utterance candidate b = "electricity, please"

The slots (significant elements) included in user utterance candidate a = "weather, please" are "weather" and "please".
The slots (significant elements) included in user utterance candidate b = "electricity, please" are "electricity" and "please".
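The slot extraction described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each candidate is already available as comma-separated text; an actual voice analysis unit would obtain slots through language analysis. The names `extract_slots`, `slots_by_candidate`, and `all_slots` are hypothetical.

```python
def extract_slots(candidate: str) -> list:
    """Split a candidate such as "weather, please" into its slots."""
    return [part.strip() for part in candidate.split(",") if part.strip()]


candidates = ["weather, please", "electricity, please"]

# Slots per candidate.
slots_by_candidate = {c: extract_slots(c) for c in candidates}

# Slots across all candidates, in first-seen order without duplicates.
all_slots = list(dict.fromkeys(s for c in candidates for s in extract_slots(c)))
```

Here `all_slots` holds "weather", "please", and "electricity": the slot "please" is shared by both candidates, so only "weather" and "electricity" can distinguish them.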

Next, the context analysis unit of the information processing device 10 refers to the context information database 18 described with reference to FIG. 3 and selects the context types whose (c) context-corresponding main slots (significant elements) include a slot (significant element) identical to one included in the above user utterance candidates.

In this example, the slots (significant elements) acquired from user utterance candidates a and b are "weather", "please", "electricity", and "please".
The context analysis unit selects the context types for which a slot (significant element) identical to one of these is registered under (c) context-corresponding main slots (significant elements) in the context information database 18 shown in FIG. 3.

The main slots of "context type = 004 (weather information provision)" in the context information database 18 shown in FIG. 3 include "weather".
The main slots of "context type = 012 (smart lock control detection)" in the context information database 18 shown in FIG. 3 include "electricity".

Thus, in the example shown in FIG. 3, the slot (significant element) = "weather", one of the slots acquired from user utterance candidates a and b, is registered as a (c) context-corresponding main slot (significant element) of
Context type = 004 (weather information provision).

Likewise, the slot (significant element) = "electricity", another of the slots acquired from user utterance candidates a and b, is registered as a (c) context-corresponding main slot (significant element) of
Context type = 012 (smart lock control detection).

Thus, in this processing example, the context types that include any of the slots (significant elements) contained in the user utterance candidates, namely "weather", "please", "electricity", and "please", are the following two.
Context type = 004 (weather information provision)
Context type = 012 (smart lock control detection)
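The matching of candidate slots against registered main slots can be sketched as a set intersection per context type. The table below lists only slots that the text attributes to each context type (English glosses), and the names `MAIN_SLOTS` and `contexts_containing` are hypothetical. Exact matching is assumed; similarity matching is left out.

```python
# Main slots per context type; only slots named in the text are listed.
MAIN_SLOTS = {
    "002": {"nearby", "station", "directions"},
    "004": {"weather", "precipitation probability", "sunny", "rain"},
    "012": {"electricity"},
}


def contexts_containing(slots: set) -> dict:
    """Map each matching context ID to the candidate slots it registers."""
    return {cid: main & slots
            for cid, main in MAIN_SLOTS.items()
            if main & slots}


matches = contexts_containing({"weather", "please", "electricity"})
```

In this sketch `matches` contains context types 004 (via "weather") and 012 (via "electricity"), while 002 is excluded because none of its listed slots appear in the candidates.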

From these two context types, that is,
Context type = 004 (weather information provision)
Context type = 012 (smart lock control detection)
the context analysis unit of the information processing device 10 selects the one with the smallest distance value from the context type of the acquired context (situation information) 23.

As shown in FIG. 6, the distance values are as follows.
Distance value between 001 (schedule search) and 004 (weather information provision) = 3
Distance value between 001 (schedule search) and 012 (smart lock control detection) = 12

The context analysis unit of the information processing device 10 therefore selects
Context type = 004 (weather information provision)
as the context type with the smallest distance value from the context type of the acquired context (situation information) 23.
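The comparison above amounts to taking the minimum over the matching context types. A minimal sketch using the two distance values quoted from FIG. 6 follows; the names `DISTANCE_FROM_001` and `nearest_context` are hypothetical.

```python
# Distances from the current context type 001 (schedule search), per FIG. 6.
DISTANCE_FROM_001 = {"004": 3, "012": 12}


def nearest_context(candidate_ids, distance_from_current):
    """Return the candidate context ID with the smallest distance value."""
    return min(candidate_ids, key=lambda cid: distance_from_current[cid])


selected = nearest_context(["004", "012"], DISTANCE_FROM_001)
```

Since 3 is smaller than 12, `selected` is "004". Python's `min` returns the first of several equal minima; how ties should be resolved is not stated in the text, so that behavior is an artifact of this sketch.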

This processing is executed with reference to the context information database 18 described with reference to FIG. 3 and the inter-context distance table 19 described with reference to FIG. 4.

As described above, in step S03c, the context analysis unit of the information processing device 10 selects
Context type = 004 (weather information provision)
as the context type that includes, as a main slot, a slot identical or similar to a slot in the user utterance candidates and that has the shortest distance to the context type of the acquired context (situation information) 23.

(Step S03d)
Finally, in step S03d, the context analysis unit of the information processing device 10 selects, as the user utterance to be the target of response processing, the user utterance candidate that includes a slot identical or similar to a main slot of the context type selected in step S03c, that is,
Context type = 004 (weather information provision).

The main slots of context type = 004 (weather information provision) are recorded in the context information database 18 shown in FIG. 3.
As shown in FIG. 3, the main slots of context type = 004 (weather information provision) include "weather", "precipitation probability", "sunny", "rain", "〇 station", and so on, and "weather" matches the slot "weather" of user utterance candidate a = "weather, please".
As a result, in step S03d, the context analysis unit of the information processing device 10 selects
User utterance candidate a = "weather, please"
as the user utterance candidate for which the information processing device 10 should execute the corresponding processing.
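Step S03d can be illustrated by checking each candidate's slots against the main slots of the selected context type. The slot sets are English glosses of those quoted above, the names `WEATHER_MAIN_SLOTS`, `CANDIDATES`, and `candidates_matching` are hypothetical, and only exact matches are considered.

```python
# Main slots quoted for context type 004 (weather information provision).
WEATHER_MAIN_SLOTS = {"weather", "precipitation probability", "sunny", "rain"}

CANDIDATES = {
    "a": ["weather", "please"],
    "b": ["electricity", "please"],
}


def candidates_matching(main_slots, candidates):
    """Return labels of candidates that contain at least one main slot."""
    return [label for label, slots in candidates.items()
            if any(slot in main_slots for slot in slots)]


selected = candidates_matching(WEATHER_MAIN_SLOTS, CANDIDATES)
```

Only candidate a contains "weather", so `selected` is the single-element list with "a". The function returns a list to make visible the case where more than one candidate matches, a case the text does not discuss.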

Thus, in this processing example, when the two user utterance candidates
User utterance candidate a = "weather, please"
User utterance candidate b = "electricity, please"
are generated, the context analysis unit of the information processing device 10 performs the processing of steps S03a to S03d and selects the one candidate
User utterance candidate a = "weather, please"
as the user utterance for which response processing should be executed.

In this way, the context analysis unit of the information processing device 10 executes the processing of steps S03a to S03d described with reference to FIG. 6 and, based on the acquired context (situation information), selects one user utterance candidate from the plurality of user utterance candidates as the response target.
When this processing is completed, the processing proceeds to step S04 shown in FIG. 5.

（ステップＳ０４）
図５に示すステップＳ０４の処理について説明する。
最後に、情報処理装置１０は、ステップＳ０４において、ステップＳ０３で情報処理装置１０のコンテクスト解析部が選択した１つのユーザ発話候補に対応した応答や処理を実行する。 (Step S04)
The process of step S04 shown in FIG. 5 will be described.
Finally, in step S04, the information processing device 10 executes a response or process corresponding to one user utterance candidate selected by the context analysis unit of the information processing device 10 in step S03.

In this example, from the two user utterance candidates
user utterance candidate a = "Weather, please"
user utterance candidate b = "Electricity, please"
the one candidate
user utterance candidate a = "Weather, please"
has been selected in step S03 based on the context.

In this way, when user utterance candidate a = "Weather, please" is selected, the information processing device 10 acquires weather information from, for example, an external server in step S04 and notifies the user 1. Specifically, it outputs the weather information as audio via the microphone 12 and performs processing such as showing a weather map on the display unit 13.

(3-2 (Processing example 2) Processing sequence when user utterance candidate b = "Electricity, please" is selected based on context analysis)
Next, with reference to FIGS. 7 and 8, the processing sequence in which user utterance candidate b = "Electricity, please" is selected based on context analysis will be described.

FIG. 7 illustrates a specific example of the processing sequence in which
user utterance candidate b = "Electricity, please"
is selected as a result of user utterance selection based on context analysis.

The processes of (step S03) and (step S04) shown in FIG. 7 correspond to the processes of (step S03) and (step S04) described with reference to FIG. 2.
The specific processing of these steps will be described.

The example shown in FIG. 7 is a processing example in which user utterance candidate b = "Electricity, please" is selected.

In step S03, the context analysis unit of the information processing device 10 acquires various contexts (situation information) 23, such as observation information acquired from the camera 11, the microphone 12, and the sensor 15 of the information processing device, information being displayed on the display unit 13, and information obtained from an external server or the like via the communication unit, and executes the processes of steps S03a to S03d shown in FIG. 7.
These processing steps will be described.

(Step S03d)
Next, in step S03d, the context analysis unit of the information processing device 10 selects, as the response processing target user utterance, a user utterance candidate that includes a slot identical or similar to a main slot of the context type selected in step S03c.
In this example, user utterance candidate b, "Electricity, please", is selected as the response processing target user utterance.

A specific example of the processes of steps S03a to S03d will be described with reference to FIG. 8.
The processes of (steps S01 to S02) shown in FIG. 8 correspond to the processes of (steps S01 to S02) described earlier with reference to FIG. 2 and constitute the voice analysis processing of the user utterance.
FIG. 8 shows that the information processing device 10 has received the user utterance and that, as a result of voice analysis by the voice analysis unit, the following two user utterance candidates have been generated:
user utterance candidate a = "Weather, please"
user utterance candidate b = "Electricity, please"

FIG. 8 shows the details of the processes of steps S03a to S03d described with reference to FIG. 7.

Note that the processing example shown in FIG. 8 is one in which user utterance candidate b, "Electricity, please", is selected as the response processing target user utterance by the context analysis processing.
Hereinafter, a specific example of the processes of steps S03a to S03d described with reference to FIG. 7 will be described with reference to FIG. 8.

As a result of this analysis processing, in this processing example, as shown at the tip of the arrow in FIG. 8 (S03a), the context type of the acquired context (situation information) 23 is determined to be
context type = 012 (smart lock control detection).

(Step S03b)
Next, in step S03b, the context analysis unit of the information processing device 10 acquires the distance values between the context type (ID = 012) of the acquired context (situation information) 23 and various other context types.

FIG. 8 shows the distance values between the context type of the acquired context (situation information) 23, that is, context type = 012 (smart lock control detection), and the plurality of context types (001, 002, 003, 004, 012) registered in the context information database 18 shown in FIG. 3.

Distance value between 012 (smart lock control detection) and 001 (schedule search) = 12
Distance value between 012 (smart lock control detection) and 002 (map information provision) = 13
Distance value between 012 (smart lock control detection) and 003 (schedule change processing) = 11
Distance value between 012 (smart lock control detection) and 004 (weather information provision) = 10
Distance value between 012 (smart lock control detection) and 012 (smart lock control detection) = 0
These distance values are obtained from the inter-context distance table 19 shown in FIG. 4.
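
The lookup described above can be sketched as follows. This is a minimal illustration, assuming the inter-context distance table 19 can be modeled as a symmetric mapping keyed by pairs of context type IDs. Only the distance values come from the text; the names `DISTANCES` and `context_distance` and the symmetry assumption are hypothetical.

```python
# Hypothetical model of the inter-context distance table 19.
# Only the distances from context type 012 quoted in the text are filled in.
DISTANCES = {
    frozenset({"012", "001"}): 12,  # smart lock control detection / schedule search
    frozenset({"012", "002"}): 13,  # smart lock control detection / map information provision
    frozenset({"012", "003"}): 11,  # smart lock control detection / schedule change processing
    frozenset({"012", "004"}): 10,  # smart lock control detection / weather information provision
}


def context_distance(type_a: str, type_b: str) -> int:
    """Return the distance between two context types (0 for identical types)."""
    if type_a == type_b:
        return 0
    # Assumption: the table is symmetric, so the pair is stored unordered.
    return DISTANCES[frozenset({type_a, type_b})]


print(context_distance("012", "004"))  # 10
print(context_distance("012", "012"))  # 0
```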

In this example, the slots (significant elements) acquired from user utterance candidates a and b are "weather", "please", "electricity", and "please".
The context analysis unit of the information processing device 10 selects the context types for which a slot (significant element) identical or similar to at least one of the slots (significant elements) included in the above user utterance candidates is registered under
(c) context-corresponding main slots (significant elements)
in the context information database 18 shown in FIG. 3.

Thus, in this processing example, the context types that register at least one of the slots (significant elements) included in the user utterance candidates, namely "weather", "please", "electricity", and "please", as a main slot are the following two context types:
context type = 004 (weather information provision)
context type = 012 (smart lock control detection)

As shown in FIG. 8,
distance value between 012 (smart lock control detection) and 004 (weather information provision) = 10
distance value between 012 (smart lock control detection) and 012 (smart lock control detection) = 0.

The context analysis unit of the information processing device 10 selects
context type = 012 (smart lock control detection)
as the context type with the smallest distance value from the context type of the acquired context (situation information) 23.

As described above, in step S03c, the context analysis unit of the information processing device 10 selects, from among the context types whose main slots include a slot identical or similar to a slot in the user utterance candidates,
context type = 012 (smart lock control detection)
as the context type with the shortest distance to the context type of the acquired context (situation information) 23.
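
Step S03c as described above can be sketched in two stages: filter the context types by slot overlap, then take the one nearest to the acquired context type. This is a minimal sketch, not the disclosed implementation. The slot lists and distance values are taken from the text, while the dictionary layout and the function name `select_context_type` are assumptions. It models only identical slots; "similar" slot matching is not modeled.

```python
# Main slots per context type, as quoted from FIG. 3 in the text.
# The dictionary layout is an assumption made for illustration.
MAIN_SLOTS = {
    "004": {"weather", "precipitation probability", "sunny", "rain"},
    "012": {"electricity", "turn on", "turn off", "dark", "bright"},
}

# Distances from the acquired context type 012, as listed for FIG. 8.
DISTANCE_FROM_012 = {"001": 12, "002": 13, "003": 11, "004": 10, "012": 0}


def select_context_type(candidate_slots, main_slots, distance_from_acquired):
    """Keep context types that share a slot with the candidates, then pick the nearest."""
    matching = [ctx for ctx, slots in main_slots.items() if slots & candidate_slots]
    if not matching:
        return None
    return min(matching, key=lambda ctx: distance_from_acquired[ctx])


selected = select_context_type(
    {"weather", "electricity", "please"}, MAIN_SLOTS, DISTANCE_FROM_012
)
print(selected)  # 012
```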

(Step S03d)
Finally, in step S03d, the context analysis unit of the information processing device 10 selects, as the response processing target user utterance, a user utterance candidate that includes a slot identical or similar to a main slot of the context type selected in step S03c, that is,
context type = 012 (smart lock control detection).

The main slots of context type = 012 (smart lock control detection) are recorded in the context information database 18 shown in FIG. 3.
As shown in FIG. 3, the main slots of context type = 012 (smart lock control detection) include "electricity", "turn on", "turn off", "dark", "bright", and so on, and the slot "electricity" matches the slot "electricity" of
user utterance candidate b = "Electricity, please".
As a result, in step S03d, the context analysis unit of the information processing device 10 selects
user utterance candidate b = "Electricity, please"
as the user utterance candidate for which the information processing device 10 should execute the corresponding processing.
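
The matching in step S03d can be illustrated with a short sketch. It assumes each user utterance candidate is represented as a pair of its text and a set of slots; this representation and the name `select_utterance` are hypothetical, and only identical slots are matched.

```python
def select_utterance(candidates, main_slots_of_selected_type):
    """Return the first candidate that shares a slot with the selected context type."""
    for text, slots in candidates:
        if slots & main_slots_of_selected_type:
            return text
    return None


# Candidates and slots as described in the text (English renderings).
candidates = [
    ("Weather, please", {"weather", "please"}),
    ("Electricity, please", {"electricity", "please"}),
]
# Main slots of context type 012 (smart lock control detection), per FIG. 3.
smart_lock_main_slots = {"electricity", "turn on", "turn off", "dark", "bright"}

chosen = select_utterance(candidates, smart_lock_main_slots)
print(chosen)  # Electricity, please
```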

Thus, in this processing example, when the two user utterance candidates
user utterance candidate a = "Weather, please"
user utterance candidate b = "Electricity, please"
are generated, the context analysis unit of the information processing device 10 performs the processes of steps S03a to S03d and selects
user utterance candidate b = "Electricity, please"
as the user utterance for which response processing should be executed.

As described above, the context analysis unit of the information processing device 10 executes the processes of steps S03a to S03d described with reference to FIG. 8 and, based on the acquired context (situation information), selects one user utterance candidate from the plurality of user utterance candidates as the response target.
When this process is completed, the process proceeds to step S04 shown in FIG. 7.

(Step S04)
The process of step S04 shown in FIG. 7 will be described.
Finally, in step S04, the information processing device 10 executes a response or process corresponding to the one user utterance candidate selected by the context analysis unit of the information processing device 10 in step S03.

In this example, from the two user utterance candidates
user utterance candidate a = "Weather, please"
user utterance candidate b = "Electricity, please"
the one candidate
user utterance candidate b = "Electricity, please"
has been selected in step S03 based on the context.

In this way, when user utterance candidate b = "Electricity, please" is selected, the information processing device 10 performs processing in step S04 such as ON/OFF control of the room's light switch.

[4. Specific example of user utterance selection processing based on context analysis when there are many user utterance candidates]
Next, a specific example of user utterance selection processing based on context analysis when there are many user utterance candidates will be described.

The processing examples described with reference to FIG. 2 and FIGS. 5 to 8 are examples with two user utterance candidates.
The information processing device 10 of the present disclosure can select, based on context analysis, the one user utterance to which the information processing device 10 should respond not only when there are two user utterance candidates but also when there are three or more.

Hereinafter, an example of user utterance selection processing based on context analysis with three user utterance candidates will be described with reference to FIGS. 9 and 10.

FIGS. 9 and 10 illustrate a specific example of the processing sequence in which the information processing device 10 receives a user utterance, the following three user utterance candidates are generated as a result of voice analysis by the voice analysis unit,
user utterance candidate a = "Weather, please"
user utterance candidate b = "Electricity, please"
user utterance candidate c = "Postpone, please"
and
user utterance candidate c = "Postpone, please"
is selected as a result of user utterance selection based on context analysis.

The processes of (step S03) and (step S04) shown in FIG. 9 correspond to the processes of (step S03) and (step S04) described with reference to FIG. 2.
The specific processing of these steps will be described.

(Step S03)
In step S03, the information processing device 10 selects, based on the context (situation information), the one user utterance candidate to which the information processing device 10 will respond from among the plurality of user utterance candidates.

The example shown in FIG. 9 is a processing example in which user utterance candidate c = "Postpone, please" is selected.

(Step S03d)
Next, in step S03d, the context analysis unit of the information processing device 10 selects, as the response target user utterance to which the information processing device 10 will respond, a user utterance candidate that includes a slot identical or similar to a main slot of the context type selected in step S03c.
In this example, user utterance candidate c = "Postpone, please" is selected as the response target user utterance to which the information processing device 10 will respond.

A specific example of the processes of steps S03a to S03d will be described with reference to FIG. 10.
The processes of (steps S01 to S02) shown in FIG. 10 correspond to the processes of (steps S01 to S02) described earlier with reference to FIG. 2 and constitute the voice analysis processing of the user utterance.
FIG. 10 shows that the information processing device 10 has received the user utterance and that, as a result of voice analysis by the voice analysis unit, the following three user utterance candidates have been generated:
user utterance candidate a = "Weather, please"
user utterance candidate b = "Electricity, please"
user utterance candidate c = "Postpone, please"

FIG. 10 shows the details of the processes of steps S03a to S03d described with reference to FIG. 9.

Note that the processing example shown in FIG. 10 is one in which user utterance candidate c, "Postpone, please", is selected by the context analysis processing as the response target user utterance to which the information processing device 10 will respond.
Hereinafter, a specific example of the processes of steps S03a to S03d described with reference to FIG. 9 will be described with reference to FIG. 10.

As a result of this analysis processing, in this processing example, as shown at the tip of the arrow in FIG. 10 (S03a), the context type of the acquired context (situation information) 23 is determined to be
context type = 001 (schedule search).

FIG. 10 shows the distance values between the context type of the acquired context (situation information) 23, that is, context type = 001 (schedule search), and the plurality of context types (001, 002, 003, 004, 012) registered in the context information database 18 shown in FIG. 3.

First, the context analysis unit of the information processing device 10 acquires the slots (significant elements) included in the user utterance candidates.
The user utterance candidates are the following three:
user utterance candidate a = "Weather, please"
user utterance candidate b = "Electricity, please"
user utterance candidate c = "Postpone, please"

The slots (significant elements) included in user utterance candidate a = "Weather, please" are "weather" and "please".
The slots (significant elements) included in user utterance candidate b = "Electricity, please" are "electricity" and "please".
The slots (significant elements) included in user utterance candidate c = "Postpone, please" are "postpone" and "please".

In this example, the slots (significant elements) acquired from user utterance candidates a, b, and c are "weather", "please", "electricity", "please", "postpone", and "please".
The context analysis unit selects the context types for which a slot (significant element) identical to one of the slots (significant elements) included in the above user utterance candidates is registered under
(c) context-corresponding main slots (significant elements)
in the context information database 18 shown in FIG. 3.

The main slots of "context type = 004 (weather information provision)" in the context information database 18 shown in FIG. 3 include "weather".
The main slots of "context type = 012 (smart lock control detection)" in the context information database 18 shown in FIG. 3 include "electricity".
The main slots of "context type = 003 (schedule change processing)" in the context information database 18 shown in FIG. 3 include "postpone".

Thus, in the example shown in FIG. 3, the slot (significant element) "weather", acquired from user utterance candidates a, b, and c, is registered as a (c) context-corresponding main slot (significant element) of
context type = 004 (weather information provision).

The slot (significant element) "electricity", acquired from user utterance candidates a, b, and c, is registered as a (c) context-corresponding main slot (significant element) of
context type = 012 (smart lock control detection).

The slot (significant element) "postpone", acquired from user utterance candidates a, b, and c, is registered as a (c) context-corresponding main slot (significant element) of
context type = 003 (schedule change processing).

Thus, in this processing example, the context types that include any of the slots (significant elements) contained in the user utterance candidates, namely "weather", "please", "electricity", "please", "postpone", and "please", are the following three context types:
context type = 003 (schedule change processing)
context type = 004 (weather information provision)
context type = 012 (smart lock control detection)

From these three context types, that is,
context type = 003 (schedule change processing)
context type = 004 (weather information provision)
context type = 012 (smart lock control detection)
the context analysis unit of the information processing device 10 selects the context type with the smallest distance value from the context type of the acquired context (situation information) 23.

As shown in FIG. 10,
distance value between 001 (schedule search) and 003 (schedule change processing) = 2
distance value between 001 (schedule search) and 004 (weather information provision) = 3
distance value between 001 (schedule search) and 012 (smart lock control detection) = 12.

The context analysis unit of the information processing device 10 selects
context type = 003 (schedule change processing)
as the context type with the smallest distance value from the context type of the acquired context (situation information) 23.

As described above, in step S03c, the context analysis unit of the information processing device 10 selects, from among the context types whose main slots include a slot identical or similar to a slot in the user utterance candidates,
context type = 003 (schedule change processing)
as the context type with the shortest distance to the context type of the acquired context (situation information) 23.

(Step S03d)
Finally, in step S03d, the context analysis unit of the information processing device 10 selects, as the response target user utterance to which the information processing device 10 will respond, a user utterance candidate that includes a slot identical or similar to a main slot of the context type selected in step S03c, that is,
context type = 003 (schedule change processing).

The main slots of context type = 003 (schedule change processing) are recorded in the context information database 18 shown in FIG. 3.
As shown in FIG. 3, the main slots of context type = 003 (schedule change processing) include "month ○", "day ×", "a.m.", ... "postpone", and so on, and the slot "postpone" matches the slot "postpone" of
user utterance candidate c = "Postpone, please".
As a result, in step S03d, the context analysis unit of the information processing device 10 selects
user utterance candidate c = "Postpone, please"
as the user utterance candidate for which the information processing device 10 should execute the corresponding processing.

Thus, in this processing example, when the three user utterance candidates
user utterance candidate a = "Weather, please"
user utterance candidate b = "Electricity, please"
user utterance candidate c = "Postpone, please"
are generated, the context analysis unit of the information processing device 10 performs the processes of steps S03a to S03d and selects
user utterance candidate c = "Postpone, please"
as the response target user utterance to which the information processing device 10 will respond.
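
The three-candidate case follows the same procedure as the two-candidate case, which a single function over an arbitrary number of candidates can illustrate. This is a sketch under stated assumptions: the distances from context type 001 and the matching slots come from the text, while the data layout and the name `choose_utterance` are hypothetical. Only one main slot per context type is listed, and only identical slots are matched.

```python
from typing import Dict, Optional, Set

# One main slot per context type, as mentioned in the text (other slots omitted).
MAIN_SLOTS: Dict[str, Set[str]] = {
    "003": {"postpone"},      # schedule change processing
    "004": {"weather"},       # weather information provision
    "012": {"electricity"},   # smart lock control detection
}

# Distances from the acquired context type 001 (schedule search), per FIG. 10.
DISTANCE_FROM_001 = {"003": 2, "004": 3, "012": 12}


def choose_utterance(
    candidates: Dict[str, Set[str]],
    main_slots: Dict[str, Set[str]],
    distance: Dict[str, int],
) -> Optional[str]:
    """Pick the candidate whose slot matches the nearest slot-sharing context type."""
    all_slots = set().union(*candidates.values())
    matching = [ctx for ctx, slots in main_slots.items() if slots & all_slots]
    if not matching:
        return None
    nearest = min(matching, key=lambda ctx: distance[ctx])
    for text, slots in candidates.items():
        if slots & main_slots[nearest]:
            return text
    return None


result = choose_utterance(
    {
        "Weather, please": {"weather", "please"},
        "Electricity, please": {"electricity", "please"},
        "Postpone, please": {"postpone", "please"},
    },
    MAIN_SLOTS,
    DISTANCE_FROM_001,
)
print(result)  # Postpone, please
```

Nothing in the function depends on the number of candidates, which is consistent with the statement that the selection applies equally to three or more user utterance candidates.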

As described above, the context analysis unit of the information processing device 10 executes the processes of steps S03a to S03d described with reference to FIG. 10 and, based on the acquired context (situation information), selects one user utterance candidate from the plurality of user utterance candidates as the response target to which the information processing device 10 will respond.
When this process is completed, the process proceeds to step S04 shown in FIG. 9.

(Step S04)
The process of step S04 shown in FIG. 9 will be described.
Finally, in step S04, the information processing device 10 executes a response or process corresponding to the one user utterance candidate selected by the context analysis unit of the information processing device 10 in step S03.

In this example, from the three user utterance candidates
user utterance candidate a = "Weather, please"
user utterance candidate b = "Electricity, please"
user utterance candidate c = "Postpone, please"
the one candidate
user utterance candidate c = "Postpone, please"
has been selected in step S03 based on the context.

In this way, when user utterance candidate c = "Postpone, please" is selected, the information processing device 10 executes processing to postpone the user's schedule in step S04.
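
The step S04 behavior described across the three processing examples (weather notification, light switch control, schedule postponement) could be organized as a dispatch on the selected utterance's slots. The sketch below is purely illustrative: the handler functions are placeholders that return a description of the action instead of performing it, and all names are hypothetical.

```python
def provide_weather() -> str:
    # Placeholder: a real device would fetch weather data from an external server.
    return "notify the user of weather information"


def control_lights() -> str:
    # Placeholder for ON/OFF control of the room's light switch.
    return "control the room light switch"


def postpone_schedule() -> str:
    # Placeholder for postponing the user's schedule.
    return "postpone the user's schedule"


# Hypothetical mapping from a slot of the selected utterance to its handler.
HANDLERS = {
    "weather": provide_weather,
    "electricity": control_lights,
    "postpone": postpone_schedule,
}


def execute_response(slots):
    """Run the handler for the first slot that has one; return None otherwise."""
    for slot in slots:
        handler = HANDLERS.get(slot)
        if handler is not None:
            return handler()
    return None


print(execute_response(["postpone", "please"]))  # postpone the user's schedule
```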

[5. Sequence of processing executed by the information processing device of the present disclosure]
Next, the sequence of processing executed by the information processing device 10 of the present disclosure will be described.

The processing sequence executed by the information processing device 10 of the present disclosure will be described with reference to the flowchart shown in FIG. 11.
The processing according to the flowchart shown in FIG. 11 is executed in accordance with a program stored in the storage unit of the information processing device 10. For example, it can be executed as program execution processing by a processor such as a CPU having a program execution function.
The processing of each step of the flow shown in FIG. 11 will be described.

(Step S101)
First, in step S101, the information processing device 10 receives a user utterance as input.

The information processing device 10 receives the user utterance via the microphone 12, which is a voice input unit.

(Step S102)
Next, in step S102, the information processing device 10 executes voice analysis processing on the input user utterance.

The voice analysis unit of the information processing device 10 feeds the user utterance audio input from the microphone 12, which is a voice input unit, into a voice recognition unit having an automatic speech recognition (ASR: Automatic Speech Recognition) function and converts the audio data into text data.
The voice analysis unit then performs utterance semantic analysis on the generated text data. For example, using a natural language understanding function such as NLU (Natural Language Understanding), it estimates from the text data the intent of the user utterance (Intent) and the meaningful significant elements (Slot) included in the utterance.

(Step S103)
Next, in step S103, the information processing device 10 determines whether only one user utterance candidate was obtained as the voice analysis result in step S102.

If the user utterance is clear, the voice analysis usually yields only one user utterance candidate. However, as described above, if the user utterance is unclear, the voice analysis may yield two or more user utterance candidates.

If only one user utterance candidate was obtained as the voice analysis result in step S102, the process proceeds to step S108.
On the other hand, if two or more user utterance candidates were obtained as the voice analysis result in step S102, the process proceeds to step S104.

(Step S104)
If it is determined in step S103 that two or more user utterance candidates were obtained as the voice analysis result in step S102, the processing of steps S104 to S107 is executed.

First, in step S104, the information processing device 10 acquires the context (situation information).
It acquires the observation information obtained from the camera 11, the microphone 12, and the sensor 15 of the information processing device 10, the information being displayed on the display unit 13, and the information obtained from an external server or the like via the communication unit, that is, the context (situation information) 23 shown in FIG. 2, and determines the type of the acquired context.

(Step S105)
Next, in step S105, the information processing device 10 obtains the distance values between the context type of the context (situation information) 23 acquired in step S104 and the various other context types.

(Step S106)
Next, in step S106, the context analysis unit of the information processing device 10 selects, from among the context types that include as a main slot a slot identical or similar to a slot in a user utterance candidate, the context type with the smallest distance value.

(Step S107)
Next, in step S107, the context analysis unit of the information processing device 10 selects the user utterance candidate that includes a slot identical or similar to the main slot of the context type selected in step S106 as the response-target user utterance to which the information processing device 10 should respond.

The processing of steps S104 to S107 corresponds to the processing of steps S03a to S03d described above with reference to FIGS. 5 to 10.
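The selection in steps S105 to S107 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: the context type names, the distance values, the registered main slots, and the names `Candidate` and `select_utterance` are all hypothetical, and "identical or similar slot" is simplified to an exact match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str
    slots: frozenset  # slot values estimated by the voice analysis (S102)


# Hypothetical excerpt of an inter-context distance table:
# DISTANCE[current_type][other_type] -> distance value (smaller = closer).
DISTANCE = {
    "talking_about_schedule": {
        "checking_weather": 5,
        "operating_lights": 7,
        "editing_schedule": 1,
    },
}

# Hypothetical main slots registered per context type.
MAIN_SLOTS = {
    "checking_weather": {"weather"},
    "operating_lights": {"electricity"},
    "editing_schedule": {"postponement"},
}


def select_utterance(current_type, candidates):
    """Return the candidate chosen via steps S105 to S107, or None."""
    candidate_slots = set().union(*(c.slots for c in candidates))
    # S105: distance values from the current context type to the others.
    distances = DISTANCE[current_type]
    # S106: among context types whose main slot matches a candidate slot,
    # pick the one with the smallest distance value.
    matching = [t for t in distances if MAIN_SLOTS.get(t, set()) & candidate_slots]
    if not matching:
        return None
    best_type = min(matching, key=distances.get)
    # S107: pick the candidate that contains a main slot of that type.
    for candidate in candidates:
        if candidate.slots & MAIN_SLOTS[best_type]:
            return candidate
    return None


candidates = [
    Candidate("Weather, please", frozenset({"weather"})),
    Candidate("Electricity, please", frozenset({"electricity"})),
    Candidate("Postponement, please", frozenset({"postponement"})),
]
chosen = select_utterance("talking_about_schedule", candidates)
print(chosen.text)
```

With these assumed values, the schedule-related context type is closest to the current one, so the candidate carrying the "postponement" slot is chosen. A real system would replace the exact-match test with whatever slot similarity measure it uses.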

(Step S108)
The processing of step S108 is executed either when it is determined in step S103 that only one user utterance candidate was obtained as the voice analysis result in step S102, or when it is determined that two or more user utterance candidates were obtained and one user utterance candidate has been selected from them based on the context analysis result in steps S104 to S107.

In step S108, the information processing device 10 executes the response or processing corresponding to the one user utterance candidate.

As described above, even when the user utterance is unclear and a plurality of user utterance candidates are obtained as the voice analysis result, the information processing device 10 of the present disclosure can select the single most probable user utterance candidate for the observed situation based on context analysis, and can perform the corresponding processing, for example response processing, for the selected candidate.

[6. (Example 2) Example of performing processing control using the inter-context distance]
Next, as Example 2, an example in which processing control is performed using the inter-context distance will be described.

As described above with reference to FIG. 4, the inter-context distance table 19 is a table that records distance values between two contexts.
In the example described above, the user utterance estimation processing for an unclear user utterance was performed by referring to the distance values between context types recorded in the inter-context distance table 19.
In the following, as Example 2, an example is described in which the information processing device 10 controls the processing it executes by referring to the distance values between context types recorded in the inter-context distance table 19.

For example, the processing that the user 1 or the information processing device 10 is currently executing is determined by context analysis, and the processing that the information processing device 10 should execute next is determined according to the distance values between context types recorded in the inter-context distance table 19.

Other context types having a small distance value from the context type corresponding to the processing currently being executed are selected, and the processing that the information processing device 10 executes next is limited to these context types in order to improve processing efficiency.

A specific processing example of Example 2 will be described with reference to FIG. 12.
Each process of (step S51) to (step S53) shown in FIG. 12 will be described.

(Step S51)
First, in step S51, the information processing device receives a user utterance and executes voice analysis processing.
The example shows that the information processing device 10 has received the user utterance and that, as a result of voice analysis by the voice analysis unit, the following single user utterance candidate has been generated:
User utterance candidate = "Tell me a recommended restaurant in Shibuya"

In this case, the data processing unit of the information processing device 10 starts restaurant search processing for the Shibuya area as processing based on the user utterance, that is, as response processing.

(Step S52)
Next, in step S52, the context analysis unit of the information processing device 10 acquires various contexts (situation information), such as observation information obtained from the camera 11, the microphone 12, and the sensor 15 of the information processing device, information being displayed on the display unit 13, and information obtained from an external server or the like via the communication unit.

That is, it acquires a context (situation information) similar to the context (situation information) 23 described with reference to FIG. 2, and determines the context type of the acquired context.
It determines which of the (b) context types registered in the context information database 18 described with reference to FIG. 3 corresponds to the context (situation information) obtained from the camera 11 and other sources of the information processing device.

As a result of this analysis, in this processing example, as shown at the tip of the arrow in FIG. 12 (S52), the context type of the acquired context (situation information) 23 is determined to be
Context type = 001 (data search).

(Step S53)
Next, in step S53, the context analysis unit of the information processing device 10 obtains the distance values between the context type (ID = 001) of the acquired context (situation information) and the various other context types, and determines the context types whose distance value is at or below a predetermined threshold as next processing candidates.

The inter-context distance values are obtained by referring to the inter-context distance table 19.
A specific example of the inter-context distance table 19 used in Example 2 is shown in FIG. 13.

Like the inter-context distance table 19 described above with reference to FIG. 4, the inter-context distance table 19 shown in FIG. 13 stores distance values between context types.
However, the context types registered in the inter-context distance table 19 shown in FIG. 13 are finer-grained than those registered in the inter-context distance table 19 described above with reference to FIG. 4.

The context types registered in the inter-context distance table 19 shown in FIG. 13 are all types of processing related to search processing.
As shown in FIG. 13, these are:
ID = 001 (data search)
ID = 002 (specify detailed search conditions)
ID = 003 (change search genre)
ID = 004 (search result output)
ID = 005 (narrow down search)
・・・
These are all processes related to data search.

The information processing device 10 performs data processing according to these processes (context types). For example, it switches the display data on the display unit 13 or communicates with a data search server.
Because the processing executed by the information processing device 10 differs depending on the context type registered in the inter-context distance table 19, when switching involves a transition between different processes, predicting the next process before switching allows a faster transition.

The information processing device 10 uses the inter-context distance table 19 shown in FIG. 13 to predict such next processing candidates.
For example, in the example shown in FIG. 12, the current context type is
Context type = 001 (data search).

The information processing device 10 obtains the distance values between the context type of the acquired context (situation information) 23, that is, context type = 001 (data search), and the other context types, selects only the context types whose distance value is at or below a predetermined threshold, and determines the selected context types to be the next processing candidates that are likely to be executed next.

Based on this determination, the information processing device 10 can start preparing in advance to execute the next processing candidates, and can respond promptly when an actual processing transition occurs.

In the example shown in FIG. 12, the distance values between the context type of the acquired context (situation information) 23, that is, context type = 001 (data search), and the other context types are as follows.
Distance value between 001 (data search) and 002 (specify detailed search conditions) = 1
Distance value between 001 (data search) and 003 (change search genre) = 2
Distance value between 001 (data search) and 004 (search result output) = 3
Distance value between 001 (data search) and 005 (narrow down search) = 4
Distance value between 001 (data search) and 006 (share search results on SNS) = 5
Distance value between 001 (data search) and 007 (sort search results) = 6
Distance value between 001 (data search) and 008 (enter search results in schedule) = 7
These distance values are obtained from the inter-context distance table 19 shown in FIG. 13.

In step S53, the context analysis unit of the information processing device 10 sets the threshold to a distance value of 3, for example, and determines the context types whose distance value is 3 or less as next processing candidates.

In the example shown in FIG. 12, the following three context types are selected as context types whose distance value is at or below the threshold of 3:
002 (specify detailed search conditions)
003 (change search genre)
004 (search result output)

The information processing device 10 predicts that these three selected context types are the processes most likely to be executed next (next processing candidates) and, based on this prediction, starts preparing in advance to execute them.
This advance preparation enables a prompt response when an actual processing transition occurs.
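The threshold-based selection of step S53 can be sketched as follows. The distance values are the ones listed above for context type 001; the dictionary layout and the function name `next_process_candidates` are assumptions made for illustration and are not part of the disclosure.

```python
# Distance values from context type 001 (data search), as listed for FIG. 12.
DISTANCES_FROM_001 = {
    "002": 1,  # specify detailed search conditions
    "003": 2,  # change search genre
    "004": 3,  # search result output
    "005": 4,  # narrow down search
    "006": 5,  # share search results on SNS
    "007": 6,  # sort search results
    "008": 7,  # enter search results in schedule
}


def next_process_candidates(distances, threshold):
    """Return the context type IDs whose distance is <= threshold, nearest first."""
    selected = [cid for cid, d in distances.items() if d <= threshold]
    return sorted(selected, key=lambda cid: distances[cid])


candidates = next_process_candidates(DISTANCES_FROM_001, threshold=3)
print(candidates)  # ['002', '003', '004']
```

Sorting nearest first is a design choice of this sketch: it lets a caller with limited resources prepare the most likely transitions before the less likely ones.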

A processing example involving an actual processing transition will be described with reference to FIGS. 14 and 15.
First, each process of (step S71) to (step S73) shown in FIG. 14 will be described.

(Step S71)
First, in step S71, the information processing device receives a user utterance and executes voice analysis processing.
The example shows that the information processing device 10 has received the user utterance and that, as a result of voice analysis by the voice analysis unit, the following single user utterance candidate has been generated:
User utterance candidate = "Tell me a French restaurant"

In this case, the data processing unit of the information processing device 10 starts French restaurant search processing as processing based on the user utterance, that is, as response processing of the information processing device 10.

(Step S72)
Next, in step S72, the context analysis unit of the information processing device 10 acquires various contexts (situation information), such as observation information obtained from the camera 11, the microphone 12, and the sensor 15 of the information processing device, information being displayed on the display unit 13, and information obtained from an external server or the like via the communication unit.

As a result of this analysis, in this processing example, as shown at the tip of the arrow in FIG. 14 (S72), the context type of the acquired context (situation information) 23 is determined to be
Context type = 001 (data search).

(Step S73)
Next, in step S73, the context analysis unit of the information processing device 10 obtains the distance values between the context type (ID = 001) of the acquired context (situation information) and the various other context types, and determines the context types whose distance value is at or below a predetermined threshold as next processing candidates.

The inter-context distance values are obtained by referring to the inter-context distance table 19 shown in FIG. 13.
For example, in the example shown in FIG. 14, the current context type is
Context type = 001 (data search).

In the example shown in FIG. 14, the distance values between the context type of the acquired context (situation information) 23, that is, context type = 001 (data search), and the other context types are as follows.
Distance value between 001 (data search) and 002 (specify detailed search conditions) = 1
Distance value between 001 (data search) and 003 (change search genre) = 2
Distance value between 001 (data search) and 004 (search result output) = 3
Distance value between 001 (data search) and 005 (narrow down search) = 4
Distance value between 001 (data search) and 006 (share search results on SNS) = 5
Distance value between 001 (data search) and 007 (sort search results) = 6
Distance value between 001 (data search) and 008 (enter search results in schedule) = 7
These distance values are obtained from the inter-context distance table 19 shown in FIG. 13.

In step S73, the context analysis unit of the information processing device 10 sets the threshold to a distance value of 3, for example, and determines the context types whose distance value is 3 or less as next processing candidates.

In the example shown in FIG. 14, the following three context types are selected as context types whose distance value is at or below the threshold of 3:
002 (specify detailed search conditions)
003 (change search genre)
004 (search result output)

Next, the processing after an actual transition, caused by an operation of the user 1, to
004 (search result output)
will be described with reference to FIG. 15.

In FIG. 15, after the data search processing shown in FIG. 14, the search results are output, and then the processing of (step S81) to (step S83) shown in FIG. 15 is executed.
Each process of (step S81) to (step S83) shown in FIG. 15 will be described.

(Step S81)
First, in step S81, the information processing device receives a user utterance and executes voice analysis processing.
The example shows that the information processing device 10 has received the user utterance and that, as a result of voice analysis by the voice analysis unit, the following single user utterance candidate has been generated:
User utterance candidate = "Tell me a recommended restaurant in Shibuya"

(Step S82)
Next, in step S82, the context analysis unit of the information processing device 10 acquires various contexts (situation information), such as observation information obtained from the camera 11, the microphone 12, and the sensor 15 of the information processing device, information being displayed on the display unit 13, and information obtained from an external server or the like via the communication unit.

As a result of this analysis, in this processing example, as shown at the tip of the arrow in FIG. 15 (S82), the context type of the acquired context (situation information) 23 is determined to be
Context type = 004 (search result output).

(Step S83)
Next, in step S83, the context analysis unit of the information processing device 10 obtains the distance values between the context type (ID = 004) of the acquired context (situation information) and the various other context types, and determines the context types whose distance value is at or below a predetermined threshold as next processing candidates.

The inter-context distance values are obtained by referring to the inter-context distance table 19 shown in FIG. 13.
For example, in the example shown in FIG. 15, the current context type is
Context type = 004 (search result output).

The information processing device 10 obtains the distance values between the context type of the acquired context (situation information) 23, that is, context type = 004 (search result output), and the other context types, selects only the context types whose distance value is at or below a predetermined threshold, and determines the selected context types to be the next processing candidates that are likely to be executed next.

In the example shown in FIG. 15, the distance values between the context type of the acquired context (situation information) 23, that is, context type = 004 (search result output), and the other context types are as follows.
Distance value between 004 (search result output) and 001 (data search) = 3
Distance value between 004 (search result output) and 002 (specify detailed search conditions) = 6
Distance value between 004 (search result output) and 003 (change search genre) = 7
Distance value between 004 (search result output) and 005 (narrow down search) = 1
Distance value between 004 (search result output) and 006 (share search results on SNS) = 2
Distance value between 004 (search result output) and 007 (sort search results) = 3
Distance value between 004 (search result output) and 008 (enter search results in schedule) = 4
These distance values are obtained from the inter-context distance table 19 shown in FIG. 13.

In step S83, the context analysis unit of the information processing device 10 sets the threshold to a distance value of 5, for example, and determines the context types whose distance value is 5 or less as next processing candidates.

In the example shown in FIG. 15, the following five context types are selected as context types whose distance value is at or below the threshold of 5:
001 (data search)
005 (narrow down search)
006 (share search results on SNS)
007 (sort search results)
008 (enter search results in schedule)

The information processing device 10 predicts that these five selected context types are the processes most likely to be executed next (next processing candidates) and, based on this prediction, starts preparing in advance to execute them.
This advance preparation enables a prompt response when an actual processing transition occurs.
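The transition from FIG. 14 to FIG. 15 can be sketched as a small state holder that recomputes the next processing candidates each time the context type changes. The two table rows and the thresholds 3 and 5 are the example values given above; storing one threshold per context type, the class name `NextProcessPreparer`, and the `prepare` callback are assumptions of this sketch, since the text only says that preparation is started in advance.

```python
# Rows of the inter-context distance table quoted for FIG. 14 and FIG. 15.
DISTANCE_TABLE = {
    "001": {"002": 1, "003": 2, "004": 3, "005": 4, "006": 5, "007": 6, "008": 7},
    "004": {"001": 3, "002": 6, "003": 7, "005": 1, "006": 2, "007": 3, "008": 4},
}
# Assumption: the example thresholds are kept per current context type.
THRESHOLDS = {"001": 3, "004": 5}


class NextProcessPreparer:
    """Track which next processing candidates are currently prepared."""

    def __init__(self, table, thresholds, prepare):
        self.table = table
        self.thresholds = thresholds
        self.prepare = prepare  # hypothetical hook, e.g. open a connection
        self.prepared = set()

    def on_context(self, current_type):
        row = self.table[current_type]
        limit = self.thresholds[current_type]
        candidates = {cid for cid, d in row.items() if d <= limit}
        # Only start preparation for candidates that are not prepared yet.
        for cid in sorted(candidates - self.prepared):
            self.prepare(cid)
        self.prepared = candidates
        return candidates


log = []
preparer = NextProcessPreparer(DISTANCE_TABLE, THRESHOLDS, log.append)
first = preparer.on_context("001")   # state of FIG. 14
second = preparer.on_context("004")  # state of FIG. 15, after the transition
print(sorted(first), sorted(second))
```

In this sketch the prepared set is simply replaced on each transition; whether preparation for dropped candidates should be released is left open.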

[7. Processing sequence of Example 2 executed by the information processing device]
Next, the processing sequence when the information processing device 10 of the present disclosure executes the processing of Example 2 described with reference to FIGS. 12 to 14 will be described.

The processing sequence of Example 2 executed by the information processing device 10 of the present disclosure will be described with reference to the flowchart shown in FIG. 16.
The processing according to the flowchart shown in FIG. 16 is executed according to a program stored in the storage unit of the information processing device 10. For example, it can be executed as program execution processing by a processor such as a CPU having a program execution function.
The processing of each step of the flow shown in FIG. 16 is described below.

In the flow shown in FIG. 16, the processing of steps S101 to S108 is the same as the processing of steps S101 to S108 in the processing flow of Example 1 described above with reference to FIG. 11.
The sequence of Example 2 is obtained by adding the processing of step S109 to the processing flow of Example 1 described above with reference to FIG. 11.

（ステップＳ１０１〜Ｓ１０８）
上述したようにステップＳ１０１〜Ｓ１０８の処理は、先に図１１を参照して説明した実施例１の処理フローにおけるステップＳ１０１〜Ｓ１０８の処理と同様の処理であるので、簡単に説明する。 (Steps S101 to S108)
As described above, the processes of steps S101 to S108 are the same as the processes of steps S101 to S108 in the process flow of the first embodiment described above with reference to FIG. 11, and will be briefly described.

まず、情報処理装置１０は、ステップＳ１０１において、ユーザ発話を入力する。
次に、情報処理装置１０は、ステップＳ１０２において、入力したユーザ発話に対する音声解析処理を実行する。
次に、情報処理装置１０は、ステップＳ１０３において、ステップＳ１０２における音声解析結果として得られたユーザ発話候補が１つのみであるか否かを判定する。 First, the information processing device 10 inputs the user's utterance in step S101.
Next, in step S102, the information processing device 10 executes a voice analysis process for the input user utterance.
Next, in step S103, the information processing device 10 determines whether or not there is only one user utterance candidate obtained as the voice analysis result in step S102.

ステップＳ１０３において、ステップＳ１０２における音声解析結果として得られたユーザ発話候補が２つ以上であると判定した場合には、ステップＳ１０４〜Ｓ１０７の処理を実行する。 If it is determined in step S103 that there are two or more user utterance candidates obtained as the voice analysis result in step S102, the processes of steps S104 to S107 are executed.

まず、情報処理装置１０は、ステップＳ１０４において、コンテクスト（状況情報）を取得し、ステップＳ１０５において、取得したコンテクスト（状況情報）２３のコンテクスト種類と他の様々なコンテクスト種類との間の距離値を取得する。 First, the information processing apparatus 10 acquires the context (situation information) in step S104, and obtains the distance value between the context type of the acquired context (situation information) 23 and various other context types in step S105. get.

次に、情報処理装置１０のコンテクスト解析部は、ステップＳ１０６において、ユーザ発話候補内スロットと同一または類似スロットを主要スロットとして含み、距離値最小のコンテクスト種類を選択する。
次に、ステップＳ１０７において、情報処理装置１０のコンテクスト解析部は、ステップＳ１０６で選択したコンテクスト種類の主要スロットと同一または類似スロットを含むユーザ発話候補を、情報処理装置１０が対応する応答対象のユーザ発話として選択する。 Next, in step S106, the context analysis unit of the information processing device 10 selects the context type that has the smallest distance value among the context types whose main slots include a slot identical or similar to a slot in the user utterance candidates.
Next, in step S107, the context analysis unit of the information processing device 10 selects, as the user utterance to which the information processing device 10 responds, the user utterance candidate that includes a slot identical or similar to a main slot of the context type selected in step S106.

ステップＳ１０３において、ステップＳ１０２における音声解析結果として得られたユーザ発話候補が１つのみであると判定した場合、または、
一方、ステップＳ１０２における音声解析結果として得られたユーザ発話候補が２つ以上であると判定され、ステップＳ１０４〜Ｓ１０７においてコンテクスト解析結果に基づいて、複数のユーザ発話候補から１つのユーザ発話候補が選択されると、ステップＳ１０８の処理が実行される。 The process of step S108 is executed in either of two cases: when it is determined in step S103 that only one user utterance candidate was obtained as the voice analysis result in step S102, or
when it is determined that two or more user utterance candidates were obtained as the voice analysis result in step S102 and one user utterance candidate has been selected from them in steps S104 to S107 based on the context analysis result.
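
The selection performed in steps S103 to S107 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the disclosure: the names `Candidate`, `MAIN_SLOTS`, `DISTANCE`, and `select_utterance`, the context types, and all distance values are hypothetical, and "identical or similar slot" is simplified to an exact match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str
    slots: frozenset  # significant elements (slots) found in the candidate


# Hypothetical stand-in for the context information database:
# context type -> main slots registered for it.
MAIN_SLOTS = {
    "reading": {"denki"},              # light
    "preparing_to_go_out": {"tenki"},  # weather
    "eating": {"gohan"},               # meal
}

# Hypothetical stand-in for the inter-context distance table.
# Keys are alphabetically sorted pairs; smaller value = more similar or related.
DISTANCE = {
    ("eating", "reading"): 5.0,
    ("eating", "preparing_to_go_out"): 2.0,
    ("preparing_to_go_out", "reading"): 4.0,
}


def distance(a, b):
    """Symmetric lookup; a context type is at distance 0 from itself."""
    if a == b:
        return 0.0
    return DISTANCE[tuple(sorted((a, b)))]


def select_utterance(candidates, current_type):
    """Pick the candidate to respond to, or None if no context type matches."""
    # S103: a single candidate needs no context analysis.
    if len(candidates) == 1:
        return candidates[0]
    candidate_slots = set().union(*(c.slots for c in candidates))
    # S105/S106: among context types whose main slots appear in a candidate,
    # choose the one closest to the current context type.
    eligible = [t for t, slots in MAIN_SLOTS.items() if slots & candidate_slots]
    if not eligible:
        return None
    best = min(eligible, key=lambda t: distance(current_type, t))
    # S107: return the candidate containing a main slot of that context type.
    return next(c for c in candidates if c.slots & MAIN_SLOTS[best])
```

With the hypothetical values above, a user who is currently eating and whose utterance yields the candidates "denki" and "tenki" would have the "tenki" candidate selected, because "preparing_to_go_out" is the closer context type.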

（ステップＳ１０９）
最後に、情報処理装置１０は、ステップＳ１０９において、取得コンテクストとの距離がしきい値以下のコンテクストカテゴリを次処理候補として決定する。 (Step S109)
Finally, in step S109, the information processing device 10 determines, as next processing candidates, the context categories whose distance from the acquired context is equal to or less than a threshold value.

このステップＳ１０９の処理を行うことで、情報処理装置１０は、次処理候補となった処理を実行するための準備を、事前に開始することが可能となり、実際の処理遷移が発生した場合に、迅速な対応を行うことができる。 By performing the process of step S109, the information processing device 10 can begin preparing in advance to execute the processes identified as next processing candidates, and can respond quickly when the actual transition to one of those processes occurs.
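
Step S109 amounts to a threshold filter over distance values. The sketch below is illustrative only: the function name `next_processing_candidates`, the shape of `distance_row` (a mapping from context type to its distance from the current one), and the choice to exclude the current context type and sort nearest first are assumptions, not details given in the disclosure.

```python
def next_processing_candidates(current_type, distance_row, threshold):
    """Return context types within `threshold` of the current one, nearest first.

    distance_row maps each context type to its distance from current_type
    (a hypothetical row of the inter-context distance table).
    """
    near = [
        (d, t)
        for t, d in distance_row.items()
        if t != current_type and d <= threshold
    ]
    return [t for d, t in sorted(near)]
```

A caller could then start preparing each returned context type's process before the user actually requests it.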

［８．その他の実施例、変形例について］
次に、上述した実施例と異なる実施例、および変形例について説明する。 [8. Other embodiments and modifications]
Next, embodiments that differ from those described above, and modifications, are described.

上述した実施例の実施例１は、音声解析処理の結果として複数のユーザ発話候補が生成された場合に、カメラ等を介して取得した現在のコンテクスト（状況情報）のコンテクスト種類との距離値が近いコンテクスト種類の主要スロットを含むユーザ発話候補を、情報処理装置１０が対応する応答対象のユーザ発話候補として選択する処理を行う実施例である。 In the first embodiment described above, when a plurality of user utterance candidates are generated as a result of the voice analysis process, the information processing device 10 selects, as the user utterance candidate to respond to, the candidate that includes a main slot of a context type whose distance value from the context type of the current context (situation information), acquired via a camera or the like, is small.

また、実施例２は、カメラ等を介して取得した現在のコンテクスト（状況情報）のコンテクスト種類との距離値が近いコンテクスト種類を、情報処理装置が実行する次処理候補として決定して、次処理への遷移を効率的に行うことを可能とした実施例である。
いずれもコンテクスト種類の距離値を参照して処理を行う実施例である。 Further, in the second embodiment, a context type whose distance value from the context type of the current context (situation information), acquired via a camera or the like, is small is determined as a next processing candidate to be executed by the information processing device, which makes the transition to the next process efficient.
Both are examples in which processing is performed with reference to the distance value of the context type.

このコンテクスト種類の距離値を取得する際に、先に説明した実施例では、図４や図１３に示すコンテクスト間距離テーブル１９を利用していた。
また、コンテクスト種類を特定するために図３に示すコンテクスト情報データベース１８を利用していた。 When acquiring the distance value of this context type, the inter-context distance table 19 shown in FIGS. 4 and 13 was used in the embodiment described above.
In addition, the context information database 18 shown in FIG. 3 was used to specify the context type.

これらのコンテクスト距離テーブルやコンテクスト情報データベースは、一つの固定されたテーブルやデータベースを利用する構成に限らず、例えば、ユーザによるユーザ発話が行われる場所や、情報処理装置１０による処理が実行される場所に応じて変更する構成としてもよい。 The inter-context distance table and the context information database are not limited to a single fixed table or database; for example, they may be switched according to the place where the user utterance is made or the place where the information processing device 10 executes its processing.

例えば、ユーザによるユーザ発話が行われる場所や、情報処理装置１０による処理が実行される場所がキッチンである場合は、キッチンに対応した固有のコンテクスト間距離テーブルや、コンテクスト情報データベースを利用する。
また、ユーザによるユーザ発話が行われる場所や、情報処理装置１０による処理が実行される場所がリビングである場合は、リビングに対応した固有のコンテクスト間距離テーブルを利用する等、場所に応じた固有のテーブルや、コンテクスト情報データベースを利用する構成としてもよい。 For example, when the place where the user utterance is made, or where the information processing device 10 executes its processing, is the kitchen, an inter-context distance table and a context information database specific to the kitchen are used.
Likewise, when that place is the living room, an inter-context distance table specific to the living room is used; in this way, tables and context information databases specific to each place may be used.

さらに、時間帯や季節、例えば、朝、昼間、夜などの時間帯や季節に応じた固有のコンテクスト間距離テーブルや、コンテクスト情報データベースを利用する構成としてもよい。 Further, an inter-context distance table and a context information database specific to a time of day, such as morning, daytime, or night, or to a season may be used.

また、情報処理装置１０との対話を行っているユーザを識別して、識別したユーザ固有のコンテクスト間距離テーブルや、コンテクスト情報データベースを利用する構成としてもよい。 Further, the user interacting with the information processing device 10 may be identified, and an inter-context distance table and a context information database specific to the identified user may be used.

また、ユーザ間のつながり、例えば親子、友達等のユーザ関係を登録したユーザデータベースを参照して、情報処理装置１０との対話を行っているユーザと関係の強い人を優先した発話選択や次処理決定を実行可能としたユーザとユーザ関係を考慮したコンテクスト間距離テーブルや、コンテクスト情報データベースを利用する構成としてもよい。 Further, a user database that registers relationships between users, such as parent and child or friends, may be consulted, and an inter-context distance table and a context information database that take these relationships into account may be used, so that utterance selection and next-process determination give priority to people closely related to the user interacting with the information processing device 10.
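
One way to picture switching tables by user, place, or time of day is a keyed registry with a default fallback. The sketch below is an assumption made for illustration: the function name `pick_table`, the key layout, and the priority order (user first, then place, then time of day) are not specified in the disclosure.

```python
DEFAULT = "default"


def pick_table(tables, location=None, time_slot=None, user=None):
    """Return the most specific registered table, else the default table.

    `tables` maps keys such as ("user", "alice"), ("location", "kitchen"),
    or ("time", "morning") to a table object; tables[DEFAULT] is the
    single fixed table used when nothing more specific is registered.
    The priority order below is a hypothetical choice.
    """
    for key in (("user", user), ("location", location), ("time", time_slot)):
        if key[1] is not None and key in tables:
            return tables[key]
    return tables[DEFAULT]
```

The same lookup could serve both the inter-context distance table and the context information database, each with its own registry.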

また、上述した実施例では、音声解析処理を情報処理装置１０の内部で実行する処理例として説明したが、音声解析は、例えば情報処理装置１０と通信可能な外部サーバ等の外部装置において行う構成としてもよい。 Further, although the above embodiments describe the voice analysis process as being executed inside the information processing device 10, the voice analysis may instead be performed by an external device, such as an external server, that can communicate with the information processing device 10.

また、複数の外部装置を利用して、各々から音声解析結果を取得する構成としてもよい。
このように、複数の外部装置を利用して、各々から音声解析結果を取得する構成とした場合には、各外部装置から異なる音声解析結果（ユーザ発話候補）が得られる場合があるが、この場合は、上述した実施例１の処理を行うことで、複数の音声解析結果（ユーザ発話候補）から、情報処理装置１０が対応すべき、１つの音声解析結果（ユーザ発話候補）を選択することができる。 In addition, a plurality of external devices may be used to acquire voice analysis results from each of them.
When voice analysis results are acquired from a plurality of external devices in this way, the external devices may return different voice analysis results (user utterance candidates). In that case, performing the process of the first embodiment described above makes it possible to select, from the plurality of voice analysis results (user utterance candidates), the one to which the information processing device 10 should respond.

また、例えば外部装置から取得される音声解析結果が、複数のユーザ発話候補を生成することなく、不明瞭な部分をテキスト化しない音声解析結果を生成してしまう場合も想定される。例えば、
「×ｅｎｋｉ，ｏｎｅｇａｉ」
上記のような不明瞭な発話部分を認識不可部分（×）として設定したテキストデータを生成する外部装置も想定される。 It is also conceivable that a voice analysis result acquired from an external device leaves an unclear portion untranscribed instead of producing a plurality of user utterance candidates. For example:
"× enki, onegai"
An external device may generate text data like the above, in which the unclear portion of the utterance is marked as an unrecognizable portion (×).

このような場合、情報処理装置１０は、外部装置から取得した音声認識結果に含まれる認識不可部分に対して、発音類似単語を記録した発音類似単語データベースを参照してデータベースから発音類似単語を選択して、ユーザ発話候補を生成して、その後、実施例１に従った処理を実行する。 In such a case, the information processing device 10 refers to a pronunciation-similar word database, which records words with similar pronunciations, selects pronunciation-similar words for the unrecognizable portion of the speech recognition result acquired from the external device, generates user utterance candidates from them, and then executes the process according to the first embodiment.
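
Candidate generation from a result containing an unrecognizable portion can be sketched as follows. The function name `expand_candidates`, the layout of the pronunciation-similar word database (recognized fragment mapped to similar words), and the sample entries are assumptions made for illustration; the disclosure does not specify how the database is keyed.

```python
import re


def expand_candidates(text, similar_words):
    """Turn a result such as "×enki, onegai" into user utterance candidates.

    similar_words is a hypothetical pronunciation-similar word database that
    maps the recognized fragment following the "×" marker to full words.
    Text without a marker is returned unchanged as the only candidate;
    a fragment missing from the database yields no candidates.
    """
    match = re.search(r"×(\w+)", text)
    if match is None:
        return [text]
    fragment = match.group(1)
    return [
        text[:match.start()] + word + text[match.end():]
        for word in similar_words.get(fragment, [])
    ]
```

The resulting candidates can then be passed to the context-based selection of the first embodiment.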

［９．情報処理装置の構成例について］
次に、本開示の情報処理装置１０の構成例について説明する。
図１７は、ユーザ発話を入力して、ユーザ発話に対応する処理や応答を行う情報処理装置１０の一構成例を示す図である。 [9. Information processing device configuration example]
Next, a configuration example of the information processing device 10 of the present disclosure will be described.
FIG. 17 is a diagram showing a configuration example of an information processing device 10 that inputs a user utterance and performs a process or a response corresponding to the user utterance.

図１７に示すように、情報処理装置１０は、入力部１１０、出力部１２０、データ処理部１３０、記憶部１７０、通信部１８０を有する。
データ処理部１３０は、入力データ解析部１４０、データ処理実行部１５０、出力情報生成部１６０を有する。
また、記憶部１７０は、ユーザ情報ＤＢ（データベース）１７１、コンテクスト情報データベース１７２、コンテクスト間距離テーブル１７３を有する。 As shown in FIG. 17, the information processing device 10 includes an input unit 110, an output unit 120, a data processing unit 130, a storage unit 170, and a communication unit 180.
The data processing unit 130 includes an input data analysis unit 140, a data processing execution unit 150, and an output information generation unit 160.
Further, the storage unit 170 has a user information DB (database) 171, a context information database 172, and an inter-context distance table 173.

なお、入力部１１０、出力部１２０以外のデータ処理部１３０や記憶部１７０は、情報処理装置１０内に構成せず、外部サーバ内に構成してもよい。サーバを利用した構成の場合、情報処理装置１０は、入力部１１０から入力した入力データを、ネットワークを介してサーバに送信し、サーバのデータ処理部１３０の処理結果を受信して、出力部１２０を介して出力する。 The data processing unit 130 and the storage unit 170, that is, the components other than the input unit 110 and the output unit 120, need not be configured in the information processing device 10 and may instead be configured in an external server. In a configuration using a server, the information processing device 10 transmits the input data from the input unit 110 to the server via the network, receives the processing result of the data processing unit 130 on the server, and outputs it via the output unit 120.

次に、図１７に示す情報処理装置１０の構成要素について説明する。
入力部１１０は、音声入力部（マイク）１１１、画像入力部（カメラ）１１２、センサ１１３を有する。
出力部１２０は、音声出力部（スピーカー）１２１、画像出力部（表示部）１２２を有する。
情報処理装置１０は、最低限、これらの構成要素を有する。 Next, the components of the information processing apparatus 10 shown in FIG. 17 will be described.
The input unit 110 includes a voice input unit (microphone) 111, an image input unit (camera) 112, and a sensor 113.
The output unit 120 includes an audio output unit (speaker) 121 and an image output unit (display unit) 122.
The information processing device 10 has at least these components.

なお、音声入力部（マイク）１１１は、図１に示す情報処理装置１０のマイク１２に対応する。
画像入力部（カメラ）１１２は、図１に示す情報処理装置１０のカメラ１１に対応する。
センサ１１３は、図１に示す情報処理装置１０のセンサ１５に対応する。センサ１１３は、例えば距離センサ、ＧＰＳ等の位置センサ、温度センサ等、様々なセンサによって構成される。 The voice input unit (microphone) 111 corresponds to the microphone 12 of the information processing device 10 shown in FIG. 1.
The image input unit (camera) 112 corresponds to the camera 11 of the information processing device 10 shown in FIG. 1.
The sensor 113 corresponds to the sensor 15 of the information processing device 10 shown in FIG. 1. The sensor 113 is composed of various sensors such as a distance sensor, a position sensor such as GPS, and a temperature sensor.

音声出力部（スピーカー）１２１は、図１に示す情報処理装置１０のスピーカー１４に対応する。
画像出力部（表示部）１２２は、図１に示す情報処理装置１０の表示部１３に対応する。
なお、画像出力部（表示部）１２２は、例えば、プロジェクタ等によって構成することも可能であり、また外部装置のテレビの表示部を利用した構成とすることも可能である。 The audio output unit (speaker) 121 corresponds to the speaker 14 of the information processing device 10 shown in FIG. 1.
The image output unit (display unit) 122 corresponds to the display unit 13 of the information processing device 10 shown in FIG. 1.
The image output unit (display unit) 122 can be configured by, for example, a projector or the like, or can be configured by using the display unit of a television of an external device.

データ処理部１３０は、入力データ解析部１４０、データ処理実行部１５０、出力情報生成部１６０を有する。 The data processing unit 130 includes an input data analysis unit 140, a data processing execution unit 150, and an output information generation unit 160.

入力データ解析部１４０は、音声解析部１４１、画像解析部１４２、センサ情報解析部１４３を有する。
出力情報生成部１６０は、出力音声生成部１６１、表示情報生成部１６２を有する。 The input data analysis unit 140 includes a voice analysis unit 141, an image analysis unit 142, and a sensor information analysis unit 143.
The output information generation unit 160 includes an output voice generation unit 161 and a display information generation unit 162.

ユーザの発話音声はマイクなどの音声入力部１１１に入力される。
音声入力部（マイク）１１１は、入力したユーザ発話音声を音声解析部１４１に入力する。
音声解析部１４１は、例えばＡＳＲ（ＡｕｔｏｍａｔｉｃＳｐｅｅｃｈＲｅｃｏｇｎｉｔｉｏｎ）機能を有し、音声データを複数の単語から構成されるテキストデータに変換する。 The user's spoken voice is input to a voice input unit 111 such as a microphone.
The voice input unit (microphone) 111 inputs the input user-spoken voice to the voice analysis unit 141.
The voice analysis unit 141 has, for example, an ASR (Automatic Speech Recognition) function, and converts voice data into text data composed of a plurality of words.

音声解析部１４１は、さらに、テキストデータに対する発話意味解析処理を実行する。音声解析部１４１は、例えば、ＮＬＵ（ＮａｔｕｒａｌＬａｎｇｕａｇｅＵｎｄｅｒｓｔａｎｄｉｎｇ）等の自然言語理解機能を有し、テキストデータからユーザ発話の意図（インテント：Ｉｎｔｅｎｔ）や、発話に含まれる意味のある有意要素（スロット：Ｓｌｏｔ）を推定する。ユーザ発話から、意図（インテント）と、有意要素（スロット）を正確に推定、取得することができれば、情報処理装置１０は、ユーザ発話に対する正確な処理を行うことができる。
音声解析部１４１の解析結果はデータ処理実行部１５０に入力される。 The voice analysis unit 141 further executes an utterance semantic analysis process on the text data. The voice analysis unit 141 has, for example, a natural language understanding function such as NLU (Natural Language Understanding), and estimates from the text data the intention (intent) of the user utterance and the meaningful significant elements (slots) included in the utterance. If the intent and the slots can be accurately estimated and acquired from the user utterance, the information processing device 10 can process the user utterance accurately.
The analysis result of the voice analysis unit 141 is input to the data processing execution unit 150.

画像入力部１１２は、発話ユーザおよびその周囲の画像を撮影して、画像解析部１４２に入力する。
画像解析部１４２は、発話ユーザの顔の表情やユーザの行動、発話ユーザの周囲情報等の解析を行い、この解析結果をデータ処理実行部１５０に入力する。 The image input unit 112 captures an image of the speaking user and the surroundings and inputs the image to the image analysis unit 142.
The image analysis unit 142 analyzes the facial expression of the speaking user, the user's behavior, the surrounding information of the speaking user, and the like, and inputs the analysis result to the data processing execution unit 150.

センサ１１３は、例えば距離センサ、ＧＰＳ等の位置センサ、温度センサ等の各種センサによって構成され、センサ１１３の取得情報は、センサ情報解析部１４３に入力される。
センサ情報解析部１４３は、センサ取得情報に基づいて、例えば現在の位置、気温等のデータを取得して、この解析結果をデータ処理実行部１５０に入力する。 The sensor 113 is composed of various sensors such as a distance sensor, a position sensor such as GPS, and a temperature sensor, and the acquired information of the sensor 113 is input to the sensor information analysis unit 143.
The sensor information analysis unit 143 acquires data such as the current position and temperature based on the sensor acquisition information, and inputs the analysis result to the data processing execution unit 150.

データ処理実行部１５０は、ユーザ識別部１５１、コンテクスト解析部１５２、処理実行部１５３を有する。
ユーザ識別部１５１は、入力データ解析部１４０から入力する情報、例えば画像解析部１４２からの入力情報等に基づいて、カメラ撮影画像に含まれるユーザを識別する。なお、ユーザの顔情報等、ユーザ識別に適用するための情報は記憶部１７０のユーザ情報ＤＢ（データベース）１７１に格納されている。 The data processing execution unit 150 includes a user identification unit 151, a context analysis unit 152, and a processing execution unit 153.
The user identification unit 151 identifies the user included in the image captured by the camera based on the information input from the input data analysis unit 140, for example, the input information from the image analysis unit 142. Information to be applied to user identification, such as user face information, is stored in the user information DB (database) 171 of the storage unit 170.

コンテクスト解析部１５２は、入力データ解析部１４０から入力する情報、例えば画像解析部１４２からの入力情報等に基づいて、現在の状況（コンテクスト）を解析する。
コンテクストとは、入力部１１０を構成するマイク、カメラ、各種センサの入力情報に基づいて把握されるユーザやその周囲の状況を示す情報である。例えば、「ユーザが食事をしている」「ユーザが電話をかけている」など、現在の状況を示すデータである。 The context analysis unit 152 analyzes the current situation (context) based on the information input from the input data analysis unit 140, for example, the input information from the image analysis unit 142.
The context is information indicating the user and the surrounding situation grasped based on the input information of the microphone, the camera, and various sensors constituting the input unit 110. For example, it is data showing the current situation such as "the user is eating" and "the user is making a phone call".

コンテクスト解析部１５２は、さらに、上述した実施例において説明した処理、すなわち、以下の処理を行う。
上述した実施例１に従った処理として、音声解析処理の結果として複数のユーザ発話候補が生成された場合に、カメラ等を介して取得した現在のコンテクスト（状況情報）のコンテクスト種類との距離値が近いコンテクスト種類の主要スロットを含むユーザ発話候補を、情報処理装置１０が対応する応答対象のユーザ発話候補として選択する処理を行う。 The context analysis unit 152 further performs the processes described in the above embodiments, namely the following.
As the process according to the first embodiment described above, when a plurality of user utterance candidates are generated as a result of the voice analysis process, the context analysis unit 152 selects, as the user utterance candidate to which the information processing device 10 responds, the candidate that includes a main slot of a context type whose distance value from the context type of the current context (situation information), acquired via a camera or the like, is small.

また、実施例２に従った処理として、カメラ等を介して取得した現在のコンテクスト（状況情報）のコンテクスト種類との距離値が近いコンテクスト種類を、情報処理装置が実行する次処理候補として決定する処理を行う。
いずれもコンテクスト種類の距離値を参照して処理を行う実施例である。 Further, as the process according to the second embodiment, the context analysis unit 152 determines a context type whose distance value from the context type of the current context (situation information), acquired via a camera or the like, is small as a next processing candidate to be executed by the information processing device.
Both are examples in which processing is performed with reference to the distance value of the context type.

このコンテクスト種類の距離値を取得する際に、記憶部１７０に格納されたコンテクスト間距離テーブル１７３や、コンテクスト情報データベース１７２を利用する。
コンテクスト間距離テーブル１７３は、図４や図１３に示すデータ構成を有するテーブルである。コンテクスト情報データベース１７２は、図３に示すデータ構成を有する。 When acquiring the distance value of this context type, the inter-context distance table 173 stored in the storage unit 170 and the context information database 172 are used.
The inter-context distance table 173 is a table having the data structure shown in FIGS. 4 and 13. The context information database 172 has the data structure shown in FIG. 3.

コンテクスト解析部１５２によるコンテクスト解析結果として選択されたユーザ発話候補（実施例１）や、次処理候補（実施例２）は、処理実行部１５３に入力される。 The user utterance candidate (Example 1) and the next processing candidate (Example 2) selected as the context analysis result by the context analysis unit 152 are input to the processing execution unit 153.

処理実行部１５３は、コンテクスト解析部１５２によるコンテクスト解析結果として選択されたユーザ発話候補（実施例１）や、次処理候補（実施例２）に基づいて、ユーザ発話に対応する処理を行い、また次処理候補として決定された処理の事前準備処理等を行う。 Based on the user utterance candidate (first embodiment) and the next processing candidates (second embodiment) selected as the context analysis result by the context analysis unit 152, the processing execution unit 153 performs the process corresponding to the user utterance and also performs advance preparation and the like for the processes determined as next processing candidates.

出力情報生成部１６０は、出力音声生成部１６１、表示情報生成部１６２を有する。
出力音声生成部１６１は、データ処理実行部１５０の処理実行部１５３によって実行される処理に伴うシステム発話音声を生成する。
出力音声生成部１６１の生成した応答音声情報は、スピーカー等の音声出力部１２１を介して出力される。 The output information generation unit 160 includes an output voice generation unit 161 and a display information generation unit 162.
The output voice generation unit 161 generates a system speech voice associated with the processing executed by the processing execution unit 153 of the data processing execution unit 150.
The response voice information generated by the output voice generation unit 161 is output via the voice output unit 121 such as a speaker.

表示情報生成部１６２は、ユーザに対するシステム発話のテキスト情報や、その他の提示情報を表示する。
例えばユーザが世界地図を見せてというユーザ発話を行った場合、世界地図を表示する。
世界地図は、例えばサービス提供サーバから取得可能である。 The display information generation unit 162 displays the text information of the system utterance to the user and other presentation information.
For example, when the user makes a user utterance to show the world map, the world map is displayed.
The world map can be obtained from, for example, a service providing server.

記憶部１７０のユーザ情報ＤＢ１７１は、例えば情報処理装置１０と対話を行うユーザを識別するための顔情報や年齢、性別等のユーザプロファイル、さらにユーザ間の関係情報等を記録したデータベースである。 The user information DB 171 of the storage unit 170 is a database that records, for example, face information for identifying a user interacting with the information processing device 10, a user profile such as age and gender, and relationship information between users.

コンテクスト情報データベース１７２は、先に図３を参照して説明したように、以下の各データを対応付けて記録したデータベースである。
（ａ）コンテクストＩＤ
（ｂ）コンテクスト種類
（ｃ）コンテクスト対応主要スロット（有意要素） The context information database 172 is a database that records the following items in association with one another, as described above with reference to FIG. 3.
(a) Context ID
(b) Context type
(c) Main slot (significant element) corresponding to the context
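
A record of the context information database and a slot-based lookup over it might look like the sketch below. The names `ContextRecord` and `index_by_slot` and the field names are hypothetical; only the three recorded items come from the description above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextRecord:
    context_id: str    # (a) context ID
    context_type: str  # (b) context type
    main_slots: tuple  # (c) main slots (significant elements)


def index_by_slot(records):
    """Map each main slot to the IDs of the contexts that register it."""
    index = defaultdict(list)
    for record in records:
        for slot in record.main_slots:
            index[slot].append(record.context_id)
    return dict(index)
```

Such an index lets the context analysis find, for a slot in a user utterance candidate, every context type that lists it as a main slot.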

また、コンテクスト間距離テーブル１７３は、先に図４や図１３を参照して説明したように、２つのコンテクスト種類間の距離値を記録したテーブルである。
２つのコンテクスト種類（ＩＤ）間の距離値の値が小さいほど、その２つのコンテクスト種類の類似度や関連性が高いことを意味する。
一方、２つのコンテクスト種類（ＩＤ）間の距離値の値が大きいほど、その２つのコンテクスト種類の類似度や関連性が低いことを意味する。 Further, the inter-context distance table 173 is a table in which distance values between two context types are recorded, as described above with reference to FIGS. 4 and 13.
The smaller the distance value between two context types (IDs), the higher the similarity or relevance of those two context types.
Conversely, the larger the distance value between two context types (IDs), the lower the similarity or relevance of those two context types.

コンテクスト情報データベース１７２や、コンテクスト間距離テーブル１７３は、コンテクスト解析部１５２において実行される処理に際して利用される。 The context information database 172 and the inter-context distance table 173 are used in the processing executed by the context analysis unit 152.

なお、図１７は、情報処理装置１０の構成例として説明したが、前述したように、図１７に示す構成中の入力部１１０、出力部１２０以外のデータ処理部１３０や記憶部１７０は、情報処理装置１０内に構成せず、外部サーバ内に構成してもよい。 Although FIG. 17 has been described as a configuration example of the information processing device 10, as noted above, the data processing unit 130 and the storage unit 170, that is, the components of FIG. 17 other than the input unit 110 and the output unit 120, need not be configured in the information processing device 10 and may instead be configured in an external server.

例えば、図１８に示すように、ユーザ端末である多数の情報処理装置１０とデータ処理サーバ５０を、ネットワークを介して接続する。各情報処理装置１０は、各個人の所有するスマホやＰＣ等の端末や、各家にあるスマートスピーカー等のユーザ端末によって構成される。各情報処理装置１０は、情報処理装置１０で実行される各ユーザとの対話情報や、入力部を介して取得される画像情報、音声情報、センサ検出情報等をデータ処理サーバ５０に送信する。データ処理サーバ５０は各情報処理装置１０から様々な情報を受信して解析を行う。このような構成とすることができる。 For example, as shown in FIG. 18, a large number of information processing devices 10 which are user terminals and a data processing server 50 are connected via a network. Each information processing device 10 is composed of terminals such as smartphones and PCs owned by each individual and user terminals such as smart speakers in each house. Each information processing device 10 transmits to the data processing server 50 the dialogue information with each user executed by the information processing device 10, the image information, the voice information, the sensor detection information, etc. acquired via the input unit. The data processing server 50 receives various information from each information processing device 10 and performs analysis. Such a configuration can be made.

なお、図１８に示すようなネットワーク接続構成において情報処理装置１０と、データ処理サーバ５０各々が実行する処理の区分は様々な設定が可能である。
例えば、図１９に示すように、情報処理装置１０が入力部１１０と出力部１２０を有し、データ処理サーバ５０がデータ処理部１３０や記憶部１７０を有する構成が可能である。 In the network connection configuration as shown in FIG. 18, various settings can be made for the classification of the processing executed by each of the information processing apparatus 10 and the data processing server 50.
For example, as shown in FIG. 19, the information processing apparatus 10 can have an input unit 110 and an output unit 120, and the data processing server 50 can have a data processing unit 130 and a storage unit 170.

あるいは、図２０に示すように、情報処理装置１０が入力部１１０と入力データ解析部１４０、さらに出力情報生成部１６０と出力部１２０を有し、データ処理サーバ５０がデータ処理実行部１５０と記憶部１７０を有する構成とすることも可能である。 Alternatively, as shown in FIG. 20, the information processing device 10 may have the input unit 110, the input data analysis unit 140, the output information generation unit 160, and the output unit 120, while the data processing server 50 has the data processing execution unit 150 and the storage unit 170.

図１８に示すようなネットワーク接続構成とした場合、データ処理サーバ５０は、ネットワーク接続された多数の情報処理装置１０におけるユーザとの対話情報等を入力して解析することが可能となり、より精度の高い解析を行うことが可能となる。 With the network connection configuration shown in FIG. 18, the data processing server 50 can receive and analyze dialogue information and the like between users and the many network-connected information processing devices 10, which enables more accurate analysis.

［１０．情報処理装置のハードウェア構成例について］
次に、図２１を参照して、エージェント装置（情報処理装置）のハードウェア構成例について説明する。
図２１を参照して説明するハードウェアは、先に図１７や、図１９、図２０を参照して説明した情報処理装置１０の１つの具体的なハードウェア構成例であり、また、図１９や図２０を参照して説明したデータ処理サーバ５０を構成する情報処理装置のハードウェア構成の一例でもある。 [10. Information processing device hardware configuration example]
Next, a hardware configuration example of the agent device (information processing device) will be described with reference to FIG. 21.
The hardware described with reference to FIG. 21 is one specific hardware configuration example of the information processing device 10 described above with reference to FIGS. 17, 19, and 20, and is also an example of the hardware configuration of the information processing device constituting the data processing server 50 described with reference to FIGS. 19 and 20.

ＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）３０１は、ＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）３０２、または記憶部３０８に記憶されているプログラムに従って各種の処理を実行する制御部やデータ処理部として機能する。例えば、上述した実施例において説明したシーケンスに従った処理を実行する。ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）３０３には、ＣＰＵ３０１が実行するプログラムやデータなどが記憶される。これらのＣＰＵ３０１、ＲＯＭ３０２、およびＲＡＭ３０３は、バス３０４により相互に接続されている。 The CPU (Central Processing Unit) 301 functions as a control unit or a data processing unit that executes various processes according to a program stored in the ROM (Read Only Memory) 302 or the storage unit 308. For example, the process according to the sequence described in the above-described embodiment is executed. The RAM (Random Access Memory) 303 stores programs and data executed by the CPU 301. These CPU 301, ROM 302, and RAM 303 are connected to each other by a bus 304.

ＣＰＵ３０１はバス３０４を介して入出力インタフェース３０５に接続され、入出力インタフェース３０５には、各種スイッチ、キーボード、マウス、マイクロホン、センサなどよりなる入力部３０６、ディスプレイ、スピーカーなどよりなる出力部３０７が接続されている。ＣＰＵ３０１は、入力部３０６から入力される指令に対応して各種の処理を実行し、処理結果を例えば出力部３０７に出力する。 The CPU 301 is connected to the input/output interface 305 via the bus 304. Connected to the input/output interface 305 are an input unit 306, consisting of various switches, a keyboard, a mouse, a microphone, sensors, and the like, and an output unit 307, consisting of a display, a speaker, and the like. The CPU 301 executes various processes in response to commands input from the input unit 306 and outputs the processing results to, for example, the output unit 307.

入出力インタフェース３０５に接続されている記憶部３０８は、例えばハードディスク等からなり、ＣＰＵ３０１が実行するプログラムや各種のデータを記憶する。通信部３０９は、Ｗｉ−Ｆｉ通信、ブルートゥース（登録商標）（ＢＴ）通信、その他インターネットやローカルエリアネットワークなどのネットワークを介したデータ通信の送受信部として機能し、外部の装置と通信する。 The storage unit 308 connected to the input / output interface 305 is composed of, for example, a hard disk or the like, and stores a program executed by the CPU 301 and various data. The communication unit 309 functions as a transmission / reception unit for Wi-Fi communication, Bluetooth (registered trademark) (BT) communication, and other data communication via a network such as the Internet or a local area network, and communicates with an external device.

入出力インタフェース３０５に接続されているドライブ３１０は、磁気ディスク、光ディスク、光磁気ディスク、あるいはメモリカード等の半導体メモリなどのリムーバブルメディア３１１を駆動し、データの記録あるいは読み取りを実行する。 The drive 310 connected to the input / output interface 305 drives a removable medium 311 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory such as a memory card, and records or reads data.

［１１．本開示の構成のまとめ］
以上、特定の実施例を参照しながら、本開示の実施例について詳解してきた。しかしながら、本開示の要旨を逸脱しない範囲で当業者が実施例の修正や代用を成し得ることは自明である。すなわち、例示という形態で本発明を開示してきたのであり、限定的に解釈されるべきではない。本開示の要旨を判断するためには、特許請求の範囲の欄を参酌すべきである。 [11. Summary of the structure of this disclosure]
As described above, the examples of the present disclosure have been described in detail with reference to the specific examples. However, it is self-evident that one of ordinary skill in the art can modify or substitute the examples without departing from the gist of the present disclosure. That is, the present invention has been disclosed in the form of an example, and should not be construed in a limited manner. In order to judge the gist of this disclosure, the column of claims should be taken into consideration.

なお、本明細書において開示した技術は、以下のような構成をとることができる。
（１）ユーザ発話の音声解析結果として生成された複数のユーザ発話候補から、情報処理装置が対応する１つのユーザ発話を選択するデータ処理部を有し、
前記データ処理部は、
現在の状況情報であるコンテクストを解析し、コンテクスト解析結果を利用して前記複数のユーザ発話候補から情報処理装置が対応する１つのユーザ発話を選択するユーザ発話選択処理を実行する情報処理装置。 The technology disclosed in the present specification can have the following configuration.
(1) An information processing device including a data processing unit that selects, from a plurality of user utterance candidates generated as a voice analysis result of a user utterance, one user utterance to which the information processing device responds,
wherein the data processing unit
analyzes a context, which is current situation information, and executes a user utterance selection process that uses the context analysis result to select, from the plurality of user utterance candidates, the one user utterance to which the information processing device responds.

（２）前記データ処理部は、
現在の状況情報であるコンテクストの種類であるコンテクスト種類を判別し、
判別したコンテクスト種類と類似性または関連性の高いコンテクスト種類を選択して、選択したコンテクスト種類に対応付けてデータベースに登録されたスロットと同一、または類似するスロットを有するユーザ発話を、情報処理装置が対応する１つのユーザ発話として選択する（１）に記載の情報処理装置。 (2) The information processing device according to (1), wherein the data processing unit
determines a context type, which is the type of the context that is the current situation information,
selects a context type having high similarity or relevance to the determined context type, and selects, as the one user utterance to which the information processing device responds, a user utterance having a slot that is the same as or similar to a slot registered in a database in association with the selected context type.

（３）前記データ処理部は、
複数の異なるコンテクスト種類と、各コンテクスト種類に対応する主要スロットを対応付けて記録したコンテクスト情報データベースを参照して、前記ユーザ発話選択処理を実行する（２）に記載の情報処理装置。 (3) The information processing device according to (2), wherein the data processing unit
executes the user utterance selection process with reference to a context information database that records a plurality of different context types in association with the main slot corresponding to each context type.

(4) The information processing device according to (2) or (3), wherein the data processing unit selects the context type having high similarity or relevance to the determined context type with reference to an inter-context distance table that records distance values calculated on the basis of the similarity and relevance between a plurality of context types.

(5) The information processing device according to any one of (1) to (4), further including a voice analysis unit that executes voice analysis processing of the user utterance,
wherein the voice analysis unit generates a plurality of user utterance candidates when the user utterance includes an unclear portion, and
the data processing unit selects, from among the plurality of user utterance candidates generated by the voice analysis unit, the one user utterance to which the information processing device responds.

(6) The information processing device according to any one of (1) to (5), wherein the context is status information based on at least one of detection information from a camera, a microphone, or a sensor, and information on processing being executed in the information processing device.

(7) The information processing device according to any one of (1) to (6), wherein the data processing unit identifies the user who made the user utterance and executes user-specific processing according to the user identification result to select the one user utterance to which the information processing device responds.

(8) The information processing device according to any one of (1) to (7), wherein the data processing unit identifies the place where the user utterance was made and executes place-specific processing according to the place identification result to select the one user utterance to which the information processing device responds.

(9) The information processing device according to any one of (1) to (8), wherein the data processing unit further executes response processing corresponding to the one user utterance selected in the user utterance selection process.

(10) An information processing device including a data processing unit that determines a next processing candidate, which is a candidate for the processing to be executed next by the information processing device,
wherein the data processing unit analyzes a context, which is current status information, and determines the next processing candidate using the context analysis result.

(11) The information processing device according to (10), wherein the data processing unit
determines a context type, which is the type of the context that is the current status information,
selects a context type having high similarity or relevance to the determined context type, and determines the processing corresponding to the selected context type as the next processing candidate.

(12) The information processing device according to (11), wherein the data processing unit selects the context type having high similarity or relevance to the determined context type with reference to an inter-context distance table that records distance values calculated on the basis of the similarity and relevance between a plurality of context types.

(13) The information processing device according to any one of (10) to (12), wherein the data processing unit further executes preparatory processing for the determined next processing candidate.

(14) An information processing method executed in an information processing device,
the information processing device including a data processing unit that selects, from among a plurality of user utterance candidates generated as a voice analysis result of a user utterance, one user utterance to which the information processing device responds,
wherein the data processing unit analyzes a context, which is current status information, and executes a user utterance selection process that uses the context analysis result to select, from among the plurality of user utterance candidates, the one user utterance to which the information processing device responds.

(15) An information processing method executed in an information processing device,
the information processing device including a data processing unit that determines a next processing candidate, which is a candidate for the processing to be executed next by the information processing device,
wherein the data processing unit analyzes a context, which is current status information, and determines the next processing candidate using the context analysis result.

(16) A program that causes an information processing device to execute information processing,
the information processing device including a data processing unit that selects, from among a plurality of user utterance candidates generated as a voice analysis result of a user utterance, one user utterance to which the information processing device responds,
wherein the program causes the data processing unit to analyze a context, which is current status information, and to execute a user utterance selection process that uses the context analysis result to select, from among the plurality of user utterance candidates, the one user utterance to which the information processing device responds.

(17) A program that causes an information processing device to execute information processing,
the information processing device including a data processing unit that determines a next processing candidate, which is a candidate for the processing to be executed next by the information processing device,
wherein the program causes the data processing unit to analyze a context, which is current status information, and to determine the next processing candidate using the context analysis result.
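
As an illustration of configurations (10) to (13), the following is a minimal Python sketch, not the implementation of the present disclosure. The context types, the distance values, the process names, the `max_distance` threshold, and the choice to exclude the current context type itself are all hypothetical assumptions introduced only for this example.

```python
# Hypothetical mapping from context type to the process associated with it.
CONTEXT_PROCESS = {
    "cooking": "show_recipe",
    "eating": "play_dinner_music",
    "going_out": "fetch_weather",
}

# Hypothetical inter-context distance table (smaller = more similar or related).
NEXT_DISTANCE = {
    "cooking": {"cooking": 0.0, "eating": 0.3, "going_out": 0.8},
    "eating": {"cooking": 0.3, "eating": 0.0, "going_out": 0.5},
    "going_out": {"cooking": 0.8, "eating": 0.5, "going_out": 0.0},
}


def next_processing_candidates(current_context, max_distance=0.5):
    """Return processes of context types close to the current one, nearest first.

    Assumption: the current context type itself is skipped, because the
    sketch looks for what is likely to come next.
    """
    ranked = sorted(NEXT_DISTANCE[current_context].items(), key=lambda kv: kv[1])
    return [
        CONTEXT_PROCESS[ctx]
        for ctx, dist in ranked
        if ctx != current_context and dist <= max_distance
    ]


def prepare_candidates(process_names, cache):
    """Stand-in for preparatory processing: mark each candidate as ready."""
    for name in process_names:
        cache.setdefault(name, "ready")
    return cache


if __name__ == "__main__":
    candidates = next_processing_candidates("cooking")
    print(candidates, prepare_candidates(candidates, {}))
```

In a real device the preparatory step would presumably preload data or start a service; here it only records a flag so that the flow stays visible.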

In addition, the series of processes described in the specification can be executed by hardware, by software, or by a combined configuration of both. When the processes are executed by software, a program recording the processing sequence can be installed and executed in memory in a computer built into dedicated hardware, or installed and executed on a general-purpose computer capable of executing various processes. For example, the program can be recorded in advance on a recording medium. Besides being installed on a computer from a recording medium, the program can be received via a network such as a LAN (Local Area Network) or the Internet and installed on a recording medium such as a built-in hard disk.

The various processes described in the specification may be executed not only in chronological order as described, but also in parallel or individually, according to the processing capacity of the device that executes the processes or as needed. Further, in the present specification, a system is a logical set configuration of a plurality of devices, and the devices of each configuration are not limited to being in the same housing.

As described above, according to the configuration of one embodiment of the present disclosure, a device and a method are realized that make it possible to select, from among a plurality of user utterance candidates, one user utterance to which the information processing device responds.
Specifically, for example, the device has a data processing unit that selects, from among a plurality of user utterance candidates generated as a voice analysis result of a user utterance, one user utterance to which the information processing device responds. The data processing unit determines a context type, which is the type of the context that is the current status information, selects a context type having high similarity or relevance to the determined context type, and selects, as the one user utterance to which the information processing device responds, a user utterance having a slot that is the same as or similar to a slot registered in a database in association with the selected context type.
With this configuration, a device and a method are realized that make it possible to select, from among a plurality of user utterance candidates, one user utterance to which the information processing device responds.
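
The selection flow summarized above could be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation of the present disclosure: the context types, main slots, distance values, the `max_distance` threshold, and the `Candidate` structure are hypothetical. Only exact slot matches are checked, and matching of merely similar slots is omitted.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """One user utterance candidate with the slots extracted from it."""
    text: str
    slots: frozenset


# Hypothetical context information database: context type -> main slots.
CONTEXT_SLOTS = {
    "cooking": frozenset({"timer", "recipe", "ingredient"}),
    "eating": frozenset({"restaurant", "menu"}),
    "watching_tv": frozenset({"channel", "volume"}),
}

# Hypothetical inter-context distance table (smaller = more similar or related).
CONTEXT_DISTANCE = {
    "cooking": {"cooking": 0.0, "eating": 0.3, "watching_tv": 0.9},
    "eating": {"cooking": 0.3, "eating": 0.0, "watching_tv": 0.6},
    "watching_tv": {"cooking": 0.9, "eating": 0.6, "watching_tv": 0.0},
}


def select_utterance(
    candidates: Sequence[Candidate],
    current_context: str,
    max_distance: float = 0.5,
) -> Optional[Candidate]:
    """Pick the candidate whose slots match the nearest context type."""
    # Context types within the threshold, nearest first.
    nearby = sorted(
        (dist, ctx)
        for ctx, dist in CONTEXT_DISTANCE[current_context].items()
        if dist <= max_distance
    )
    for _dist, ctx in nearby:
        main_slots = CONTEXT_SLOTS[ctx]
        matching = [c for c in candidates if c.slots & main_slots]
        if matching:
            # Prefer the candidate sharing the most slots with this context.
            return max(matching, key=lambda c: len(c.slots & main_slots))
    return None  # No candidate fits; a real system might ask the user again.


if __name__ == "__main__":
    cands = [
        Candidate("set a diner for five minutes", frozenset({"restaurant"})),
        Candidate("set a timer for five minutes", frozenset({"timer"})),
    ]
    print(select_utterance(cands, "cooking"))
```

Searching context types in order of increasing distance means the determined context type is tried first and related context types serve as fallbacks, which is one plausible reading of "high similarity or relevance".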

10 Agent device
11 Camera
12 Microphone
13 Display unit
14 Speaker
15 Sensor
18 Context information database
19 Inter-context distance table
50 Data processing server
110 Input unit
111 Voice input unit
112 Image input unit
113 Sensor
120 Output unit
121 Voice output unit
122 Image output unit
130 Data processing unit
140 Input data analysis unit
141 Voice analysis unit
142 Image analysis unit
143 Sensor information analysis unit
150 Data processing execution unit
151 User identification unit
152 Context analysis unit
153 Processing execution unit
160 Output information generation unit
161 Output voice generation unit
162 Display information generation unit
170 Storage unit
171 User information DB (database)
172 Context information database
173 Inter-context distance table
301 CPU
302 ROM
303 RAM
304 Bus
305 Input/output interface
306 Input unit
307 Output unit
308 Storage unit
309 Communication unit
310 Drive
311 Removable media

Claims

1. An information processing device comprising a data processing unit that selects, from among a plurality of user utterance candidates generated as a voice analysis result of a user utterance, one user utterance to which the information processing device responds,
wherein the data processing unit analyzes a context, which is current status information, and executes a user utterance selection process that uses the context analysis result to select, from among the plurality of user utterance candidates, the one user utterance to which the information processing device responds.

2. The information processing device according to claim 1, wherein the data processing unit
determines a context type, which is the type of the context that is the current status information,
selects a context type having high similarity or relevance to the determined context type, and selects, as the one user utterance to which the information processing device responds, a user utterance having a slot that is the same as or similar to a slot registered in a database in association with the selected context type.

3. The information processing device according to claim 2, wherein the data processing unit executes the user utterance selection process with reference to a context information database in which a plurality of different context types are recorded in association with the main slots corresponding to each context type.

4. The information processing device according to claim 2, wherein the data processing unit selects the context type having high similarity or relevance to the determined context type with reference to an inter-context distance table that records distance values calculated on the basis of the similarity and relevance between a plurality of context types.

5. The information processing device according to claim 1, further comprising a voice analysis unit that executes voice analysis processing of the user utterance,
wherein the voice analysis unit generates a plurality of user utterance candidates when the user utterance includes an unclear portion, and
the data processing unit selects, from among the plurality of user utterance candidates generated by the voice analysis unit, the one user utterance to which the information processing device responds.

6. The information processing device according to claim 1, wherein the context is status information based on at least one of detection information from a camera, a microphone, or a sensor, and information on processing being executed in the information processing device.

7. The information processing device according to claim 1, wherein the data processing unit identifies the user who made the user utterance and executes user-specific processing according to the user identification result to select the one user utterance to which the information processing device responds.

8. The information processing device according to claim 1, wherein the data processing unit identifies the place where the user utterance was made and executes place-specific processing according to the place identification result to select the one user utterance to which the information processing device responds.

9. The information processing device according to claim 1, wherein the data processing unit further executes response processing corresponding to the one user utterance selected in the user utterance selection process.

10. An information processing device comprising a data processing unit that determines a next processing candidate, which is a candidate for the processing to be executed next by the information processing device,
wherein the data processing unit analyzes a context, which is current status information, and determines the next processing candidate using the context analysis result.

11. The information processing device according to claim 10, wherein the data processing unit
determines a context type, which is the type of the context that is the current status information,
selects a context type having high similarity or relevance to the determined context type, and determines the processing corresponding to the selected context type as the next processing candidate.

12. The information processing device according to claim 11, wherein the data processing unit selects the context type having high similarity or relevance to the determined context type with reference to an inter-context distance table that records distance values calculated on the basis of the similarity and relevance between a plurality of context types.

13. The information processing device according to claim 10, wherein the data processing unit further executes preparatory processing for the determined next processing candidate.

14. An information processing method executed in an information processing device,
the information processing device including a data processing unit that selects, from among a plurality of user utterance candidates generated as a voice analysis result of a user utterance, one user utterance to which the information processing device responds,
wherein the data processing unit analyzes a context, which is current status information, and executes a user utterance selection process that uses the context analysis result to select, from among the plurality of user utterance candidates, the one user utterance to which the information processing device responds.

15. An information processing method executed in an information processing device,
the information processing device including a data processing unit that determines a next processing candidate, which is a candidate for the processing to be executed next by the information processing device,
wherein the data processing unit analyzes a context, which is current status information, and determines the next processing candidate using the context analysis result.

16. A program that causes an information processing device to execute information processing,
the information processing device including a data processing unit that selects, from among a plurality of user utterance candidates generated as a voice analysis result of a user utterance, one user utterance to which the information processing device responds,
wherein the program causes the data processing unit to analyze a context, which is current status information, and to execute a user utterance selection process that uses the context analysis result to select, from among the plurality of user utterance candidates, the one user utterance to which the information processing device responds.

17. A program that causes an information processing device to execute information processing,
the information processing device including a data processing unit that determines a next processing candidate, which is a candidate for the processing to be executed next by the information processing device,
wherein the program causes the data processing unit to analyze a context, which is current status information, and to determine the next processing candidate using the context analysis result.