JPH07110734A - Multimodal input analysis system

Info

Publication number: JPH07110734A
Application number: JP5256199A
Authority: JP
Inventors: Mayumi Egashira; まゆみ江頭
Original assignee: PERSONAL JOHO KANKYO KYOKAI
Current assignee: PERSONAL JOHO KANKYO KYOKAI
Priority date: 1993-10-14
Filing date: 1993-10-14
Publication date: 1995-04-25

Abstract

PURPOSE: To provide a multimodal input analysis system in which the user need not point at the object for every operation and can freely choose the input means without being forced to use voice input. CONSTITUTION: The system comprises a pointing input means 1; a voice input means 2; an object information storage unit 3, which stores the position and feature information of each object; an operation target holding unit 4, which stores information on the object currently being operated on; a pointing event table 5, which describes the processing to be executed in response to pointing input; a voice event table 6, which describes the processing to be executed in response to each voice input pattern; a history holding unit 7, which holds the input history; an integrated interpretation unit 8, which integrates and interprets input data from the pointing input means and the voice input means to determine the processing; and a processing execution unit 9, which executes the determined processing.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to an input analysis system, and more particularly to a multimodal analysis system that integrates and analyzes pointing input and voice input.

[0002]

[Prior Art] One known interface that integrates a pointing device with voice input interprets a combination of three inputs: the voice input "move this to here", a pointing input on the object corresponding to "this", and a pointing input on the position corresponding to "here" (Reference: "Examination of information integration model in multimodal interface", Ando et al., Proceedings of the 8th Human Interface Symposium, 1992).

[0003]

[Problems to Be Solved by the Invention] With this conventional multimodal input method, every operation instruction must combine pointing at the operation target with a demonstrative such as "this". For example, to move an object, the user must say "move this to here" and, at almost the same time, point at the object corresponding to "this" and at the position corresponding to "here". Even when operations are repeated on the same object, the object must be pointed at again each time. A further drawback is that operation instructions must always be given by voice.

[0004]

[Means for Solving the Problems] An object of the present invention is to provide a multimodal input analysis system that does not require pointing at the object every time and that lets the user freely choose the input means without being forced to use voice input.

[0005] To this end, the multimodal input analysis system of the present invention is characterized by comprising:
an object information storage unit that stores the position and feature information of each object that can be operated on;
an operation target holding unit that stores information on the current operation target object;
a history holding unit that stores the input history;
a pointing event table that describes the processing to be performed in response to pointing input;
a voice event table that describes the processing to be performed in response to each voice input pattern;
an input determination unit that receives input data from the pointing input means and the voice input means and dispatches processing according to the input data;
a pointing-event independent processing unit that receives only pointing input data from the input determination unit and determines the processing to be executed by referring to the pointing event table;
a pointing information acquisition unit that receives voice input data and a list of pointing input data from the input determination unit, compares the number of referring expressions in the voice input data with the number of pointing input data items, and, if there are fewer pointing input data items, checks, in this order, whether the latest input data in the history holding unit is a pointing input, whether the object information in the operation target holding unit matches the first referring expression in the voice input data, and whether the next input is a pointing input, and adds the matching data to the pointing input data list as soon as it is found;
an identification processing unit that receives the pointing input data list and the voice input data, compares the semantic expressions in the voice input data with the pointing data in the pointing input data list in order of appearance, and replaces each semantic expression with the pointing data that matches it;
a processing determination unit that receives the semantic expression after identification processing and determines the processing to be executed by referring to the voice event table; and
a processing execution unit that executes the determined processing and reflects the result in the object information storage unit and the operation target holding unit.

[0006]

[Embodiment] An embodiment of the present invention will be described with reference to the drawings.

[0007] FIG. 1 is a block diagram showing a multimodal input analysis system according to one embodiment of the present invention. The system comprises a pointing input means 1 such as a mouse or joystick; a voice input means 2 for inputting speech; an object information storage unit 3, which stores the position and feature information of each object; an operation target holding unit 4, which stores information on the object currently being operated on; a pointing event table 5, which describes the processing to be performed in response to pointing input; a voice event table 6, which describes the processing to be performed in response to each voice input pattern; a history holding unit 7, which stores the input history; an integrated interpretation unit 8, which integrates and interprets input data from the pointing input means and the voice input means to determine the processing; and a processing execution unit 9, which executes the determined processing.

[0008] The integrated interpretation unit 8 consists of an input determination unit 21, a pointing-event independent processing unit 22, a pointing information acquisition unit 23, an identification processing unit 24, and a processing determination unit 25.

[0009] When voice input starts, the voice input means 2 outputs a voice input start signal. When voice input ends, it recognizes the content of the input speech and outputs the recognition result. The recognition result is a semantic expression that preserves the input order. For example, the voice input "move this circle to here" is represented by a list such as the one shown in FIG. 5.
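
FIG. 5 is not reproduced in this text, so the exact notation of the semantic expression is unknown. Purely as an illustration, an order-preserving list for "move this circle to here" might look like the following Python sketch. The key names (`demonstrative`, `type`, `refers_to`, `action`) are assumptions made for this example and are not the notation of the figure.

```python
# Hypothetical stand-in for the list of FIG. 5: one dict per phrase, in spoken order.
# All key names are illustrative assumptions, not taken from the patent figure.
utterance = [
    {"demonstrative": "this", "type": "circle", "refers_to": "object"},
    {"demonstrative": "here", "refers_to": "position"},
    {"action": "move"},
]

# Referring expressions are the elements that carry a demonstrative term.
referring_expressions = [e for e in utterance if "demonstrative" in e]
```

Keeping the elements in spoken order matters because the later steps pair referring expressions with pointing inputs by order of appearance.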

[0010] The input determination unit 21 receives data from the pointing input means 1 and the voice input means 2 and dispatches processing according to the data received. As long as it has not received a voice input start signal from the voice input means 2, the input determination unit 21 sends data received from the pointing input means 1 to the pointing-event independent processing unit 22. Once it receives a voice input start signal, it accumulates data received from the pointing input means 1 in a pointing information list until voice input ends, that is, until it next receives voice input data. On receiving the voice input data, it sends the voice input data and the pointing information list to the pointing information acquisition unit 23.
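
The dispatching behavior of the input determination unit 21 can be sketched as a small state machine. This is a minimal sketch under assumptions: events are modeled as tuples whose first item is one of the hypothetical tags `"speech_start"`, `"pointing"`, or `"speech_data"`, and the two callbacks stand in for units 22 and 23.

```python
class InputDeterminationUnit:
    """Sketch of unit 21: routes pointing events depending on whether speech is in progress."""

    def __init__(self, on_pointing_only, on_speech):
        self.on_pointing_only = on_pointing_only  # stands in for unit 22
        self.on_speech = on_speech                # stands in for unit 23
        self.speech_in_progress = False
        self.pointing_list = []

    def receive(self, event):
        kind = event[0]
        if kind == "speech_start":
            # From now on, pointing inputs are buffered instead of handled alone.
            self.speech_in_progress = True
            self.pointing_list = []
        elif kind == "pointing":
            if self.speech_in_progress:
                self.pointing_list.append(event)
            else:
                self.on_pointing_only(event)
        elif kind == "speech_data":
            # Speech has ended: hand over the utterance and the buffered pointing inputs.
            self.speech_in_progress = False
            buffered, self.pointing_list = self.pointing_list, []
            self.on_speech(event[1], buffered)
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
```

A pointing event that arrives before any speech goes straight to the first callback, while one that arrives between the start signal and the recognition result is delivered together with the utterance.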

[0011] The pointing-event independent processing unit 22 refers to the pointing event table 5, retrieves the processing procedure corresponding to the input data, and sends that procedure to the processing execution unit 9. When the pointing information acquisition unit 23 receives the voice input data and the pointing information list from the input determination unit 21, it compares the number of referring expressions in the voice input data with the number of pointing inputs in the pointing information list. If there are fewer pointing inputs, it acquires the missing pointing information by referring to the history holding unit 7 and the operation target holding unit 4 or by obtaining the next pointing input, and adds it to the pointing information list.

[0012] The identification processing unit 24 receives the voice input data and the pointing information list from the pointing information acquisition unit 23 and matches the referring expressions in the voice input data with the pointing information.

[0013] The processing determination unit 25 refers to the voice event table 6, retrieves the processing procedure corresponding to the pattern of the voice input data, and sends it to the processing execution unit 9.

[0014] The processing execution unit 9 performs the actual processing according to the processing procedure it receives, and updates the information in the object information storage unit 3 and the operation target holding unit 4 according to the processing result.
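
As a rough illustration of how a result might be reflected in both stores, the sketch below executes a hypothetical "move" procedure. The dictionary layouts and the function name `execute_move` are assumptions, as is the choice to record the moved object as the current operation target; the text only states that both stores are updated according to the result.

```python
def execute_move(object_store, target_holder, object_id, new_position):
    """Move an object, then record it as the current operation target (assumed behavior)."""
    object_store[object_id]["position"] = new_position  # object information storage unit 3
    target_holder["current"] = object_id                # operation target holding unit 4
    return object_store[object_id]


object_store = {"c1": {"type": "circle", "position": (5, 5)}}
target_holder = {"current": None}
execute_move(object_store, target_holder, "c1", (40, 40))
```

Remembering the operated object in this way is what would later allow a follow-up utterance to refer to it without a new pointing input.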

[0015] FIG. 2 shows the processing flow of the integrated interpretation unit 8 of this embodiment. The processing of the integrated interpretation unit 8 is described in detail below with reference to FIG. 2, using a concrete example.

[0016] Step S31: Wait for input from the pointing input means 1 and the voice input means 2.

[0017] Step S32: Determine whether the received input data is a voice input start signal. If it is, perform the processing from step S34 onward. Otherwise, that is, if the input is pointing input data, pass the input data to the pointing-event independent processing unit 22.

[0018] Step S33: The pointing-event independent processing unit 22 refers to the pointing event table 5, retrieves the processing procedure corresponding to the input data, and sends it to the processing execution unit 9. The processing execution unit 9 executes the steps of the received procedure in order.
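
The contents of the pointing event table 5 are not specified in the text. The sketch below shows one plausible shape, assuming the table maps a hypothetical event kind to an ordered list of steps that the execution unit runs one after another. The event kinds and the step functions `set_target_position` and `redraw` are invented for illustration.

```python
def set_target_position(state, event):
    """Hypothetical step: remember where the user clicked."""
    state["target_position"] = (event["x"], event["y"])


def redraw(state, event):
    """Hypothetical step: count screen refreshes."""
    state["redraw_count"] = state.get("redraw_count", 0) + 1


# Assumed layout of the pointing event table: event kind -> ordered procedure.
POINTING_EVENT_TABLE = {
    "click": [set_target_position, redraw],
    "move": [redraw],
}


def process_pointing_only(event, state, table=POINTING_EVENT_TABLE):
    """Look up the procedure for a pointing event and run its steps in order."""
    for step in table.get(event["kind"], []):
        step(state, event)
    return state
```

An event kind that is missing from the table simply runs no steps in this sketch; the source does not say how unknown events are handled.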

[0019] Step S34: If the received input data is a voice input start signal, the input determination unit 21 waits until it next receives voice input data. Pointing input data entered during this period is stored in the pointing information list. This processing ends when the voice input data is received, at which point the voice input data and the pointing information list are passed to the pointing information acquisition unit 23.

[0020] Step S35: The pointing information acquisition unit 23 compares the number of referring expressions in the voice input data with the number of pointing inputs in the pointing information list. For example, when the voice input data is the semantic expression shown in FIG. 5, the number of referring expressions is the number of elements whose sublist contains a "指示語()" (demonstrative) term, which is 2 in this example. If there are more pointing inputs than referring expressions, the process proceeds to step S36; if there are fewer, to step S37; if the numbers are equal, to step S38.

[0021] In the above example, if the pointing information list consists of a single element, such as [pointing(X,Y)], the number of pointing inputs is 1. Because this is fewer than the number of referring expressions, the process proceeds to step S37.
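
The three-way branch of step S35 can be sketched as follows. The utterance layout (a list of dicts in which a `demonstrative` key marks a referring expression) is an assumption made for this example, since FIG. 5 is not reproduced here.

```python
def count_referring_expressions(utterance):
    """Count the elements that carry a demonstrative term (assumed key name)."""
    return sum(1 for element in utterance if "demonstrative" in element)


def next_step(utterance, pointing_list):
    """Return the label of the step that follows S35."""
    n_expressions = count_referring_expressions(utterance)
    n_pointing = len(pointing_list)
    if n_pointing > n_expressions:
        return "S36"  # too many pointing inputs: error handling
    if n_pointing < n_expressions:
        return "S37"  # too few: acquire the missing pointing information
    return "S38"      # counts agree: go straight to identification
```

With two referring expressions and a single buffered pointing input, this comparison selects the acquisition step rather than identification.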

[0022] Step S36: If there are more pointing inputs than referring expressions, perform error handling, such as displaying an error message, and return to step S31.

[0023] Step S37: If there are fewer pointing inputs than referring expressions, acquire the missing pointing information by referring to the history holding unit 7 and the operation target holding unit 4.

[0024] Pointing information is acquired in the following order. A piece of pointing information may be either pointing input data or information on a specific object.

[0025] ・FIG. 3 shows the method of acquiring object information when the number of pointing inputs is one fewer than the number of referring expressions.

[0026] Step S51: Refer to the history holding unit 7 to check whether the immediately preceding input was a pointing input. If it was, add it to the head of the pointing information list and finish.

[0027] Step S52: If the first referring expression designates an object, refer to the operation target holding unit 4; if there is a current operation target object, add its object information to the head of the pointing information list and finish. If the referring expression contains a condition other than the demonstrative, check whether the operation target object satisfies that condition. In the example of FIG. 5, the "タイプ(まる)" (type: circle) term is such a condition, so the shape of the operation target object is checked to see whether it is a circle. If the condition is not satisfied, the step fails.

[0028] Step S53: Wait for the next input, until either a pointing input arrives or a predetermined time has elapsed since the end of voice input. If a pointing input arrives, add it to the end of the pointing information list and finish. On timeout, finish with failure.
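
Steps S51 to S53 form a fallback chain, which the following sketch tries to capture. It rests on several assumptions: history entries and pointing inputs are tuples tagged `"pointing"`, the first referring expression is a dict with hypothetical `refers_to` and `type` keys, and `wait_for_pointing` is a caller-supplied function that returns the next pointing input or `None` on timeout.

```python
def acquire_one(pointing_list, first_expr, history, current_target, wait_for_pointing):
    """Return the completed pointing list, or None if acquisition fails."""
    # S51: reuse the immediately preceding input if it was a pointing input.
    if history and history[-1][0] == "pointing":
        return [history[-1]] + pointing_list

    # S52: fall back to the current operation target for an object reference.
    if first_expr.get("refers_to") == "object" and current_target is not None:
        wanted = first_expr.get("type")
        if wanted is not None and current_target.get("type") != wanted:
            return None  # extra condition (e.g. shape) not satisfied: failure
        return [("object", current_target)] + pointing_list

    # S53: otherwise wait for one more pointing input (None means timeout).
    following = wait_for_pointing()
    if following is None:
        return None
    return pointing_list + [following]
```

Information recovered from the history or the operation target goes to the head of the list because it precedes the utterance, whereas a late pointing input is appended at the end.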

[0029] ・FIG. 4 shows the method of acquiring object information when the number of pointing inputs is two fewer than the number of referring expressions.

[0030] Step S61: Refer to the history holding unit 7 to check whether the immediately preceding input was a pointing input. If it was, add it to the head of the pointing information list and proceed to step S63.

[0031] Step S62: If the first referring expression designates an object, refer to the operation target holding unit 4; if there is a current operation target object, add its object information to the head of the pointing information list. If there is no current operation target object, finish with failure. If the referring expression contains a condition other than the demonstrative, check whether the operation target object satisfies that condition. If the condition is not satisfied, the step fails.

[0032] Step S63: Wait for the next input, until either a pointing input arrives or a predetermined time has elapsed since the end of voice input. If a pointing input arrives, add it to the end of the pointing information list and finish. On timeout, finish with failure.
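
When two items are missing, the first must come from the history or the operation target and the second from a later pointing input. The sketch below follows steps S61 to S63 under the same assumed data shapes (tagged tuples, a dict with hypothetical `refers_to` and `type` keys, and a caller-supplied `wait_for_pointing` that returns `None` on timeout). The source does not say what happens in S62 when the first expression does not designate an object; treating that case as a failure is an assumption of this sketch.

```python
def acquire_two(pointing_list, first_expr, history, current_target, wait_for_pointing):
    """Return the completed pointing list, or None if acquisition fails."""
    if history and history[-1][0] == "pointing":
        head = history[-1]                       # S61: reuse the preceding pointing input
    else:
        # S62: the head must come from the current operation target.
        if first_expr.get("refers_to") != "object" or current_target is None:
            return None                          # assumed failure when no target applies
        wanted = first_expr.get("type")
        if wanted is not None and current_target.get("type") != wanted:
            return None                          # extra condition not satisfied
        head = ("object", current_target)

    following = wait_for_pointing()              # S63: wait for the second item
    if following is None:
        return None                              # timeout
    return [head] + pointing_list + [following]
```

Unlike the one-missing case, finding the head is not enough here: the procedure always continues to S63 and fails on timeout.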

[0033] Returning to FIG. 2, if the acquisition of pointing information fails in the above processing, error handling is performed in step S36 and the process returns to step S31.

[0034] Step S38: The identification processing unit 24 matches the referring expressions in the voice input data with the pointing information.

[0035] The referring expressions in the semantic expression of the voice input data are matched with the pointing information in the pointing information list in order of appearance. The matching follows the rules below.

[0036] ・When the referring expression designates an object and the pointing information is object information, replace the referring expression with the object information.

[0037] ・When the referring expression designates an object and the pointing information is pointing input data, that is, (X, Y) coordinate data, refer to the object information storage unit 3 to obtain the information of the object located at those (X, Y) coordinates. If a condition is attached to the referring expression, search for an object that satisfies it. Replace the referring expression with the object information.

[0038] ・When the referring expression designates a position, replace the referring expression with the (X, Y) coordinates of the pointing information.
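
The three matching rules above can be sketched as a single pass over the utterance. Everything about the data layout is an assumption: referring expressions are dicts with hypothetical `demonstrative`, `refers_to`, and `type` keys, pointing information is either `("pointing", x, y)` or `("object", info)`, and each stored object has an `id`, a `type`, and a bounding box `bbox` used for the hit test.

```python
def find_object_at(objects, x, y, wanted_type=None):
    """Return the first stored object containing (x, y) that meets the type condition."""
    for obj in objects:
        x0, y0, x1, y1 = obj["bbox"]
        if x0 <= x <= x1 and y0 <= y <= y1 and wanted_type in (None, obj["type"]):
            return obj
    return None


def identify(utterance, pointing_list, objects):
    """Replace each referring expression with the pointing information paired to it."""
    pointers = iter(pointing_list)
    resolved = []
    for element in utterance:
        if "demonstrative" not in element:
            resolved.append(element)  # not a referring expression: keep as is
            continue
        info = next(pointers, None)
        if info is None:
            raise ValueError("fewer pointing items than referring expressions")
        if element["refers_to"] == "position":
            if info[0] != "pointing":
                raise ValueError("a position reference needs coordinate data")
            resolved.append({"position": (info[1], info[2])})
        elif info[0] == "object":
            resolved.append({"object": info[1]["id"]})
        else:
            obj = find_object_at(objects, info[1], info[2], element.get("type"))
            if obj is None:
                raise LookupError("no matching object at the pointed position")
            resolved.append({"object": obj["id"]})
    return resolved
```

Because pairing is strictly by order of appearance, the sketch consumes the pointing list with a single iterator instead of searching for the best match.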

[0039] For example, the semantic expression of FIG. 5 is rewritten as shown in FIG. 6.

[0040] Step S39: Refer to the voice event table 6, search for the voice input pattern that matches the semantic expression of the input generated in step S38, retrieve the processing procedure corresponding to that voice input pattern, and send it to the processing execution unit 9.
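
The format of the voice event table 6 is not given in the text. One way to picture the lookup in step S39 is as pattern matching with wildcards, as in the sketch below. The table layout, the `WILDCARD` marker, and the procedure names `move_object` and `delete_object` are all assumptions made for illustration.

```python
WILDCARD = object()  # matches any value in a pattern element

def element_matches(pattern_element, element):
    """Keys must agree; each pattern value is either a wildcard or an exact match."""
    return pattern_element.keys() == element.keys() and all(
        value is WILDCARD or value == element[key]
        for key, value in pattern_element.items()
    )

def decide_procedure(voice_event_table, meaning):
    """Return the procedure of the first pattern that matches the resolved meaning."""
    for pattern, procedure in voice_event_table:
        if len(pattern) == len(meaning) and all(
            element_matches(p, m) for p, m in zip(pattern, meaning)
        ):
            return procedure
    return None

# Hypothetical table: (pattern, procedure name) pairs.
VOICE_EVENT_TABLE = [
    ([{"object": WILDCARD}, {"position": WILDCARD}, {"action": "move"}], "move_object"),
    ([{"object": WILDCARD}, {"action": "delete"}], "delete_object"),
]
```

Matching happens after identification, so the patterns refer to resolved objects and positions rather than to demonstratives.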

[0041] Step S40: Store the input data in the history holding unit 7 and return to step S31.

[0042]

[Effects of the Invention] As described above, the multimodal input analysis system according to the present invention interprets input using the immediately preceding input history and the current operation target object, so the user no longer has to point at the object every time when repeating operations on the same object. In addition, pointing input is processed on its own even when there is no voice input, so the user is never forced to use voice input.

[Brief description of drawings]

FIG. 1 is a block diagram showing an embodiment of the multimodal input analysis system of the present invention.

FIG. 2 is a diagram showing an example of the processing flow of the integrated interpretation unit of the present invention.

FIG. 3 is a diagram showing an example of the processing flow of the pointing information acquisition unit of the present invention.

FIG. 4 is a diagram showing another example of the processing flow of the pointing information acquisition unit of the present invention.

FIG. 5 is a diagram showing an example of the semantic expression of voice input data.

FIG. 6 is a diagram showing an example of the semantic expression after identification processing.

[Explanation of symbols]

1 Pointing input means
2 Voice input means
3 Object information storage unit
4 Operation target holding unit
5 Pointing event table
6 Voice event table
7 History holding unit
8 Integrated interpretation unit
9 Processing execution unit
21 Input determination unit
22 Pointing-event independent processing unit
23 Pointing information acquisition unit
24 Identification processing unit
25 Processing determination unit

Continuation of the front page: (51) Int. Cl.⁶ G10L 3/00; identification code 531 D; internal reference number 9379-5H

Claims

1. A multimodal input analysis system characterized by comprising:
an object information storage unit that stores the position and feature information of each object that can be operated on;
an operation target holding unit that stores information on the current operation target object;
a history holding unit that stores the input history;
a pointing event table that describes the processing to be performed in response to pointing input;
a voice event table that describes the processing to be performed in response to each voice input pattern;
an input determination unit that receives input data from the pointing input means and the voice input means and dispatches processing according to the input data;
a pointing-event independent processing unit that receives only pointing input data from the input determination unit and determines the processing to be executed by referring to the pointing event table;
a pointing information acquisition unit that receives voice input data and a list of pointing input data from the input determination unit, compares the number of referring expressions in the voice input data with the number of pointing input data items, and, if there are fewer pointing input data items, acquires the missing pointing information and adds it to the pointing input data list;
an identification processing unit that receives the pointing input data list and the voice input data, compares the semantic expressions in the voice input data with the pointing data in the pointing input data list in order of appearance, and replaces each semantic expression with the pointing data that matches it;
a processing determination unit that receives the semantic expression after identification processing and determines the processing to be executed by referring to the voice event table; and
a processing execution unit that executes the determined processing and reflects the result in the object information storage unit and the operation target holding unit.

2. The multimodal input analysis system according to claim 1, wherein the missing pointing information is acquired by checking, in this order, whether the latest input data in the history holding unit is a pointing input, whether the object information in the operation target holding unit matches the first referring expression in the voice input data, and whether the next input is a pointing input, and thereby finding the matching data.

3. The multimodal input analysis system according to claim 2, wherein when the corresponding data is found, the data is added to the pointing input data list.