JP2002116854A

JP2002116854A - Information input device, information input method and storage medium

Info

Publication number: JP2002116854A
Application number: JP2000311099A
Authority: JP
Inventors: Keiichi Sakai; 桂一酒井; Tetsuo Kosaka; 哲夫小坂; Shigeki Shibayama; 茂樹柴山
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2000-10-11
Filing date: 2000-10-11
Publication date: 2002-04-19

Abstract

PROBLEM TO BE SOLVED: To provide an information input device that enables smooth speech input without requiring skilled operation. SOLUTION: The device has speech recognition means whose recognition vocabulary is the labels of menus and links in a target application, including duplicated labels, and the recognition result can be confirmed on a screen or by synthesized speech. A speech input whose label is not duplicated is treated as a shortcut input. For a speech input whose label is duplicated, a value taken from the past input history is used preferentially as the value of the intermediate operation that uniquely determines the input, and speech confirmation is performed with that value. Smooth speech input operation is thus possible.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to an information input device, an information input method, and a storage medium. More particularly, it relates to a voice interface device for interfaces in conventional window systems, web browsers, and the like, where input has been performed through multiple steps using menus, links, and so on. By applying voice input to such an interface, the device enables smoother shortcut input.

[0002]

[Prior art] In general, when operating an application such as a window system or a web browser, the user has had to go through multiple operation steps using menus, links, and the like in order to enter a desired command or obtain desired information.

[0003] In response, window systems, web browsers, and the like have begun to appear that provide (a) a function for assigning a button or label to a frequently used series of operation steps, so that a shortcut can be entered by designating that button or label, and (b) a function for direct input using voice input means or the like.

[0004]

[Problems to be solved by the invention] However, in the conventional devices described above, approach (a) has the problem that the work of assigning shortcut buttons and labels is itself cumbersome. In addition, the number of buttons and labels that can be assigned is limited, and the user has to remember the labels.

[0005] As for (b), direct inputs must be mutually exclusive, so the values of menus and links are constrained, and the user cannot use the function freely without being thoroughly familiar with those values.

[0006] In view of the above problems, an object of the present invention is to provide an information input device that enables smooth voice input without requiring skilled operation.

[0007]

[Means for solving the problems] The information input device of the present invention comprises: voice recognition means whose recognition vocabulary is the labels of menus and links in a target application, including duplicated labels; recognition result confirmation means for presenting the recognition result of the voice recognition means to the user for confirmation; intermediate operation generation means for generating, in the target application, a list of intermediate operations for uniquely determining a voice input that has duplication; intermediate operation candidate creating means for creating, for the voice input that has duplication, candidates for the intermediate operation to be presented to the user from the list of intermediate operations; and confirmation response generation means for generating synthesized voice that presents to the user, for confirmation, the result recognized by the voice recognition means and the intermediate operation candidates created by the intermediate operation candidate creating means.

Another feature of the present invention is that the recognition result confirmation means informs the user of the recognition result on a screen or by synthesized voice for confirmation.

Another feature of the present invention is that the device comprises: event collection means for collecting, from the application targeted by the interface and via communication means including an API, the formats of events that can be input to the application and their labels; event duplication determination means for determining whether the labels of the events collected by the event collection means are duplicated; intermediate operation generation means for generating, when the event duplication determination means determines that event labels are duplicated, a list of intermediate operations for uniquely determining an event; event holding means for holding the formats and labels of the events collected by the event collection means, a flag indicating the presence or absence of duplication determined by the event duplication determination means, and the list of intermediate operations that uniquely determine an event generated by the intermediate operation generation means; intermediate operation candidate creating means for creating, for a voice input that has duplication, candidates for the intermediate operation to be presented to the user from the list of intermediate operations; voice input/output means for performing voice input and output with the user via the interface; voice recognition means for recognizing voice input from the voice input/output means as either one of the event labels held in the event holding means or an affirmative or negative answer to a voice confirmation output by synthesized voice output means; recognition result holding means for holding the recognition result recognized by the voice recognition means; recognition result determination means for determining whether an event held in the event holding means is uniquely determined from the recognition result held in the recognition result holding means; intermediate operation list creating means for presenting to the user, when the recognition result determination means determines that the event is not uniquely determined, a list of intermediate operations for uniquely determining the event from the intermediate operations held in the event holding means; confirmation response generation means for generating synthesized voice to be presented to the user from the recognition result held in the recognition result holding means and the list of intermediate operations created by the intermediate operation candidate creating means; event generation means for generating an input event for the application targeted by the interface from the recognition result holding means and history holding means; and event issuing means for issuing the event to the application targeted by the interface via communication means including an API.

Another feature of the present invention is that the events that can be input to the application are commands that can be entered from a menu or link destinations that can be reached by a mouse click.

Another feature of the present invention is that the device comprises: history holding means for holding a history of the recognition results held in the recognition result holding means; and intermediate operation candidate creating means for creating candidates for the intermediate operation to be presented to the user with reference to the history of recognition results held in the history holding means and the list of intermediate operations held in the event holding means.

Another feature of the present invention is that the device comprises: priority intermediate operation instructing means by which the user specifies, through the voice recognition means and for a specific event, the event and a specific intermediate operation to be given preferentially; priority intermediate operation holding means for holding the priority intermediate operation specified by the priority intermediate operation instructing means; and priority intermediate operation assigning means for creating candidates for the intermediate operation to be presented to the user with reference to the event and intermediate operation held in the priority intermediate operation holding means and the list of intermediate operations held in the event holding means.

Another feature of the present invention is that the device comprises: default intermediate operation holding means for holding in advance, for a specific event, the event and a specific intermediate operation to be given preferentially; and default intermediate operation assigning means for creating candidates for the intermediate operation to be presented to the user with reference to the event and intermediate operation held in the default intermediate operation holding means and the list of intermediate operations held in the event holding means.

[0008] The information input method of the present invention comprises: an event collection step of collecting, from the application targeted by the interface and via communication means including an API, the formats of events that can be input to the application and their labels; an event duplication determination step of determining whether the labels of the events collected in the event collection step are duplicated; an intermediate operation generation step of generating, when the event duplication determination step determines that event labels are duplicated, a list of intermediate operations for uniquely determining the event; an event holding step of holding in an event holding unit the formats and labels of the events collected in the event collection step, a flag indicating the presence or absence of duplication determined in the event duplication determination step, and the list of intermediate operations that uniquely determine an event generated in the intermediate operation generation step; a voice input/output step of performing voice input/output processing with the user; a voice recognition step of recognizing the voice input in the voice input/output step as either one of the event labels held in the event holding unit or an affirmative or negative answer to a voice confirmation output by synthesized voice output processing; a recognition result holding step of holding the recognition result of the voice recognition step in a recognition result holding unit; a recognition result determination step of determining whether an event held in the event holding unit is uniquely determined from the recognition result held in the recognition result holding unit; an intermediate operation list creating step of presenting to the user, when the recognition result determination step determines that the event is not uniquely determined, a list of intermediate operations for uniquely determining the event from the intermediate operations held in the event holding unit; a confirmation response generation step of generating synthesized voice to be presented to the user from the recognition result held in the recognition result holding unit and the list of intermediate operations created in the intermediate operation candidate creating step; an event generation step of generating an input event for the application targeted by the interface from the recognition result holding unit and a history holding unit; and an event issuing step of issuing the event to the application targeted by the interface via communication means including an API.

Another feature of the present invention is that the method comprises: a history holding step of holding in a history holding unit a history of the recognition results held in the recognition result holding unit; and an intermediate operation candidate creating step of creating candidates for the intermediate operation to be presented to the user with reference to the history of recognition results held in the history holding unit and the intermediate operations held in the event holding unit.

Another feature of the present invention is that the method comprises: a priority intermediate operation instructing step in which the user specifies, in the voice recognition step and for a specific one of the events, the event and a specific intermediate operation to be given preferentially; a priority intermediate operation holding step of holding in a priority intermediate operation holding unit the priority intermediate operation specified in the priority intermediate operation instructing step; and a priority intermediate operation assigning step of creating candidates for the intermediate operation to be presented to the user with reference to the event and intermediate operation held in the priority intermediate operation holding unit and the intermediate operations held in the event holding unit.

Another feature of the present invention is that the method comprises: a default intermediate operation holding step of holding in advance in a default intermediate operation holding unit, for a specific one of the events, the event and a specific intermediate operation to be given preferentially; and a default intermediate operation assigning step of creating candidates for the intermediate operation to be presented to the user with reference to the event and intermediate operation held in the default intermediate operation holding unit and the intermediate operations held in the event holding unit.

[0009] The storage medium of the present invention stores, in a computer-readable manner, a program constituting each of the means described above. Another feature of the present invention is that the storage medium stores, in a computer-readable manner, a program for executing the information input method described above.

[0010]

[Operation] Because the present invention has the technical means described above, the user can confirm, on a screen or by synthesized voice, the result recognized by the voice recognition means, whose recognition vocabulary is the labels of menus and links in the target application, including duplicated labels. A voice input without duplication is treated as a shortcut input. For a voice input with duplication, confirmation is performed preferentially using, as the value of the intermediate operation that uniquely determines the input, a value entered in the past, a value registered by the user, or a value predefined by the device. This enables smoother voice input operation.

[0011]

[Embodiments of the invention] [First embodiment] A first embodiment of the information input device, information input method, and storage medium of the present invention will now be described in detail with reference to the accompanying drawings. FIG. 1 is a block diagram showing the schematic configuration of the voice interface device according to an embodiment of the present invention.

[0012] In FIG. 1, reference numeral 1 denotes a voice input unit, which the user uses to input voice via a microphone, a modulator, and the like. Reference numeral 2 denotes a CPU, which performs various kinds of control in the voice interface device of this embodiment and in the target application.

[0013] Reference numeral 3 denotes a ROM, which stores the control programs executed by the CPU 2. The ROM 3 stores the control program for executing the control described in the flowchart below, as well as the application targeted by the voice interface device of this embodiment.

[0014] Reference numeral 4 denotes a RAM, which provides a work area for the CPU 2 to execute various kinds of control. Reference numeral 5 denotes an external memory, which stores, among other things, the events of the application targeted by the voice interface of the device of this embodiment. Reference numeral 6 denotes a voice output unit, which generates and outputs voice based on a synthesized voice signal and provides the synthesized voice to the user via a modulator, a speaker, and the like. Reference numeral 7 denotes a bus, which interconnects the above components and enables data exchange among them.

[0015] FIG. 2 is a block diagram showing the functional configuration of the voice interface device of this embodiment. In FIG. 2, reference numeral 101 denotes an event collection unit that collects, via communication means such as an API, the formats and labels of the events that can be input to the application targeted by the interface (commands that can be entered from a menu, link destinations that can be reached by a mouse click, and so on).

[0016] Reference numeral 102 denotes an event duplication determination unit that determines whether the labels of the events collected by the event collection unit 101 are duplicated. Reference numeral 103 denotes an intermediate operation generation unit that generates a list of intermediate operations for uniquely determining an event when the event duplication determination unit 102 finds that event labels are duplicated.

[0017] Reference numeral 104 denotes an event holding unit that holds data such as the formats and labels of the events collected by the event collection unit 101, the flag indicating the presence or absence of duplication determined by the event duplication determination unit 102, and the list of intermediate operations generated by the intermediate operation generation unit 103.

[0018] Reference numeral 105 denotes a voice input/output unit that performs voice input and output with the user. Reference numeral 106 denotes a voice recognition unit that recognizes voice input from the voice input/output unit 105 as either one of the event labels held in the event holding unit 104 or an affirmative or negative answer to a voice confirmation output by the synthesized voice output unit 112 described later.

[0019] Reference numeral 107 denotes a recognition result holding unit that holds the recognition result recognized by the voice recognition unit 106, and reference numeral 108 denotes a history holding unit that holds a history of the recognition results held in the recognition result holding unit 107. Reference numeral 109 denotes a recognition result determination unit that determines whether an event held in the event holding unit 104 is uniquely determined by the recognition result held in the recognition result holding unit 107 alone.

[0020] Reference numeral 110 denotes an intermediate operation creating unit that, when the recognition result determination unit 109 determines that the event is not uniquely determined, creates intermediate operations for uniquely determining the event from the intermediate operations held in the event holding unit 104 and the history of recognition results held in the history holding unit 108.

[0021] Reference numeral 111 denotes an intermediate operation candidate creating unit that creates candidates for the intermediate operation to be presented to the user, working from the list of intermediate operations for uniquely determining the event produced by the intermediate operation determination unit 110 and referring to the history holding unit 108. Reference numeral 112 denotes a confirmation response generation unit that generates synthesized voice for confirming with the user the recognition result held in the recognition result holding unit 107 and the intermediate operation candidates created by the intermediate operation candidate creating unit.

[0022] Reference numeral 113 denotes an event generation unit that generates an input event for the application targeted by the interface with reference to the event holding unit 104, the recognition result holding unit 107, and the history holding unit 108. Reference numeral 114 denotes an event issuing unit that issues the event to the application targeted by the interface via communication means such as an API. The units from the event collection unit 101 through the event issuing unit 114 are interconnected by a bus 115.

[0023] Next, the operation of the device of this embodiment will be described. FIG. 3 is a flowchart showing the operating procedure of the voice interface device of this embodiment. In FIG. 3, first, in step S201, the event collection unit 101 performs event collection processing, collecting via communication means such as an API the formats and labels of the events that can be input to the application targeted by the interface (commands that can be entered from a menu, link destinations that can be reached by a mouse click, and so on).

[0024] Next, in step S202, the event duplication determination unit 102 performs event duplication determination processing, determining whether the labels of the events collected by the event collection unit 101 are duplicated. If duplication exists, the process proceeds to step S203; if not, step S203 is skipped and the process proceeds to step S204.

[0025] In step S203, the intermediate operation determining unit 103 performs intermediate operation determination processing, determining the intermediate operations for uniquely determining the event, and the process proceeds to step S204.

[0026] In step S204, event holding processing is performed. This holds the formats and labels of the events collected by the event collection unit 101, the flag indicating the presence or absence of duplication determined by the event duplication determination unit 102, and, where the event duplication determination unit 102 has found duplication, the intermediate operations determined by the intermediate operation determining unit 103 for uniquely determining the event. The process then proceeds to step S205.
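Steps S201 to S204 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in the patent: the function name `build_event_store`, the representation of an event as a tuple of menu labels, and the sample menu entries are all hypothetical. The sketch also assumes that the intermediate operations for a duplicated label are the menu steps that precede it.

```python
from collections import defaultdict


def build_event_store(event_paths):
    """Sketch of S201-S204: group collected events by label and flag duplicates.

    Each event is a tuple of menu/link labels, e.g. ("File", "Save").
    The last element is the label used as recognition vocabulary.
    (Hypothetical representation; the patent does not prescribe one.)
    """
    by_label = defaultdict(list)
    for path in event_paths:                      # S201: collected events
        by_label[path[-1]].append(tuple(path))

    store = {}
    for label, paths in by_label.items():
        duplicated = len(paths) > 1               # S202: duplicate check
        # S203: only duplicated labels need intermediate operations;
        # here, the steps leading up to the final label.
        intermediate_ops = [p[:-1] for p in paths] if duplicated else []
        store[label] = {                          # S204: hold everything
            "events": paths,
            "duplicated": duplicated,
            "intermediate_ops": intermediate_ops,
        }
    return store


# Hypothetical menu: "New" appears under both "File" and "Bookmarks".
store = build_event_store([
    ("File", "Save"),
    ("File", "New"),
    ("Bookmarks", "New"),
])
```

In this sketch a unique label such as "Save" can be issued directly as a shortcut, while "New" carries the two intermediate operations needed to tell its events apart.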

[0027] In step S205, the voice recognition unit 106 performs voice recognition processing, recognizing the voice input from the voice input/output unit 105 as either one of the event labels held in the event holding unit 104 or an affirmative or negative answer to a voice confirmation output by the synthesized voice output unit 112. The process then proceeds to step S206.

[0028] In step S206, if the voice recognition result obtained in step S205 is an affirmative or negative answer to a voice confirmation output by the synthesized voice output unit 112, the process proceeds to step S213; otherwise, it proceeds to step S207.

[0029] In step S207, recognition result history holding processing is performed, in which the recognition result recognized by the voice recognition unit 106 is held in the recognition result holding unit 107 and the history holding unit 108. The process then proceeds to step S208.

[0030] In step S208, the recognition result determination unit 109 refers to the event holding unit 104 and determines whether the recognition result held in the recognition result holding unit 107 has duplication. If it does, the process proceeds to step S209; if not, it proceeds to step S211.

[0031] In step S209, the intermediate operation determination unit 110 performs intermediate operation extraction processing, extracting the list of intermediate operations held in the event holding unit 104. When this processing ends, the process proceeds to step S210.

[0032] In step S210, the intermediate operation candidate creating unit 111 performs intermediate operation candidate creating processing. It refers to the history of recognition results held in the history holding unit 108 in order from the most recent, and if an entry is contained in the list of intermediate operations extracted by the intermediate operation determination unit 110, it makes that entry the intermediate operation candidate. The process then proceeds to step S211.
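The history lookup in step S210 can be sketched as below. This is an illustrative reading rather than the patent's own code: `pick_candidate`, the list-based history, and the sample labels are assumptions, and the sketch assumes that intermediate operations are themselves labels the user may have spoken earlier.

```python
def pick_candidate(history, intermediate_ops):
    """Return the most recently recognized intermediate operation, or None.

    history          -- past recognition results, oldest first (hypothetical layout)
    intermediate_ops -- operations that disambiguate the duplicated label
    """
    for past in reversed(history):       # scan from the newest entry backwards
        if past in intermediate_ops:
            return past                  # first hit is the most recent use
    return None                          # nothing in the history applies


# Hypothetical session: the user visited "File" first and "Bookmarks" later.
history = ["File", "Save", "Bookmarks", "Copy"]
candidate = pick_candidate(history, ["File", "Bookmarks"])
no_candidate = pick_candidate(["Copy"], ["File", "Bookmarks"])
```

Scanning newest-first means the candidate tracks what the user did most recently, which is the behavior the step describes. What to do when the history contains no match is not specified in this paragraph, so the sketch simply returns `None`.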

[0033] In step S211, the confirmation response generation unit 112 performs confirmation response generation processing, generating from the recognition result held in the recognition result holding unit 107 the synthesized voice used to confirm with the user. The process then proceeds to step S212.

[0034] In step S212, the voice input/output unit 105 outputs either the confirmation synthesized voice generated by the confirmation response generation unit 112 or synthesized voice presenting the intermediate operation candidate created by the intermediate operation candidate creating unit 111, and the process returns to step S201.

[0035] In step S213, if the response determined in step S206 is affirmative, the process proceeds to step S214; if it is negative, the process proceeds to step S210. In step S214, the event generation unit 114 refers to the event holding unit 104 and generates an event to be sent to the application, the event issuing unit 115 performs an event issuing process that sends the generated event to the application, and the process proceeds to step S201.

[0036] As described above, the voice interface device of this embodiment includes speech recognition means whose recognition vocabulary consists of the menu and link labels, including duplicated labels, of the target application, and it confirms the recognition result on the screen or by synthesized speech. A speech input with no duplication is treated as a shortcut input. For a speech input with duplication, the confirmation preferentially uses a value entered in the past, found by referring to the history, as the value of the intermediate operation needed to determine the input uniquely. This makes operation smoother.

[0037] [Second Embodiment] Next, a second embodiment of the present invention is described with reference to FIG. 4. The first embodiment described the case of claim 5, in which a candidate is presented by referring to the past history when the intermediate operation is not uniquely determined. In claim 3, by contrast, the history holding unit 108 and the intermediate operation candidate creation unit 111 are absent. In step S207, a recognition result holding process is performed that holds the recognition result from the speech recognition unit in the recognition result holding unit 107, and the processing of step S211 is omitted.

[0038] When, as in claim 6, the user registers an intermediate operation in advance for the case where the intermediate operation is not uniquely determined, the device includes: a priority intermediate operation instruction unit 116, through which the user specifies from the speech recognition unit 105 a specific event and the specific intermediate operation to be given preferentially for that event; a priority intermediate operation holding unit 117, which holds the priority intermediate operation specified through the priority intermediate operation instruction unit 116; and a priority intermediate operation assignment unit 118, which creates the intermediate operation candidates to be presented to the user by referring to the events and intermediate operations held in the priority intermediate operation holding unit 117 and the intermediate operations held in the event holding means.

[0039] Before step S201, a priority intermediate operation instruction process is performed in which the user specifies, through the voice input/output unit 105, a specific event and the specific intermediate operation to be given preferentially for that event.

[0040] In the intermediate operation candidate generation process of step S211, the priority intermediate operation assignment unit 118 sets the intermediate operation held in the priority intermediate operation holding unit 117 as a candidate.

[0041] If the history holding unit 108 and the priority intermediate operation holding unit 117 contain different intermediate operations, both may be presented as candidates, or one of them may be given priority.

[0042] When, as in claim 7, a default intermediate operation to be given priority is registered in the device for the case where the intermediate operation is not uniquely determined, the device includes: a default intermediate operation holding unit 119, which holds a specific event together with the specific intermediate operation to be given preferentially for that event; and a default intermediate operation assignment unit 120, which creates the intermediate operation candidates to be presented to the user by referring to the events and intermediate operations held in the default intermediate operation holding unit 119 and the intermediate operations held in the event holding means.

[0043] In the intermediate operation candidate generation process of step S211, the default intermediate operation assignment unit 120 sets the intermediate operation held in the default intermediate operation holding unit 119 as a candidate.

[0044] If two or more of the history holding unit 108, the priority intermediate operation holding unit 117, and the default intermediate operation holding unit 119 contain different intermediate operations, all of them may be presented as candidates, or any one of them may be given priority.
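The choice between presenting all candidates and giving one source priority can be sketched as below. The specification leaves the priority order open, so the default order used here (user-registered value first, then history, then device default) is only an assumption, as are the function name `collect_candidates` and its parameters.

```python
from typing import List, Optional, Sequence, Tuple


def collect_candidates(
    valid_ops: Sequence[str],
    history_pick: Optional[str],
    user_pick: Optional[str],
    default_pick: Optional[str],
    order: Tuple[str, ...] = ("user", "history", "default"),  # assumed order
    present_all: bool = False,
) -> List[str]:
    """Merge intermediate operation candidates from the three holding units.

    Each *_pick argument is the value proposed by one source, or None if that
    source has nothing for the event. Values not in valid_ops are ignored.
    """
    picks = {"history": history_pick, "user": user_pick, "default": default_pick}
    ordered: List[str] = []
    for source in order:
        op = picks[source]
        if op is not None and op in valid_ops and op not in ordered:
            ordered.append(op)
    # Either present every distinct candidate or only the highest-priority one.
    return ordered if present_all else ordered[:1]


# Assumed sample: the user registered "File", but most recently said "Window".
single = collect_candidates(["File", "Window"], "Window", "File", None)
everything = collect_candidates(
    ["File", "Window"], "Window", "File", "File", present_all=True
)
```

Passing a different `order` tuple models a device that prefers the history or the default value instead.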

[0045] The present invention may be applied to a system composed of a plurality of devices or to an apparatus consisting of a single device. Needless to say, the invention is also achieved by supplying a system or apparatus with a recording medium on which the program code of software implementing the functions of the above embodiments is recorded, and having a computer (or CPU or MPU) of that system or apparatus read out and execute the program code stored in the recording medium.

[0046] In this case, the program code read from the recording medium itself implements the functions of the above embodiments, and the recording medium on which the program code is recorded constitutes the present invention.

[0047] Recording media that can be used to supply the program code include, for example, a floppy (registered trademark) disk, a hard disk, an optical disk, a magneto-optical disk, a CD-ROM, a CD-R, magnetic tape, a nonvolatile memory card, and a ROM.

[0048] Needless to say, the functions of the above embodiments are realized not only when the computer executes the program code it has read, but also when an OS or the like running on the computer performs part or all of the actual processing based on the instructions of the program code, and that processing realizes the functions of the above embodiments.

[0049] Needless to say, this also includes the case where the program code read from the recording medium is written into a memory provided on a function expansion board inserted into the computer or in a function expansion unit connected to the computer, after which a CPU or the like provided on that board or unit performs part or all of the actual processing based on the instructions of the program code, and that processing realizes the functions of the above embodiments.

[0050]

[Effects of the Invention] As described above, according to the present invention, the result recognized by speech recognition means whose recognition vocabulary consists of menu and link labels, including duplicated labels, can be confirmed on the screen or by synthesized speech. A speech input with no duplication is treated as a shortcut input. For a speech input with duplication, the confirmation preferentially uses a value entered in the past, a value registered by the user, or a value predefined by the device as the value of the intermediate operation needed to determine the input uniquely. The invention thus provides an information input device that allows smooth speech input without requiring skilled operation, and realizes a smoother speech input operation.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the schematic configuration of the voice interface device according to the first embodiment.

FIG. 2 is a block diagram showing the functional configuration of the voice interface device according to the first embodiment.

FIG. 3 is a flowchart showing the operating procedure of the interface device according to the first embodiment.

FIG. 4 is a block diagram showing the functional configuration of the voice interface device according to the second embodiment.

[Explanation of symbols]

101 Event collection unit
102 Event duplication determination unit
103 Intermediate operation generation unit
104 Event holding unit
105 Voice input/output unit
106 Speech recognition unit
107 Recognition result holding unit
108 History holding unit
109 Recognition result determination unit
110 Intermediate operation list creation unit
111 Intermediate operation candidate creation unit
112 Acknowledgment generation unit (synthesized speech output unit)
113 Event generation unit
114 Event issuing unit
115 Bus

Continued from the front page: (72) Inventor: Shigeki Shibayama, c/o Canon Inc., 3-30-2 Shimomaruko, Ota-ku, Tokyo. F-terms (reference): 5E501 AA01 BA05 CA08 CB15 EA21 FA32 FA43

Claims

[Claims]

1. An information input device comprising: speech recognition means whose recognition vocabulary consists of menu and link labels, including duplicated labels, in a target application; recognition result confirming means for presenting a recognition result of the speech recognition means to a user for confirmation; intermediate operation generating means for generating a list of intermediate operations for uniquely determining a duplicated speech input in the target application; intermediate operation candidate creating means for creating, from the list of intermediate operations, an intermediate operation candidate to be presented to the user; and acknowledgment generating means for generating synthesized speech that presents the result recognized by the speech recognition means and the intermediate operation candidate created by the intermediate operation candidate creating means to the user for confirmation.

2. The information input device according to claim 1, wherein the recognition result confirming means informs the user of the recognition result on a screen or by synthesized speech and has the user confirm it.

3. An information input device comprising: event collecting means for collecting, from an application targeted by an interface and via communication means including an API, the format and label of each event that can be input to the application; event duplication determining means for determining whether the labels of the events collected by the event collecting means are duplicated; intermediate operation generating means for generating, when the event duplication determining means determines that an event label is duplicated, a list of intermediate operations for uniquely determining the event; event holding means for holding the format and label of the events collected by the event collecting means, a flag indicating the presence or absence of duplication determined by the event duplication determining means, and the list of intermediate operations generated by the intermediate operation generating means; intermediate operation candidate creating means for creating, from the list of intermediate operations, an intermediate operation candidate to be presented to the user for a speech input having duplication; voice input/output means for performing voice input/output with the user via the interface; speech recognition means for recognizing whether speech input from the voice input/output means is one of the event labels held in the event holding means or an affirmative or negative response to a spoken confirmation output by the synthesized speech output means; recognition result holding means for holding the recognition result recognized by the speech recognition means; recognition result determining means for determining whether an event held in the event holding means is uniquely determined from the recognition result held in the recognition result holding means; intermediate operation list creating means for creating, when the recognition result determining means determines that the event is not uniquely determined, a list of intermediate operations, drawn from the intermediate operations held in the event holding means, to be presented to the user for uniquely determining the event; acknowledgment generating means for generating synthesized speech to be presented to the user from the recognition result held in the recognition result holding means and the list of intermediate operations created by the intermediate operation candidate creating means; event generating means for generating, from the recognition result holding means and the history holding means, an input event for the application targeted by the interface; and event issuing means for issuing the event to the application targeted by the interface via communication means including an API.

4. The information input device according to claim 3, wherein the event that can be input to the application is a command that can be entered from a menu, or a link that can be followed by a mouse click.

5. The information input device according to claim 4, further comprising: history holding means for holding a history of the recognition results held in the recognition result holding means; and intermediate operation candidate creating means for creating an intermediate operation candidate to be presented to the user by referring to the history of recognition results held in the history holding means and the list of intermediate operations held in the event holding means.

6. The information input device according to claim 4, further comprising: priority intermediate operation instruction means through which the user specifies, from the speech recognition means, a specific event and a specific intermediate operation to be given preferentially for that event; priority intermediate operation holding means for holding the priority intermediate operation specified through the priority intermediate operation instruction means; and priority intermediate operation providing means for creating an intermediate operation candidate to be presented to the user by referring to the event and intermediate operation held in the priority intermediate operation holding means and the list of intermediate operations held in the event holding means.

7. The information input device according to claim 1, further comprising: default intermediate operation holding means for holding in advance a specific event and a specific intermediate operation to be given preferentially for that event; and default intermediate operation providing means for creating an intermediate operation candidate to be presented to the user by referring to the event and intermediate operation held in the default intermediate operation holding means and the list of intermediate operations held in the event holding means.

8. An information input method comprising: an event collection step of collecting, from an application targeted by an interface and via communication means including an API, the format and label of each event that can be input to the application; an event duplication determining step of determining whether the labels of the events collected in the event collection step are duplicated; an intermediate operation generating step of generating, when the event duplication determining step determines that an event label is duplicated, a list of intermediate operations for uniquely determining the event; an event holding step of holding, in an event holding unit, the format and label of the events collected in the event collection step, a flag indicating the presence or absence of duplication determined in the event duplication determining step, and the list of intermediate operations generated in the intermediate operation generating step; a voice input/output step of performing voice input/output with a user; a speech recognition step of recognizing whether the speech input in the voice input/output step is one of the event labels held in the event holding unit or an affirmative or negative response to a spoken confirmation output by synthesized speech output processing; a recognition result holding step of holding the recognition result of the speech recognition step in a recognition result holding unit; a recognition result determining step of determining whether an event held in the event holding unit is uniquely determined from the recognition result held in the recognition result holding unit; an intermediate operation list creating step of creating, when the recognition result determining step determines that the event is not uniquely determined, a list of intermediate operations, drawn from the intermediate operations held in the event holding unit, to be presented to the user for uniquely determining the event; an acknowledgment generating step of generating synthesized speech to be presented to the user from the recognition result held in the recognition result holding unit and the list of intermediate operations created in the intermediate operation candidate creating step; an event generating step of generating, from the recognition result holding unit and the history holding unit, an input event for the application targeted by the interface; and an event issuing step of issuing the event via communication means including an API.

9. The information input method according to claim 8, further comprising: a history holding step of holding, in a history holding unit, a history of the recognition results held in the recognition result holding unit; and an intermediate operation candidate creating step of creating an intermediate operation candidate to be presented to the user by referring to the history of recognition results held in the history holding unit and the intermediate operations held in the event holding unit.

10. The information input method according to claim 8 or 9, further comprising: a priority intermediate operation instructing step in which the user specifies, in the speech recognition step, a specific event among the events and a specific intermediate operation to be given preferentially for that event; a priority intermediate operation holding step of holding the priority intermediate operation specified in the priority intermediate operation instructing step in the intermediate operation holding unit; and a priority intermediate operation providing step of creating an intermediate operation candidate to be presented to the user by referring to the event and intermediate operation held in the priority intermediate operation holding unit and the intermediate operations held in the event holding unit.

11. The information input method according to claim 8, further comprising: a default intermediate operation holding step of holding in advance, in a default intermediate operation holding unit, a specific event among the events and a specific intermediate operation to be given preferentially for that event; and a default intermediate operation providing step of creating an intermediate operation candidate to be presented to the user by referring to the event and intermediate operation held in the default intermediate operation holding unit and the intermediate operations held in the event holding unit.

12. A computer-readable storage medium storing a program that implements each of the means according to claim 1.

13. A computer-readable storage medium storing a program for executing the information input method according to any one of claims 8 to 11.