JP2924717B2

JP2924717B2 - Presentation device

Info

Publication number: JP2924717B2
Application number: JP7167896A
Authority: JP
Inventors: Koichi Shinoda (篠田 浩一)
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1995-06-12
Filing date: 1995-06-12
Publication date: 1999-07-26
Anticipated expiration: 2014-07-26
Also published as: JPH08339198A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a presentation device used for presentations such as talks, lectures, and advertisements. In particular, it relates to a presentation device that recognizes keywords contained in speech and, in response, automatically performs operations such as advancing slide images or enlarging part of an image.

[0002]

[Prior Art] In recent years, computers have become smaller, faster, and more capable, and computer-based presentations have become common in talks, lectures, advertisements, and the like.

[0003] Specifically, instead of the OHP sheets, slides, and the like that were conventionally used, presentation data representing the content of the presentation (hereinafter, the "subject material") is stored on a floppy disk or similar medium.

[0004] Then, at the venue or room where the presentation is given, presentation software installed on a computer (for example, Microsoft's "PowerPoint version 4.0") is started, and the subject material is read from the floppy disk or the like as needed to support the presentation.

[0005] Compared with OHP sheets, slides, and the like, such a computer-based presentation system makes the subject material easier to carry. Moreover, even when the subject material includes information other than still images, such as video and audio, the presentation can be given on the same computer, and synchronizing the media is easy.

[0006]

[Problems to Be Solved by the Invention] In the conventional presentation system described above, a mouse, a keyboard, or the like is used to instruct operations such as switching and starting images and sounds.

[0007] However, the mouse and keyboard are not well suited to giving an effective presentation.

[0008] That is, because a mouse has few buttons, the number of operations it can issue is limited. A keyboard is too large to carry conveniently. In addition, with either device the presentation comes to a halt while the presenter operates it.

[0009] An object of the present invention is therefore to provide a presentation device that can recognize keywords contained in speech input to a microphone and automatically operate on presentation data such as images and sounds based on the recognized keywords.

[0010]

[Means for Solving the Problems] The presentation device of the present invention comprises: a microphone 1; a speech recognition unit 3 that refers to a speech recognition dictionary and recognizes keywords for presentation operations contained in the speech input from the microphone 1; a command generation unit 4 that refers to the correspondence table of the speech recognition dictionary 2a, which describes the correspondence between word models and operation commands, and generates the operation command corresponding to the recognized keyword; a presentation storage unit 6 that stores presentation data such as images and sounds used in the presentation; a presentation operation unit 5 that performs the operation corresponding to the operation command on the stored presentation data; and a presentation output unit 7 that outputs the presentation data resulting from that operation.

[0011] An automatic dictionary creation unit 11 may be added to the above configuration. It automatically extracts keywords from the presentation storage unit 6, associates the word model corresponding to each keyword with an operation command, and registers them in the speech recognition dictionary.

[0012] Word models serving as recognition candidates are registered in the speech recognition dictionary, and a recognition candidate word output unit 12 may be provided to display these word models on the presentation output unit 7. Furthermore, a plurality of presentation output units 7 may be provided, with the recognition candidate word output unit 12 provided for one of them.

[0013]

[Operation] When speech is input from the microphone, the speech recognition unit refers to the word models, recognizes the keywords for operating presentation data contained in the input speech, and outputs the recognition result to the command generation unit. The command generation unit refers to the correspondence table of the speech recognition dictionary, which describes the correspondence between word models and operation commands, generates the operation command corresponding to the recognized keyword, and outputs it to the presentation operation unit.

[0014] The presentation operation unit performs the operation corresponding to the operation command on the presentation data stored in the presentation storage unit. The presentation data resulting from this operation is then output by the presentation output unit.

[0015] When the automatic dictionary creation unit is provided, keywords are automatically extracted from the presentation data stored in the presentation storage unit, and the corresponding word models and operation commands can be associated with each other and registered in the speech recognition dictionary.

[0016] When the recognition candidate word output unit is provided, the word models serving as recognition candidates, extracted from the speech recognition dictionary, are output to the presentation output unit.

[0017] When a plurality of presentation output units are provided, the recognition-candidate word models can be displayed on only one of them.

[0018]

[Embodiments] Embodiments of the present invention will be described with reference to the drawings. FIG. 1(A) is a block diagram showing a first embodiment of the presentation device of the present invention, and FIG. 1(B) is a block diagram showing details of the dictionary management unit.

[0019] This device comprises a microphone 1 for inputting speech, a dictionary management unit 2, a speech recognition unit 3, a command generation unit 4, a presentation operation unit 5, a material storage unit 6, and a presentation output unit 7.

[0020] The dictionary management unit 2 stores the word models serving as recognition candidates together with a speech recognition dictionary 2a (FIG. 1(B)). The speech recognition dictionary 2a holds, as a dictionary, a correspondence table and the like describing the correspondence between word models and operation commands for presentation data.

[0021] The speech recognition dictionary 2a is created before the presentation is given; words are assigned to the operations expected to be needed in the situations anticipated during the presentation. Examples of words used in a presentation include "next slide", "figure to graph", "enlarge formula", "start video", "(title of a slide)", and "(keyword in a slide)".

[0022] The word models are created for these words. That is, if speech models have been created with, for example, the syllable as the unit, a word model is created by concatenating those models. It is also possible to create a word model from speech data of the user uttering the word in advance.

[0023] The speech recognition unit 3 analyzes the input speech as a time series of feature vectors taken at fixed time intervals, and has the function of analyzing the input speech by pattern matching this feature vector sequence against each of the recognition-candidate word models output from the dictionary management unit 2. It then outputs the recognition result to the command generation unit 4.

[0024] Pattern matching methods are described in detail in, for example, "Digital Speech Processing" (Sadaoki Furui, 1985, Tokai University Press) and "Speech Recognition by Probabilistic Models" (Seiichi Nakagawa, 1988, IEICE).
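
As an illustration only, the matching of a feature vector sequence against word models might be sketched as follows. The patent does not name a specific algorithm here, so this sketch assumes dynamic time warping (DTW) with a Euclidean frame distance; the function names and the toy one-dimensional features are hypothetical.

```python
import math

def frame_distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(seq, template):
    """DTW distance between two feature-vector sequences (assumed technique)."""
    n, m = len(seq), len(template)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq[i - 1], template[j - 1])
            # Extend the cheapest of the three allowed predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def recognize(seq, word_models):
    """Return the word whose model is closest to the input sequence."""
    return min(word_models, key=lambda w: dtw_distance(seq, word_models[w]))

# Hypothetical word models: each is a short sequence of 1-D feature vectors.
WORD_MODELS = {
    "next slide": [[0.0], [1.0], [2.0]],
    "start video": [[5.0], [5.0], [4.0]],
}
```

With these toy models, an input that is a slightly stretched version of the first template is matched to "next slide" even though its length differs from the template's.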

[0025] As a speech recognition technique other than the pattern matching described above, keyword spotting may be used: the entire input speech is searched for portions that closely match recognition-candidate words registered in advance, and a word is recognized when its degree of match exceeds a certain threshold.

[0026] One example of this technique is "Continuous Speech Algorithm Based on the Extended Continuous DP Method" (Seiichi Nakagawa, IEICE Transactions, October 1984, Vol. J67-D, No. 10).
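
The thresholding idea behind keyword spotting can be sketched as below. This is a deliberately simplified stand-in, not the extended continuous DP method cited above: it slides a fixed-length window over the utterance, and the scoring formula, function name, and one-dimensional features are all assumptions made for illustration.

```python
def spot_keywords(frames, templates, threshold):
    """Scan the whole utterance and report keywords whose best match exceeds the threshold.

    frames:    list of floats, one hypothetical 1-D feature per frame
    templates: dict mapping keyword -> list of floats
    Returns a list of (keyword, start_frame, score) tuples.
    """
    hits = []
    for word, tmpl in templates.items():
        best = None
        for start in range(len(frames) - len(tmpl) + 1):
            window = frames[start:start + len(tmpl)]
            mean_abs_diff = sum(abs(a - b) for a, b in zip(window, tmpl)) / len(tmpl)
            score = 1.0 / (1.0 + mean_abs_diff)  # 1.0 means a perfect match
            if best is None or score > best[2]:
                best = (word, start, score)
        # A keyword is accepted only when its degree of match exceeds the threshold.
        if best is not None and best[2] > threshold:
            hits.append(best)
    return hits
```

Speech surrounding the keyword is simply ignored, which is what allows the presenter to embed a command such as "next slide" in an ordinary sentence.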

[0027] The command generation unit 4 obtains the correspondence from the table, stored in the dictionary management unit 2, that maps the word models of the speech recognition dictionary 2a to operation commands, and refers to it to generate and output an operation command.

[0028] The material storage unit 6 comprises various information recording media, such as magnetic disks and optical disks, and the drives that operate them. The presentation data to be operated on is stored on these media.

[0029] This presentation data consists of still image data such as slide images, moving image data, audio data, and the like used in the presentation. Each item is recorded in association with a title that identifies it and with, for example, macros for enlarging or reducing all or part of the slide image data.

[0030] The presentation operation unit 5 performs the operation corresponding to the input operation command on the presentation data stored in the material storage unit 6.

[0031] The presentation output unit 7 outputs the speech input from the microphone 1 together with the presentation data resulting from the operation of the presentation operation unit 5, and comprises, for example, a display, speakers, and an amplifier.

[0032] The operation of the presentation device configured as above will now be described. The speaker's voice input from the microphone 1 is output to the presentation output unit 7 and the speech recognition unit 3.

[0033] The speech recognition unit 3 recognizes the input speech using the word models output from the dictionary management unit 2 and outputs the recognition result to the command generation unit 4.

[0034] The command generation unit 4 generates the operation command corresponding to the speech recognition result, that is, the operation command corresponding to the word model that best matches the input speech, and outputs it to the presentation operation unit 5.

[0035] The presentation operation unit 5 performs the operation corresponding to the operation command on the presentation data stored in the material storage unit 6. For example, if the operation command specifies advancing the slide image, the corresponding operation is performed on the presentation data.

[0036] As a result, the next slide image data is read from the material storage unit 6 and displayed on the display of the presentation output unit 7.
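
The flow from recognized word to operation command to slide change could be sketched roughly as follows. The command names (`GOTO_RELATIVE`, `ZOOM`, `PLAY`), the table contents, and the function names are hypothetical; the patent specifies only that a correspondence table maps word models to operation commands.

```python
# Hypothetical correspondence table: recognized word -> (operation, argument).
CORRESPONDENCE_TABLE = {
    "next slide": ("GOTO_RELATIVE", 1),
    "previous slide": ("GOTO_RELATIVE", -1),
    "enlarge formula": ("ZOOM", "formula"),
    "start video": ("PLAY", "video"),
}

def generate_command(recognized_word, table=CORRESPONDENCE_TABLE):
    """Command generation: look up the operation command, or None if unregistered."""
    return table.get(recognized_word)

def apply_command(current_slide, command, slide_count):
    """Presentation operation: return the slide index after applying the command."""
    if command is None:
        return current_slide
    op, arg = command
    if op == "GOTO_RELATIVE":
        # Clamp so that the index stays within the stored slides.
        return max(0, min(slide_count - 1, current_slide + arg))
    return current_slide  # other operations do not change the slide index here
```

Keeping the table separate from the code that applies commands mirrors the split between the dictionary management unit and the presentation operation unit, so the vocabulary can change per presentation without touching the operation logic.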

[0037] Next, a second embodiment of the present invention will be described with reference to FIG. 2. Components equivalent to those described for FIG. 1 are given the same reference numerals, and their description is omitted.

[0038] The presentation device shown in FIG. 2 adds a speech recognition switch unit 8 to the circuit shown in FIG. 1.

[0039] The speech recognition switch unit 8 has, for example, a toggle switch; turning the toggle switch on or off outputs a start command or an end command for speech recognition in the speech recognition unit 3. The toggle switch is preferably attached to, for example, the microphone, in which case the speaker can start and end speech recognition with a simple operation.

[0040] When such a speech recognition switch unit 8 is provided, the speech recognition unit 3 can also perform ordinary word recognition instead of word spotting.

[0041] When the speech recognition switch unit 8 is provided, the speech recognition unit 3 operates as follows.

[0042] While no start command for speech recognition has been output from the speech recognition switch unit 8, the speech recognition unit 3 does not recognize the speech input from the microphone 1. When the toggle switch is turned on, the speech recognition switch unit 8 outputs a start command to the speech recognition unit 3. The speech recognition unit 3 then begins recognizing the input speech and continues for as long as the switch remains on.

[0043] Next, when the toggle switch is turned off, an end command for speech recognition is output to the speech recognition unit 3, and the speech recognition unit 3 stops recognizing the input speech.
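
A minimal sketch of this switch-gated behaviour, under the assumption that audio arrives frame by frame, is given below. The class and method names are hypothetical, and the recognizer itself is reduced to collecting the frames it would process.

```python
class GatedRecognizer:
    """Collects audio frames only between a start command and an end command."""

    def __init__(self):
        self.active = False
        self.buffer = []

    def start(self):
        """Start command: toggle switch turned on."""
        self.active = True
        self.buffer = []

    def feed(self, frame):
        """Called for every frame from the microphone; ignored while inactive."""
        if self.active:
            self.buffer.append(frame)

    def stop(self):
        """End command: toggle switch turned off. Returns the frames to recognize."""
        self.active = False
        frames, self.buffer = self.buffer, []
        return frames
```

Because only the speech between the two commands is handed to the recognizer, ordinary word recognition is sufficient in this configuration, as noted in paragraph [0040].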

[0044] The presentation device shown in FIG. 3 adds a data storage unit 9 to the circuit shown in FIG. 2. The data storage unit 9 has a RAM (Random Access Memory) or the like with enough capacity to hold, on a rolling basis, a required duration of the speech input from the microphone 1. The speech data stored in the data storage unit 9 is output to the speech recognition unit 3.

[0045] When such a data storage unit 9 is provided, the speech recognition unit operates as follows. When the toggle switch of the speech recognition switch unit 8 is turned on at any point during the lecture, the required duration of speech data stored in the data storage unit 9 is output to the speech recognition unit 3.

[0046] The speech recognition unit 3 performs recognition on the required duration of speech data stored in the data storage unit 9 together with the speech input from the microphone 1 after the toggle switch is turned on. When the toggle switch is turned off, the speech recognition unit 3 stops recognizing and then outputs the recognition result to the command generation unit 4.
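
The role of the data storage unit can be illustrated with a fixed-size rolling buffer, so that speech uttered just before the switch is pressed is not lost. This sketch assumes frame-based audio and uses `collections.deque` with `maxlen`; the class name and the `preroll_frames` parameter are hypothetical.

```python
from collections import deque

class PreRollRecognizer:
    """Keeps the most recent frames so recognition can include speech before switch-on."""

    def __init__(self, preroll_frames):
        # The oldest frames are discarded automatically once the buffer is full.
        self.preroll = deque(maxlen=preroll_frames)
        self.active = False
        self.captured = []

    def feed(self, frame):
        if self.active:
            self.captured.append(frame)
        else:
            self.preroll.append(frame)

    def start(self):
        """Switch on: begin with the stored frames, then append live input."""
        self.captured = list(self.preroll)
        self.preroll.clear()
        self.active = True

    def stop(self):
        """Switch off: return everything that should be recognized."""
        self.active = False
        frames, self.captured = self.captured, []
        return frames
```

This tolerates a presenter who starts speaking the keyword slightly before turning the switch on.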

[0047] The presentation device shown in FIG. 4 adds an end command delay unit 10 to the circuit shown in FIG. 2. The end command delay unit 10 delays, by a required time, the delivery to the speech recognition unit 3 of the end command issued when the toggle switch of the speech recognition switch unit 8 is turned off.

[0048] When such an end command delay unit 10 is provided, the speech recognition unit 3 operates as follows. The end command output from the speech recognition switch unit 8 is input to the end command delay unit 10 and is delivered to the speech recognition unit 3 after a fixed time has elapsed. On receiving the delayed end command, the speech recognition unit 3 ends recognition and outputs the recognition result to the command generation unit 4.
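
The delayed end command might be modelled as below. To keep the sketch deterministic, the delay is counted in audio frames rather than in wall-clock time, which is an assumption of this example; the class and method names are likewise hypothetical.

```python
class DelayedStopRecognizer:
    """Keeps capturing for `delay_frames` more frames after the switch is turned off."""

    def __init__(self, delay_frames):
        self.delay_frames = delay_frames
        self.active = False
        self.remaining = None  # frames left before a pending end command takes effect
        self.captured = []
        self.result = None

    def start(self):
        self.active = True
        self.remaining = None
        self.captured = []
        self.result = None

    def request_stop(self):
        """Switch off: the end command enters the delay unit."""
        if self.active and self.remaining is None:
            self.remaining = self.delay_frames
            if self.remaining == 0:
                self._finish()

    def feed(self, frame):
        if not self.active:
            return
        self.captured.append(frame)
        if self.remaining is not None:
            self.remaining -= 1
            if self.remaining == 0:
                self._finish()

    def _finish(self):
        """The delayed end command arrives: stop and publish the captured frames."""
        self.active = False
        self.remaining = None
        self.result = self.captured
```

The delay covers the case where the presenter releases the switch before finishing the keyword, the mirror image of the pre-roll buffer used at switch-on.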

[0049] The presentation device shown in FIG. 5 adds the speech recognition switch unit 8, data storage unit 9, and end command delay unit 10 described above to the configuration shown in FIG. 1.

[0050] In this configuration, the speech recognition unit 3 operates as follows. The speech recognition unit 3 performs recognition on the required duration of speech data stored in the data storage unit 9 together with the speech input from the microphone 1 after the toggle switch is turned on.

[0051] When the toggle switch of the speech recognition switch unit 8 is turned off, the end command from the speech recognition switch unit 8 is output to the end command delay unit 10. Once a predetermined time has elapsed after the end command was output to the end command delay unit 10, the end command is passed on to the speech recognition unit 3. On receiving the delayed end command, the speech recognition unit 3 ends recognition and outputs the recognition result to the command generation unit 4.

[0052] The presentation device shown in FIG. 6 adds an automatic dictionary creation unit 11 to the circuit shown in FIG. 2. The automatic dictionary creation unit 11 has the function of automatically extracting keywords from the presentation data stored in the material storage unit 6 and creating the speech recognition dictionary.

[0053] Specifically, keywords that are useful in any situation, such as "next slide" and "previous slide", are registered in advance. Keywords such as "(title of each slide)" and "(name of a figure)" are then automatically extracted from the presentation data; a word model corresponding to each keyword is created, associated with an operation command, and registered in the speech recognition dictionary.

[0054] With this configuration, the correspondence table between word models and operation commands no longer has to be registered in the speech recognition dictionary by hand for each presentation.
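
Automatic dictionary creation could look roughly like the following sketch. The slide representation (a dict with `title` and `figures` keys) and the command names are assumptions made for the example, and the creation of acoustic word models is omitted: only the keyword-to-command table is built.

```python
# Keywords useful in any presentation are registered in advance (hypothetical commands).
BUILT_IN = {
    "next slide": ("GOTO_RELATIVE", 1),
    "previous slide": ("GOTO_RELATIVE", -1),
}

def build_dictionary(slides, built_in=BUILT_IN):
    """Extract slide titles and figure names and map each to an operation command."""
    table = dict(built_in)  # copy so the built-in entries are never modified
    for index, slide in enumerate(slides):
        title = slide.get("title", "").strip().lower()
        if title and title not in table:
            table[title] = ("GOTO_ABSOLUTE", index)
        for figure in slide.get("figures", []):
            name = figure.strip().lower()
            if name and name not in table:
                table[name] = ("ZOOM_FIGURE", (index, name))
    return table
```

In this sketch the first occurrence of a duplicated title wins; a real system would need a policy for ambiguous keywords, which the patent does not discuss.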

[0055] The presentation device shown in FIG. 7 adds a recognition candidate word output unit 12 to the circuit shown in FIG. 2.

[0056] The recognition candidate word output unit 12 has the function of outputting the recognition-candidate word models output from the dictionary management unit 2 to the presentation output unit 7. Specifically, a configuration in which a plurality of presentation output units 7 are provided and the recognition candidate word output unit 12 is connected to one of them is preferable.

[0057] When the presentation starts, the recognition candidate word output unit 12 reads the word models used in that presentation from the dictionary management unit 2 and outputs them to the presentation output unit 7. The presentation output unit 7 displays the received word models on the display at a position visible to the speaker.

[0058] When the presentation output unit 7 has a plurality of displays, one of them is turned toward the speaker, and the word models are displayed on that display only.

[0059] The presentation device shown in FIG. 8 adds a plurality of sub-command generation units 13₁ to 13ₙ and a media selection unit 14 to the circuit shown in FIG. 2. Each of the sub-command generation units 13₁ to 13ₙ has the function of generating operation commands from external input data, such as a keyboard, a mouse, or another voice.

[0060] The media selection unit 14 has the function of selecting among the operation commands output from the sub-command generation units 13₁ to 13ₙ and the command generation unit 4 and outputting one of them to the presentation output unit 7. Various selection criteria can be adopted; for example, only the operation command that reaches the media selection unit 14 first at a given time may be output to the presentation output unit 7.
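
The "first command to arrive wins" criterion given as an example above can be sketched in a few lines. The tuple layout `(arrival_time, source, command)` and the function name are hypothetical; other selection criteria would replace the `key` function.

```python
def select_command(candidates):
    """Pick one operation command among those issued by several input media.

    candidates: list of (arrival_time, source, command) tuples collected
    within one selection window. Returns the earliest arrival, or None.
    """
    if not candidates:
        return None
    # min() returns the first of equally early entries, so ties keep list order.
    return min(candidates, key=lambda c: c[0])
```

For example, if a mouse click arrives at 0.12 s and a spoken command at 0.30 s within the same window, only the mouse command is forwarded.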

[0061] The present invention is not limited to the embodiments described above and can be modified in various ways within the scope of its gist.

[0062]

[Effects of the Invention] According to the inventions of claims 1 to 4, keywords contained in speech input to the microphone are automatically recognized, and presentation data such as images and sounds can be operated on automatically based on the recognized keywords.

[0063] Specifically, operations on presentation data such as emphasizing an item, converting a figure to a table, enlarging a figure, or jumping to an arbitrary slide can be performed easily by voice. Because the speaker operates the device with his or her own voice, the speaker can give a natural, uninterrupted presentation without being distracted by operating a mouse, keyboard, or the like.

[0064] According to the invention of claim 2, the correspondence table between word models and operation commands does not have to be registered in the speech recognition dictionary for each presentation, which reduces the advance registration work.

[0065] According to the invention of claim 3, the presenter can give the presentation while referring to the keywords shown on the display. There is therefore no need to memorize the keywords in advance, and the recall errors that can occur when relying on memory are prevented.

[0066] According to the invention of claim 4, one of the plurality of displays can be turned toward the presenter and the keywords displayed on that display only. The audience therefore sees only the presentation data, and the loss of attention that displayed keywords would cause is prevented.

[Brief description of the drawings]

FIG. 1: (A) is a block diagram showing a first embodiment of the presentation device of the present invention, and (B) is a block diagram showing details of the dictionary management unit.

FIG. 2 is a block diagram showing a second embodiment of the presentation device of the present invention.

FIG. 3 is a block diagram showing a third embodiment of the presentation device of the present invention.

FIG. 4 is a block diagram showing a fourth embodiment of the presentation device of the present invention.

FIG. 5 is a block diagram showing a fifth embodiment of the presentation device of the present invention.

FIG. 6 is a block diagram showing a sixth embodiment of the presentation device of the present invention.

FIG. 7 is a block diagram showing a seventh embodiment of the presentation device of the present invention.

FIG. 8 is a block diagram showing an eighth embodiment of the presentation device of the present invention.

[Explanation of symbols]

1 Microphone
2 Dictionary management unit
2a Speech recognition dictionary
3 Speech recognition unit
4 Command generation unit
5 Presentation operation unit
6 Material storage unit (presentation storage unit)
7 Presentation output unit
11 Automatic dictionary creation unit
12 Recognition candidate word output unit

Claims

(57) [Claims]

1. A presentation device comprising: a microphone; a speech recognition unit that refers to a speech recognition dictionary and recognizes keywords for presentation operations contained in speech input from the microphone; a command generation unit that refers to a correspondence table of the speech recognition dictionary, which describes the correspondence between word models and operation commands, and generates the operation command corresponding to the recognized keyword; a presentation storage unit that stores presentation data such as images and sounds used in a presentation; a presentation operation unit that performs the operation corresponding to the operation command on the stored presentation data; and a presentation output unit that outputs the presentation data resulting from the operation, the presentation device being characterized by further comprising an automatic dictionary creation unit that automatically extracts keywords from the presentation storage unit, associates the word model corresponding to each keyword with an operation command, and registers them in the speech recognition dictionary.

2. The presentation device according to claim 1, wherein word models serving as recognition candidates are registered in the speech recognition dictionary, and a recognition candidate word output unit is provided for displaying them on the presentation output unit.

3. The presentation device according to claim 2, wherein a plurality of presentation output units are provided, and the recognition candidate word output unit is provided for one of them.