JP2000242464A - Processor and method for processing voice information and storage medium stored with voice information processing program

Info

Publication number: JP2000242464A
Application number: JP11045119A
Authority: JP
Inventors: Hitoshi Higaki; 整桧垣
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 1999-02-23
Filing date: 1999-02-23
Publication date: 2000-09-08

Abstract

PROBLEM TO BE SOLVED: To decide automatically, from the voice information itself, whether to execute the processing corresponding to a voice command or some other processing, by providing a voice command execution part and a voice information processing part that each execute processing depending on whether the voice information matches a voice command. SOLUTION: The voice information processor includes a voice recognition dictionary 14 that stores voice information corresponding to voice, and a voice command table 17 in which voice commands corresponding to the voice information are stored in advance. Voice information is recognized from input voice using the voice recognition dictionary and compared with the voice commands in the voice command table 17 to check whether a matching voice command exists. When a matching voice command is found, a voice command execution part 18 executes the processing corresponding to that voice command. When no matching command is found, a voice information processing part executes other processing set separately for the voice information.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a voice information processing apparatus and method that are applied to word processors, personal computers, and various other information processing apparatuses and that perform document input and run various applications by voice input, and to a storage medium storing a voice information processing program.

[0002]

[Prior Art] In recent years, advances in voice recognition technology have made it possible to recognize the voice commands of a specific or unspecified speaker from voice information and to give instructions directly to a personal computer by voice command, and also to convert voice information into text information for use as document input.

[0003] At present, however, no voice recognition apparatus has been realized that lets a human converse naturally with a machine. One cause is the variability of utterances in natural dialogue. For example, words that carry no meaning in themselves, such as "um" and "er" (Japanese "anō" and "ēto"), could not be recognized correctly. As prior art addressing this problem, Japanese Patent Application Laid-Open No. 5-197389, for example, proposes a voice recognition apparatus that obtains a time series of voice feature parameters from an input voice signal and, using voice feature vectors obtained from that time series, recognition target words prepared in advance, and syntactic and semantic analysis, recognizes natural utterances that include such meaningless words.

[0004] There was also the problem that sentences whose meaning differs depending on word order were erroneously recognized as having the same meaning. As prior art addressing this problem, Japanese Patent Application Laid-Open No. 6-186994, for example, proposes a voice recognition apparatus that accurately recognizes expressions with a high degree of freedom, such as in word order, by using sentence templates, in which the words to be recognized, word meanings, and word groups are described in the order in which they appear in the input voice, together with a word dictionary in which word meanings and word groups are described.

[0005] In recent years, network services accessed from personal computer terminals have also come into wide use, but the user interface of PC communication terminals had the problems that commands are hard to memorize and that moving between menu items takes time. As prior art addressing this problem, Japanese Patent Application Laid-Open No. 8-194600, for example, proposes an easy-to-use voice input terminal device that converts the recognition result obtained from voice input into a data format based on the communication protocol of the network service, thereby providing a user interface for accessing the network service by voice input.

[0006]

[Problems to be Solved by the Invention] Japanese Patent Application Laid-Open Nos. 5-197389 and 6-186994 cited above provide voice recognition techniques that improve the recognition accuracy of voice information, and Japanese Patent Application Laid-Open No. 8-194600 provides a voice recognition technique used for accessing a user interface by voice input.

[0007] Recognized voice information, however, could be put to several uses: for example, it could be converted into a voice command so that an application is executed according to that command, converted into text information to create a document, or kept as-is as a voice memo. The conventional voice recognition techniques above are not configured so that, when voice information is input to a document processing apparatus, the recognized voice information is used selectively as a voice command, as text information, and so on to support these multiple functions.

[0008] The present invention has been made in view of the above circumstances. It provides a voice information processing apparatus and method, and a storage medium storing a voice information processing program, with which voice information input by voice can be put to multiple uses: for example, converted into a voice command so that an application is executed according to that command, converted into text information to create a document, or used as-is as a voice memo.

[0009]

[Means for Solving the Problems] The present invention is a voice information processing apparatus comprising: a voice recognition dictionary that stores voice information corresponding to voice; a voice command table in which voice commands corresponding to voice information are stored in advance; a voice input unit for inputting voice; a voice recognition unit that recognizes voice information from the input voice on the basis of the voice recognition dictionary; a voice command comparison unit that compares the voice information with the voice commands in the voice command table to determine whether they match; a voice command execution unit that, when the voice information matches a voice command, executes the processing corresponding to that voice command; and a voice information processing unit that, when the voice information does not match a voice command, executes processing set separately for the voice information.
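The branch between the voice command execution unit and the voice information processing unit can be pictured with a minimal Python sketch. It assumes the recognized voice information is already available as a string; the function `dispatch`, the table contents, and the returned strings are hypothetical names chosen for illustration and do not come from the patent.

```python
from typing import Callable, Dict

def dispatch(voice_info: str,
             command_table: Dict[str, Callable[[], str]],
             process_other: Callable[[str], str]) -> str:
    """Route recognized voice information to one of the two processing paths."""
    action = command_table.get(voice_info)   # role of the voice command comparison unit
    if action is not None:
        return action()                      # role of the voice command execution unit
    return process_other(voice_info)         # role of the voice information processing unit

# Hypothetical table contents and fallback handler, for illustration only.
command_table = {
    "save": lambda: "executed: save",
    "print": lambda: "executed: print",
}

def process_other(voice_info: str) -> str:
    return f"passed on: {voice_info}"

print(dispatch("save", command_table, process_other))
print(dispatch("shiwasu de", command_table, process_other))
```

Because the choice depends only on whether the lookup succeeds, the caller never has to declare in advance which kind of processing an utterance is meant for.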

[0010] In the present invention, the voice input unit can be composed of a microphone, an amplifier, a filter, and the like. The voice recognition unit, the voice command comparison unit, and the voice information processing unit may be implemented by a computer comprising a CPU, ROM, RAM, and I/O ports. The voice recognition dictionary and the voice command table can be held in a storage device such as a ROM, a floppy disk, or a hard disk.

[0011] According to the present invention, whether to execute the processing corresponding to a voice command or to execute the separately set processing is determined automatically on the basis of the voice information input from the voice input unit. There is therefore no need to specify the intended processing before the input voice information is processed.

[0012] The voice information processing unit may be configured so that, when it acquires voice information that does not match a voice command during a word search in document creation, it converts that voice information into text information as a search keyword. With this configuration, a word search in document creation can be performed using voice information that does not match a voice command as the search keyword.

[0013] The voice information processing unit may be configured so that, when it acquires voice information that does not match a voice command in the voice command table during document creation, it converts that voice information into text information as words and enters it into the document being created. With this configuration, voice information that does not match a voice command can be entered into the document as words.

[0014] The voice information processing unit may be configured so that, when it acquires voice information that does not match a voice command during voice memo processing, it enters that voice information into the voice memo as-is, without converting it into text information. With this configuration, voice information that does not match a voice command can be entered as a voice memo while remaining voice information.
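The three optional configurations in [0012] to [0014] differ only in where non-command voice information ends up. A minimal sketch, assuming each utterance carries both its raw voice data and its recognized text; the names `Mode`, `Utterance`, `Buffers`, and `process_non_command` are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List

class Mode(Enum):
    WORD_SEARCH = auto()   # word search during document creation ([0012])
    DOCUMENT = auto()      # document creation ([0013])
    VOICE_MEMO = auto()    # voice memo processing ([0014])

@dataclass
class Utterance:
    audio: bytes   # voice data as captured
    text: str      # text obtained by voice recognition

@dataclass
class Buffers:
    search_keywords: List[str] = field(default_factory=list)
    document: List[str] = field(default_factory=list)
    voice_memo: List[bytes] = field(default_factory=list)

def process_non_command(mode: Mode, utterance: Utterance, buffers: Buffers) -> None:
    """Handle voice information that matched no voice command."""
    if mode is Mode.WORD_SEARCH:
        buffers.search_keywords.append(utterance.text)   # text used as a search keyword
    elif mode is Mode.DOCUMENT:
        buffers.document.append(utterance.text)          # text entered into the document
    elif mode is Mode.VOICE_MEMO:
        buffers.voice_memo.append(utterance.audio)       # voice kept as-is, no text conversion

buffers = Buffers()
process_non_command(Mode.VOICE_MEMO, Utterance(b"\x01\x02", "kotoshi mo"), buffers)
print(buffers.voice_memo)
```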

[0015] The voice information processing unit may be configured so that, when converting voice information acquired during a word search in document creation into text information as a search keyword, it deletes words that carry no meaning from the converted text information. With this configuration, a search can be performed with the voice recognition word as the keyword while meaningless words such as connectives are excluded.

[0016] The voice information processing unit may be configured so that, when converting voice information acquired during document creation into text information as words, it determines the end of a sentence, converts the corresponding input into punctuation marks, and enters them into the document being created. With this configuration, the end of a sentence can be determined and punctuation entered while a document is being input by voice.

[0017] The voice information processing unit may be configured so that, when text information entered during document creation matches a voice command, it deletes that text information from the document. With this configuration, when voice input that includes voice commands has been entered into a document as text information, the text corresponding to the voice commands, which the document does not need, can be deleted from it.

[0018] The voice information processing unit may be configured so that, when voice information entered during voice memo processing includes a voice command, it deletes that voice information from the voice memo. With this configuration, when voice information that includes voice commands has been entered into a voice memo, the voice information corresponding to the voice commands, which the voice memo does not need, can be deleted from it.
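The clean-up described in [0017] and [0018] amounts to filtering out whatever matches the voice command table, from the text in one case and from the recorded voice in the other. The sketch below assumes the document is held as a list of words and the voice memo as a list of segments that pair recognized text with raw voice data; `Segment`, `remove_command_text`, and `remove_command_audio` are hypothetical names.

```python
from typing import List, NamedTuple, Set

class Segment(NamedTuple):
    text: str     # recognized text for one stretch of speech
    audio: bytes  # the voice data for the same stretch

def remove_command_text(words: List[str], commands: Set[str]) -> List[str]:
    """Drop words that match a voice command from the document text ([0017])."""
    return [word for word in words if word not in commands]

def remove_command_audio(memo: List[Segment], commands: Set[str]) -> List[bytes]:
    """Keep only the voice data whose recognized text is not a voice command ([0018])."""
    return [segment.audio for segment in memo if segment.text not in commands]

commands = {"save", "print"}
print(remove_command_text(["kotoshi", "mo", "save"], commands))
print(remove_command_audio([Segment("kotoshi mo", b"\x01"), Segment("save", b"\x02")], commands))
```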

[0019] According to another aspect of the present invention, there is provided a voice information processing method comprising: storing voice information corresponding to voice in a voice recognition dictionary; storing voice commands corresponding to voice information in a voice command table in advance; inputting voice with a voice input unit; recognizing, with a voice recognition unit, voice information from the input voice on the basis of the voice recognition dictionary; comparing, with a voice command comparison unit, the recognized voice information with the voice commands in the voice command table to determine whether they match; when the voice information matches a voice command, executing the processing corresponding to that voice command with a voice command execution unit; and when the voice information does not match a voice command, executing processing set separately for the voice information with a voice information processing unit.

[0020]

[Embodiments of the Invention] The present invention is described in detail below on the basis of the embodiment shown in the drawings. The present invention is not limited by this embodiment.

[0021] FIG. 1 is a block diagram showing the configuration of an information processing apparatus according to one embodiment of the present invention. In FIG. 1, reference numeral 1 denotes the CPU of a computer that controls each part of the voice information processing apparatus of the present invention by means of a control program. Reference numeral 2 denotes a display unit, such as a CRT display, liquid crystal display (LCD), or plasma display, which displays input documents, input instruction information, and the like. Reference numeral 3 denotes an input unit made up of a keyboard 4 and a mouse 5, with which the operator enters information and gives instructions such as for printing, and a voice input unit consisting of a microphone 6 for inputting voice and an A/D conversion circuit that converts the voice signal into voice data.

[0022] Reference numeral 23 denotes a printing unit, such as a thermal printer or laser printer, which prints documents and the like. Reference numeral 24 denotes a communication unit that connects to other information processing terminals on a communication line. Reference numeral 25 denotes a voice output unit comprising a speaker, a D/A conversion circuit, and the like, which plays back the voice information (voice data) recorded in the voice memo buffer. Reference numeral 26 denotes a bus that transfers addresses, control program data, voice data (voice commands, voice text), and the like.

[0023] Reference numeral 7 denotes an information storage unit of the information processing apparatus, composed of RAM (readable and writable memory), ROM (read-only memory), a magnetic storage device, and the like. The RAM of the information storage unit holds expanded programs and comprises areas that function as a search keyword buffer 9 for storing search keywords, a document buffer 10 for storing documents converted to text, a voice memo buffer 12 that holds voice information as-is, a voice recognition word dictionary storage unit 15 for storing voice recognition words, and a voice input buffer (voice information storage unit) 19 for storing voice data entered by voice input.

[0024] The ROM of the information storage unit stores data that functions as the voice recognition dictionary 14, which holds in advance the voice recognition data used to recognize digital voice data and convert it into code data, and as the voice command table 17, which holds the voice commands in advance.

[0025] The ROM further stores: a program that functions as a keyword extraction unit 8, which extracts a keyword in a search during document creation and stores it in the search keyword buffer 9; a voice recognition program that functions as a voice recognition unit 13, which refers to the voice recognition dictionary 14, converts input voice data into voice recognition words, and stores them in the voice recognition word dictionary storage unit 15; a voice command comparison program that functions as a voice command comparison unit 16, which compares the voice recognition words stored in the voice recognition word storage unit 15 with the voice command table 17; and a program that functions as a voice command execution unit 18, which executes recognized voice commands.

[0026] The voice recognition unit 13 can use any of several voice recognition methods. For example, a voice signal input from a voice input unit such as a microphone is converted into a digital signal, and analysis such as FFT, filter analysis, LPC analysis, or cepstrum analysis is applied to this digital signal to generate a voice feature parameter sequence. The voice recognition unit 13 then matches the generated voice feature parameter sequence against the voice recognition dictionary 14, which stores voice feature patterns of keywords, that is, of predetermined recognition target words.
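As one way to picture this pipeline, the sketch below computes a real cepstrum for each frame (one of the analyses named above) and picks the dictionary entry whose stored pattern is closest. It is a toy under stated assumptions, not the patent's recognizer: a naive DFT stands in for the FFT, frames are compared one-to-one by Euclidean distance with no time alignment, and the helpers `dft`, `real_cepstrum`, `recognize`, and `tone` as well as the synthetic tones used as templates are all hypothetical.

```python
import cmath
import math
from typing import Dict, List, Sequence

def dft(x: Sequence[complex], inverse: bool = False) -> List[complex]:
    """Naive O(n^2) discrete Fourier transform; an FFT yields the same values faster."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [value / n for value in out] if inverse else out

def real_cepstrum(frame: Sequence[float], n_coeffs: int = 8) -> List[float]:
    """Inverse transform of the log magnitude spectrum, truncated to n_coeffs values."""
    log_magnitude = [math.log(abs(c) + 1e-6) for c in dft(frame)]
    return [c.real for c in dft(log_magnitude, inverse=True)[:n_coeffs]]

def sequence_distance(a: Sequence[Sequence[float]], b: Sequence[Sequence[float]]) -> float:
    """Sum of frame-by-frame Euclidean distances (sequences assumed equally long)."""
    return sum(math.sqrt(sum((p - q) ** 2 for p, q in zip(u, v))) for u, v in zip(a, b))

def recognize(features: List[List[float]], dictionary: Dict[str, List[List[float]]]) -> str:
    """Return the dictionary word whose stored feature pattern is nearest."""
    return min(dictionary, key=lambda word: sequence_distance(features, dictionary[word]))

def tone(cycles: int, n: int = 32, amplitude: float = 1.0) -> List[float]:
    """Synthetic test frame: a sine wave with the given number of cycles."""
    return [amplitude * math.sin(2 * math.pi * cycles * t / n) for t in range(n)]

# Hypothetical dictionary: one single-frame pattern per recognition target word.
dictionary = {"save": [real_cepstrum(tone(2))], "print": [real_cepstrum(tone(5))]}
print(recognize([real_cepstrum(tone(2, amplitude=0.9))], dictionary))
```

A practical recognizer would also need to cope with utterances of different lengths, which this fixed-length comparison ignores.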

[0027] There is also voice recognition by the word spotting method, in which the voice feature parameter sequence generated by the voice input unit is matched continuously against the voice recognition dictionary at fixed intervals.
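Word spotting can be sketched as a sliding comparison: at every fixed step, the window of features starting there is compared against each dictionary pattern, and a hit is reported when the two are close enough. The sketch assumes one scalar feature per frame, a mean absolute difference as the distance, and a fixed threshold; `spot_words` and its parameters are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

def spot_words(features: Sequence[float],
               templates: Dict[str, Sequence[float]],
               step: int = 1,
               threshold: float = 0.1) -> List[Tuple[int, str]]:
    """Return (start index, word) for every window that matches a template."""
    hits: List[Tuple[int, str]] = []
    for start in range(0, len(features), step):          # fixed matching interval
        for word, template in templates.items():
            window = features[start:start + len(template)]
            if len(window) < len(template):
                continue                                  # not enough input left for this word
            distance = sum(abs(a - b) for a, b in zip(window, template)) / len(template)
            if distance <= threshold:
                hits.append((start, word))
    return hits

stream = [0, 0, 1, 2, 3, 0, 0, 5, 5, 0]
templates = {"save": [1, 2, 3], "print": [5, 5]}
print(spot_words(stream, templates))  # [(2, 'save'), (7, 'print')]
```

Unlike matching a whole utterance at once, this finds known words embedded anywhere in a longer stream.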

[0028] Reference numeral 20 denotes a program memory composed of non-volatile memory such as battery-backed RAM or EEPROM, which stores the control program of the CPU 1, various data, and the like. Reference numeral 21 denotes a storage medium such as a floppy disk, hard disk, MD, or CD-ROM; the storage medium 21 stores the voice information processing program of the present invention and the data of the voice recognition dictionary and the voice command table.

[0029] Reference numeral 22 denotes a storage medium reading unit, which can read the voice information processing program of the present invention and various data stored on the storage medium 21 as needed and install them in the program memory 20. Data of created documents and voice memos can also be stored on the storage medium 21 and used on an external information processing apparatus. The voice information processing program of the present invention may also be transmitted to an external information processing apparatus via the communication unit 24.

[0030] According to another aspect of the present invention, there is provided a storage medium 21 storing a voice information processing program that causes the computer of a voice information processing apparatus, which comprises a voice recognition dictionary storing voice information corresponding to voice, a voice command table in which voice commands corresponding to voice information are stored in advance, and a voice input unit for inputting voice, to: recognize voice information from the input voice on the basis of the voice recognition dictionary; compare the recognized voice information with the voice commands in the voice command table to determine whether they match; when the voice information matches a voice command, execute the processing corresponding to that voice command; and when the voice information does not match a voice command, execute processing set separately for the voice information.

[0031] With this configuration, by storing the voice information processing program of the present invention on the storage medium 21 and installing it in the program memory 20 as needed, the voice information processing apparatus of the present invention can also be realized on an external information processing apparatus.

[0032] FIG. 2 is a schematic flowchart showing the procedure of the application selection processing of this embodiment. FIG. 10 shows an example of the application selection screen of this embodiment. In the application selection processing shown in FIG. 2, the application to be executed is selected from applications 10-1 to 10-6 on the selection screen shown in FIG. 10.
S101: The application to be executed is selected with the keyboard, the mouse, or voice input through the microphone or the like.
S102: If the "document creation" application 10-1 in FIG. 10 is selected, the process proceeds to S103.
S103: Document creation processing is executed (see FIG. 3).

[0033] S104: If the "voice input" application 10-4 is selected, the process proceeds to S105.
S105: Voice input processing is executed (see FIGS. 5 and 6).
S106: If the "voice memo" application 10-5 is selected, the process proceeds to S107.
S107: Voice memo processing is executed (see FIG. 7).
S108: If another application is selected, such as "call" 10-2, "graphics" 10-3, or "auxiliary" 10-6, other processing is executed.
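Steps S101 to S108 can be summarized as a small selection routine. The sketch below is only an illustration: the menu item numbers and application names follow the text, while the `MENU` mapping, `select_application`, and the returned strings are hypothetical.

```python
from typing import Dict

# Menu items 10-1 to 10-6 of the selection screen (FIG. 10).
MENU: Dict[str, str] = {
    "10-1": "document creation",
    "10-2": "call",
    "10-3": "graphics",
    "10-4": "voice input",
    "10-5": "voice memo",
    "10-6": "auxiliary",
}

def select_application(item: str) -> str:
    """Return the step that handles the selected menu item (S102 to S108)."""
    name = MENU[item]                     # raises KeyError for an unknown item
    if name == "document creation":       # S102
        return "S103: document creation processing"
    if name == "voice input":             # S104
        return "S105: voice input processing"
    if name == "voice memo":              # S106
        return "S107: voice memo processing"
    return "S108: other processing"       # call, graphics, auxiliary

print(select_application("10-4"))
```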

[0034] FIG. 11 shows an example of document input on the document creation screen of this embodiment. The document creation screen shown in FIG. 11 is a screen on which a document is entered and edited with the keyboard, the mouse, and so on. On this screen, voice commands and search keywords can also be entered through a voice input unit such as a microphone.

[0035] FIG. 3 is a detailed flowchart showing the procedure of the document creation processing in S103 of FIG. 2. In FIG. 3:
S202: The voice information entered by voice on the document creation screen of FIG. 11 is recognized as a voice recognition word.
S203: The voice command comparison unit 16 compares the voice recognition word obtained by voice recognition with the voice commands to determine whether it matches one. If it matches, the process proceeds to S205; otherwise, the process proceeds to S206.

[0036] FIG. 19 shows an example of comparing the voice command table of this embodiment with a voice recognition word. As shown in FIG. 19, when the input voice recognition word is "save" (Japanese "hozon"), for example, the voice recognition word in the voice recognition word buffer 15 is compared in turn with the voice commands 19-1 to 19-7 in the voice command table to see whether it matches.
S204: It is checked whether the word matches a voice command, such as save or print, in the voice command table 17 held in memory in advance. If it matches a voice command (for example, "save"), the process proceeds to S205. If it does not (for example, "shiwasu de", "in December"), the process proceeds to S206.

[0037] S205: The voice command execution unit 18 executes the voice command "save".
S206: A search keyword is extracted (see FIG. 4). FIG. 20 shows an example of deletion applied to a voice recognition word entered by voice in this embodiment. As shown in FIG. 20, when the input voice recognition word is "shiwasu de" and does not match a voice command in the voice command table, a search keyword is extracted.
S207: A search is executed with the extracted search keyword "shiwasu" ("December"). FIG. 12 shows an example of search processing on the document creation screen of this embodiment. As shown in FIG. 12, the search keyword "shiwasu" entered by voice is searched for.
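The flow S203 to S207 can be sketched as a single function that either reports a command or runs a text search. The sketch assumes the document is a plain string and that keyword extraction is supplied by the caller; `document_creation_step`, `find_all`, the tuple return format, and the stand-in extractor that keeps only the first word are hypothetical.

```python
from typing import Callable, List, Set, Tuple, Union

def find_all(document: str, keyword: str) -> List[int]:
    """Return every index at which keyword occurs in document."""
    positions: List[int] = []
    start = document.find(keyword)
    while start != -1:
        positions.append(start)
        start = document.find(keyword, start + 1)
    return positions

def document_creation_step(word: str,
                           commands: Set[str],
                           extract_keyword: Callable[[str], str],
                           document: str) -> Tuple[str, Union[str, List[int]]]:
    """S203/S204: compare with the command table, then branch to S205 or S206/S207."""
    if word in commands:
        return ("S205", word)                        # the caller executes the command
    keyword = extract_keyword(word)                  # S206: extract the search keyword
    return ("S207", find_all(document, keyword))     # S207: search with the keyword

commands = {"save", "print"}
first_word = lambda recognized: recognized.split()[0]   # stand-in for the extraction step
document = "shiwasu no kaze, shiwasu no machi"
print(document_creation_step("save", commands, first_word, document))
print(document_creation_step("shiwasu de", commands, first_word, document))
```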

[0038] FIG. 4 is a detailed flowchart showing the procedure of the search keyword extraction processing in S206 of FIG. 3. In FIG. 4:
S302: The voice recognition word obtained in S202 of FIG. 3 is read.
S303: The voice recognition word being held, "shiwasu de", is divided into minimal word units, as shown for example at 20-1 in FIG. 20.

[0039] S304: Connective words are extracted from the voice recognition word by syntactic analysis of the kind used for kana-kanji conversion in Japanese word processors and the like (see 20-3 in FIG. 20).
S306: The meaningless connective word "de" is deleted from the input voice recognition word (see 20-5 in FIG. 20).
S307: "shiwasu" is fixed as the search keyword used in the search (see 20-4 in FIG. 20).
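Steps S302 to S307 can be sketched as a split, filter, and join. The example works on romanized text and treats whitespace splitting as a stand-in for the syntactic analysis mentioned in S304, which is a strong simplification; the `CONNECTIVES` set and `extract_keyword` are hypothetical and only illustrate the idea.

```python
from typing import List

# Illustrative set of connective particles (romanized); not taken from the patent.
CONNECTIVES = {"de", "no", "ga", "wa", "o", "ni", "mo"}

def extract_keyword(recognized: str) -> str:
    """Drop connective words from a recognized phrase and return the keyword."""
    units: List[str] = recognized.split()                       # S303: minimal word units
    connectives = [unit for unit in units if unit in CONNECTIVES]   # S304: find connectives
    kept = [unit for unit in units if unit not in connectives]      # S306: delete them
    return " ".join(kept)                                       # S307: fix the keyword

print(extract_keyword("shiwasu de"))  # shiwasu
```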

[0040] FIG. 5 is a detailed flowchart showing procedure (1) of the voice input processing in S105 of FIG. 2. In FIG. 5:
S402: When a document is created by voice input, the input voice is converted into a voice recognition word by the voice recognition unit.
S403: The voice recognition word obtained by voice recognition is compared with the voice commands in the voice command table.
S404: It is checked whether the word matches a voice command in the voice command table. If a matching voice command exists, the process proceeds to S405; otherwise, the process proceeds to S406.
S405: The voice command is executed.

[0041] S406: If the word does not match a voice command in the voice command table, the text information (character string) converted from the input voice recognition word is used as document input. FIG. 13 shows document input example (1) by voice input in this embodiment. As shown in FIG. 13, when the first voice recognition word entered by voice is "kotoshi mo" ("this year too"), "kotoshi mo" is displayed in the document input display area 13-3. Reference numeral 13-1 denotes the document creation screen for voice input, and 13-2 denotes the cursor position.

[0042] FIG. 14 shows document input example (2) by voice input in this embodiment. As shown in FIG. 14, when the voice recognition words obtained by voice recognition from the voice information spoken next are, in order, "Ginnan no mi o hiroi chawanmushi o tanoshimareta koto deshō. Tōtō kotoshi mo karendā ga" ("You must have enjoyed gathering ginkgo nuts and savoring chawanmushi. At last, this year too, the calendar ..."), that text is displayed in the document input display area 14-3. Reference numeral 14-1 denotes the document creation screen for voice input, and 14-2 denotes the cursor position.

【００４３】S407: Next, it is checked whether the voice recognition word obtained by recognizing the input voice information is “てん” (ten) or “まる” (maru). If so, the process proceeds to S408; otherwise, it proceeds to S412.
S408: If the voice recognition word is “てん”, the words before and after “てん” are examined to determine whether this is a position for a comma.
S409: It is checked whether this is the position of a comma “、”.

【００４４】FIG. 15 shows document input example (3) by voice input in this embodiment. FIG. 15 shows, in particular, an example of inserting a comma into a document. FIG. 21 shows an example of the process of inserting punctuation marks into the document buffer in this embodiment. For example, as shown at 15-2 in FIG. 15, if the preceding word is one such as “〜が” (-ga) that appears at the end of a sentence and before a comma, the comma mark “、” is inserted into the document buffer, as shown at 21-3 in FIG. 21, and displayed on the document creation screen.

【００４５】If, on the other hand, the “てん” in the voice recognition word input by voice falls in the middle of a word, such as the “展” (ten) in “商品を展示会に出す” ("exhibit a product at a trade show"), as in example 21-5 of FIG. 21, the converted voice recognition word is displayed. FIG. 16 shows document input example (4) by voice input in this embodiment. As shown in FIG. 16, the voice recognition word “最後の一枚となってしまいました” ("has come down to its last sheet"), likewise input by voice and converted by voice recognition, is displayed in display area 16-3.

【００４６】S410: Next, if the voice recognition word obtained by recognizing the input voice information is “まる” (maru), the words before and after “まる” are examined to determine whether this is a position for a period. FIG. 17 shows document input example (5) by voice input in this embodiment. FIG. 17 shows, in particular, an example of inserting a period into a document.
S411: For example, as shown at 17-2 in FIG. 17, if the preceding word is one such as “しまいました” (shimaimashita) that appears at the end of a sentence and before a period, the period mark “。” is inserted into the document buffer, as shown at 21-4 in FIG. 21, and displayed on the document creation screen.
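
The punctuation handling in S407 to S411 can be illustrated with the short sketch below. It is a simplification under stated assumptions: the source says the words before and after “てん” or “まる” are examined, whereas this sketch looks only at the preceding buffer entry, and the ending lists `COMMA_CONTEXT_ENDINGS` and `PERIOD_CONTEXT_ENDINGS` are hypothetical, built from the two examples the text gives (“〜が” and “しまいました”).

```python
from typing import List

# Hypothetical context lists; the source gives only one example for each.
COMMA_CONTEXT_ENDINGS = ("が",)
PERIOD_CONTEXT_ENDINGS = ("ました",)


def insert_spoken_punctuation(document_buffer: List[str], word: str) -> None:
    """Append a punctuation mark or the recognized word to the buffer.

    A spoken "てん" becomes "、" and a spoken "まる" becomes "。" only when
    the preceding buffer entry ends like a phrase that normally comes
    before that mark. Otherwise the recognized word is appended unchanged.
    """
    previous = document_buffer[-1] if document_buffer else ""
    if word == "てん" and previous.endswith(COMMA_CONTEXT_ENDINGS):
        document_buffer.append("、")   # comma position confirmed (S409)
    elif word == "まる" and previous.endswith(PERIOD_CONTEXT_ENDINGS):
        document_buffer.append("。")   # period position confirmed (S411)
    else:
        document_buffer.append(word)   # ordinary text: keep the word itself
```

A real system would need a much richer test for clause and sentence boundaries, for instance the syntactic analysis used in kana-kanji conversion; the suffix check here only stands in for that decision.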

【００４７】FIG. 6 is a detailed flowchart showing procedure (2) of the voice input process in S105 of FIG. 2. In FIG. 6:
S502: The voice recognition word (text string) input by voice is stored in the document buffer.
S503: The text string stored in the document buffer is compared with the voice commands in the voice command table.

【００４８】S504: It is checked whether the text string matches a voice command in the voice command table. If it matches, the process proceeds to S505; otherwise, it proceeds to S506 and ends.
S505: If the text string matches a voice command, that text string is deleted from the document buffer.
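
Procedure (2) differs from procedure (1) in that the text is written to the document buffer first and removed afterwards if it turns out to be a command. A minimal sketch of S502 to S505, assuming the buffer is a list of text strings and the command table is any container of command words (both assumptions for illustration):

```python
from typing import Collection, List


def store_then_filter(
    document_buffer: List[str],
    text: str,
    command_table: Collection[str],
) -> bool:
    """Store recognized text, then delete it again if it is a voice command.

    Returns True when the text matched a command and was removed.
    """
    document_buffer.append(text)   # S502: store the text string first
    if text in command_table:      # S503/S504: compare with the command table
        document_buffer.pop()      # S505: delete the just-stored command text
        return True
    return False                   # S506: no match, the text stays in place
```

The sketch only removes the command text; what happens to the matched command afterwards is not described in this part of the source, so it is left out.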

【００４９】FIG. 18 shows a voice memo recording example of this embodiment. In FIG. 18, reference numeral 18-1 denotes the voice memo screen used when input voice is recorded as voice data in a voice memo. 18-2 denotes the selection of the voice memo destination in which the voice data is recorded. 18-3 denotes the operation screen used when recording voice data. 18-5 denotes the start button for starting a voice memo. 18-6 denotes the stop button for stopping a voice memo in progress.

【００５０】FIG. 7 is a detailed flowchart showing the procedure of the voice memo process in S107 of FIG. 2. In (1) of FIG. 7:
S602: When a document is input by voice, the voice recognition unit converts the input voice information into a voice recognition word.
S603: The voice recognition word obtained by voice recognition is compared with the voice commands in the voice command table.

【００５１】S604: It is checked whether the word matches a voice command in the voice command table. If it matches, the process proceeds to S605; otherwise, it proceeds to S606.
S605: The voice command is executed.
S606: If the word does not match any voice command in the voice command table, the input voice information is recorded as a voice memo, unchanged as voice data.

【００５２】In (2) of FIG. 7, the voice memo process is performed by a different procedure.
S610: The input voice information is recorded in the voice memo buffer as a voice memo, unchanged as voice data.
S611: The voice information is compared with the voice commands in the voice command table.
S612: It is checked whether it matches a voice command in the voice command table.
S613: The voice information corresponding to the voice command is deleted from the recorded voice memo buffer.
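
The two voice memo variants of FIG. 7 can be contrasted in a short sketch. It rests on several assumptions that are not in the source: each utterance arrives as raw `bytes` together with its recognized text, the memo buffer is a list, and command execution is represented by appending to an `executed` list. The function names are hypothetical.

```python
from typing import Collection, List


def record_memo_check_first(
    audio: bytes,
    recognized: str,
    commands: Collection[str],
    memo: List[bytes],
    executed: List[str],
) -> None:
    """Variant (1): compare first, then execute or record (S603 to S606)."""
    if recognized in commands:
        executed.append(recognized)  # S605: stand-in for running the command
    else:
        memo.append(audio)           # S606: keep the utterance as voice data


def record_memo_then_filter(
    audio: bytes,
    recognized: str,
    commands: Collection[str],
    memo: List[bytes],
) -> None:
    """Variant (2): record first, then remove command audio (S610 to S613)."""
    memo.append(audio)               # S610: record unconditionally
    if recognized in commands:       # S611/S612: compare with the table
        memo.pop()                   # S613: delete the command's voice data
```

Both variants leave the same memo contents for the same input; they differ only in whether the voice data is written before or after the comparison.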

【００５３】FIG. 8 is a detailed flowchart showing the procedure of the voice recognition process in S202 of FIG. 3, S402 of FIG. 5, and S602 of FIG. 7. In FIG. 8:
S702: The voice information input from a voice input unit such as the microphone 6 is stored in the voice input buffer 19.
S703: The voice information stored in the voice input buffer 19 is converted into a voice recognition word by the voice recognition unit 13.
S704: The voice recognition word obtained through matching by the voice recognition unit is converted into a text string.
S705: The voice recognition word converted into text is stored in the voice recognition word storage unit 15.

【００５４】FIG. 9 is a detailed flowchart showing the procedure of the voice command comparison process in S203 of FIG. 3, S403 of FIG. 5, and S603 of FIG. 7. In FIG. 9:
S802: The voice recognition word is read.
S803: The text string converted from the voice recognition word is compared with the voice commands in the voice command table.

【００５５】[0055]

[Effects of the Invention] According to the present invention, a personal computer or word processor can automatically determine, based on the voice information input from the voice input unit, whether to execute the processing content corresponding to a voice command or to execute separately set processing content. Consequently, there is no need to set the intended processing content before the input voice information is processed.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of an information processing apparatus according to an embodiment of the present invention.

FIG. 2 is a schematic flowchart showing the procedure of the application selection process of this embodiment.

FIG. 3 is a detailed flowchart showing the procedure of the document creation process in S103 of FIG. 2.

FIG. 4 is a detailed flowchart showing the procedure of the keyword extraction process in S206 of FIG. 3.

FIG. 5 is a detailed flowchart showing procedure (1) of the voice input process in S105 of FIG. 2.

FIG. 6 is a detailed flowchart showing procedure (2) of the voice input process in S105 of FIG. 2.

FIG. 7 is a detailed flowchart showing the procedure of the voice memo process in S107 of FIG. 2.

FIG. 8 is a detailed flowchart showing the procedure of the voice recognition process in S202 of FIG. 3, S402 of FIG. 5, and S602 of FIG. 7.

FIG. 9 is a detailed flowchart showing the procedure of the voice command comparison process in S203 of FIG. 3, S403 of FIG. 5, and S603 of FIG. 7.

FIG. 10 is a diagram showing an example of the application selection screen of this embodiment.

FIG. 11 is a diagram showing a document input example on the document creation screen of this embodiment.

FIG. 12 is a diagram showing an example of search processing on the document creation screen of this embodiment.

FIG. 13 is a diagram showing document input example (1) by voice input in this embodiment.

FIG. 14 is a diagram showing document input example (2) by voice input in this embodiment.

FIG. 15 is a diagram showing document input example (3) by voice input in this embodiment.

FIG. 16 is a diagram showing document input example (4) by voice input in this embodiment.

FIG. 17 is a diagram showing document input example (5) by voice input in this embodiment.

FIG. 18 is a diagram showing a voice memo input example by voice input in this embodiment.

FIG. 19 is a diagram showing an example of the comparison between the voice command table and voice recognition words in this embodiment.

FIG. 20 is a diagram showing an example of deleting voice recognition words input by voice in this embodiment.

FIG. 21 is a diagram showing an example of the process of inserting punctuation marks into the document buffer in this embodiment.

[Explanation of symbols]

1 CPU
2 Display unit
3 Input unit
4 Keyboard
5 Mouse
6 Microphone (voice input unit)
7 Information storage unit
8 Keyword extraction unit
9 Search keyword buffer
10 Document buffer
11 Punctuation insertion unit
12 Voice memo buffer
13 Voice recognition unit
14 Voice recognition dictionary
15 Voice recognition word storage unit
16 Voice command comparison unit
17 Voice command table
18 Voice command execution unit
19 Voice input buffer
20 Program memory
21 Storage medium
22 Storage medium reading unit
23 Printing unit
24 Communication unit
25 Voice output unit
26 Bus

Continuation of front page: (51) Int. Cl.⁷, identification symbol, FI, theme code (reference): G10L 3/00 561H 571J

Claims

1. A voice information processing apparatus comprising: a voice recognition dictionary storing voice information corresponding to voice; a voice command table storing in advance voice commands corresponding to voice information; a voice input unit for inputting voice; a voice recognition unit that recognizes voice information from the input voice based on the voice recognition dictionary; a voice command comparison unit that compares whether the voice information matches a voice command in the voice command table; a voice command execution unit that executes processing content corresponding to the voice command when the voice information matches the voice command; and a voice information processing unit that executes processing content separately set for the voice information when the voice information does not match the voice command.

2. The voice information processing apparatus according to claim 1, wherein the voice information processing unit, upon obtaining voice information that does not match a voice command during a word search for document creation, converts the voice information into text information as a search keyword.

3. The voice information processing apparatus according to claim 1, wherein the voice information processing unit, upon obtaining voice information that does not match a voice command in the voice command table during document creation, converts the voice information into text information as words and inputs it into the document being created.

4. The voice information processing apparatus according to claim 1, wherein the voice information processing unit, upon obtaining voice information that does not match a voice command during voice memo processing, inputs the voice information directly into a voice memo without converting it into text information.

5. The voice information processing apparatus according to claim 2, wherein the voice information processing unit, when converting voice information acquired during a word search for document creation into text information as a search keyword, deletes meaningless words included in the converted text information.

6. The voice information processing apparatus according to claim 3, wherein the voice information processing unit, when converting voice information acquired during document creation into text information as words, determines the end of a sentence, converts it into punctuation, and inputs the punctuation into the document being created.

7. The voice information processing apparatus according to claim 3, wherein the voice information processing unit, when text information input during document creation matches a voice command, deletes the text information from the document.

8. The voice information processing apparatus according to claim 4, wherein the voice information processing unit, when voice information input during voice memo processing includes a voice command, deletes that voice information from the voice memo.

9. A voice information processing method comprising: storing voice information corresponding to voice in a voice recognition dictionary; storing in advance voice commands corresponding to voice information in a voice command table; inputting voice using a voice input unit; recognizing voice information from the input voice based on the voice recognition dictionary using a voice recognition unit; comparing, using a voice command comparison unit, whether the recognized voice information matches a voice command in the voice command table; executing, using a voice command execution unit, the processing content corresponding to the voice command when the voice information matches the voice command; and executing, using a voice information processing unit, processing content separately set for the voice information when the voice information does not match the voice command.

10. A storage medium storing a voice information processing program for causing a computer of a voice information processing apparatus, the apparatus comprising a voice recognition dictionary storing voice information corresponding to voice, a voice command table storing in advance voice commands corresponding to voice information, and a voice input unit for inputting voice, to: recognize voice information from the input voice based on the voice recognition dictionary; compare whether the recognized voice information matches a voice command in the voice command table; execute the processing content corresponding to the voice command when the voice information matches the voice command; and execute processing content separately set for the voice information when the voice information does not match the voice command.