JPH09288494A

JPH09288494A - Voice recognition device and voice recognizing method

Info

Publication number: JPH09288494A
Application number: JP8100944A
Authority: JP
Inventors: Hiroshi Tsunoda; 弘史角田
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1996-04-23
Filing date: 1996-04-23
Publication date: 1997-11-04

Abstract

PROBLEM TO BE SOLVED: To improve the accuracy and processing speed of voice recognition. SOLUTION: Recognition target words, which are the candidates for voice recognition, are stored in a voice recognition data ROM 15. The words that make up the sentences displayed on an LCD 17 are stored in a sentence data ROM 14, each associated with its corresponding recognition target word. When a voice is input to a microphone 1 while a sentence is displayed on the LCD 17, a voice recognition circuit 5 performs recognition only against those recognition target words in the voice recognition data ROM 15 that are associated with the words of the displayed sentence.

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a voice recognition device and a voice recognition method, and in particular to a voice recognition device and a voice recognition method that can improve both voice recognition accuracy and recognition processing speed.

[0002]

[Description of the Related Art] In a conventional electronic dictionary device, for example, when an English word is entered by operating a keyboard, the word's phonetic symbols, information explaining its meaning (sense), example sentences using the word, and the like (hereinafter referred to as commentary information where appropriate) are retrieved from the built-in electronic dictionary and displayed.

[0003] Further, when commentary information is wanted for an English word that appears in the retrieved commentary information (for example, a word used in an example sentence), the user either enters that word, again by operating the keyboard, or designates the desired word within the displayed commentary information by operating the cursor keys, whereupon the commentary information for that word is retrieved and displayed.

[0004]

[Problems to Be Solved by the Invention] However, operating a keyboard or cursor keys to enter an English word is cumbersome.

[0005] One possible approach is to treat every English word registered in the electronic dictionary built into the device as a target of voice recognition (a recognition target word), so that English words can be entered by voice.

[0006] However, a reasonably practical electronic dictionary contains on the order of tens of thousands of English words, for example, and performing voice recognition on input speech against such an enormous vocabulary degrades recognition accuracy and also lowers recognition processing speed.

[0007] Furthermore, English words used in the example sentences of the commentary information may be inflected, but an electronic dictionary often does not register headwords for such inflected forms, so it is difficult to recognize inflected English words by voice. In addition, from the standpoint of improving recognition accuracy and recognition processing speed, it may be preferable to exclude from the recognition target words those English words that are expected to be looked up infrequently, such as forms of the verb "be" and prepositions.

[0008] The present invention has been made in view of these circumstances and makes it possible to improve voice recognition accuracy and recognition processing speed.

[0009]

[Means for Solving the Problems] The voice recognition device according to claim 1 comprises display word storage means that stores display words, which are the words and phrases to be displayed on display means for displaying information, in association with their corresponding recognition target words. It is characterized in that voice recognition means, which recognizes voice against the recognition target words stored in recognition target word storage means, performs voice recognition only against those stored recognition target words that are associated with the display words displayed on the display means.

[0010] The voice recognition method according to claim 4 is characterized by causing the display means to display display words, and causing the voice recognition means to recognize voice only against those recognition target words stored in the recognition target word storage means that are associated with the display words displayed on the display means.

[0011] In the voice recognition device according to claim 1, the display word storage means stores display words, which are the words and phrases to be displayed on the display means for displaying information, in association with their corresponding recognition target words, and the voice recognition means recognizes voice only against those recognition target words stored in the recognition target word storage means that are associated with the display words displayed on the display means.

[0012] In the voice recognition method according to claim 4, display words are displayed on the display means, and voice is recognized only against those recognition target words stored in the recognition target word storage means that are associated with the display words displayed on the display means.

[0013]

[Description of the Preferred Embodiments] FIG. 1 shows the configuration of an embodiment of an electronic dictionary device to which the present invention is applied. This electronic dictionary device is, for example, a portable unit that is convenient to carry, and it allows English words to be looked up by voice.

[0014] The microphone 1 (input means) converts the voice input to it into an audio signal, which is an electrical signal, and outputs the signal to the A/D converter 2. The A/D converter 2 samples the analog audio signal from the microphone 1 in accordance with a predetermined sampling clock and then quantizes it, producing a digital audio signal. The audio signal digitized by the A/D converter 2 is supplied to the voice recognition circuit 5.
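The sampling and quantization performed by the A/D converter 2 can be illustrated with a minimal Python sketch. The sampling rate, bit depth, and test tone below are assumptions chosen for illustration; the specification only says that a predetermined sampling clock is used.

```python
import math

def sample_and_quantize(signal, n_samples, sample_rate_hz, bits):
    """Sample a continuous signal (a function of time in seconds returning
    values in -1.0..1.0) and quantize each sample to a signed integer."""
    max_level = 2 ** (bits - 1) - 1          # e.g. 127 for 8 bits
    samples = []
    for i in range(n_samples):
        t = i / sample_rate_hz               # sampling instant
        x = max(-1.0, min(1.0, signal(t)))   # clip to the converter's range
        samples.append(round(x * max_level)) # quantize
    return samples

# Hypothetical input: a 440 Hz tone sampled at 8 kHz with 8-bit resolution.
tone = lambda t: math.sin(2 * math.pi * 440.0 * t)
pcm = sample_and_quantize(tone, n_samples=80, sample_rate_hz=8000, bits=8)
```

The resulting list of integers stands in for the digital audio signal that is passed on to the voice recognition circuit 5.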

[0015] The RAM (Random Access Memory) 3 temporarily stores the audio signal supplied from the A/D converter 2 via the voice recognition circuit 5, as well as data required for the operation of the voice recognition circuit 5. The ROM (Read Only Memory) 4 stores, for example, an application program that causes the voice recognition circuit 5 to perform voice recognition. The voice recognition circuit 5 (voice recognition means) processes the audio signal supplied from the A/D converter 2 in accordance with the application program stored in the ROM 4. In this way it recognizes the voice input to the microphone 1 against the recognition target words stored in the voice recognition data ROM 15, described later, and supplies the voice recognition result to the CPU 10.

[0016] Specifically, in accordance with a signal supplied from the CPU 10, described later, the voice recognition circuit 5 reads some of the recognition target words stored in the voice recognition data ROM 15 and builds a dictionary made up of those words (hereinafter referred to as the recognition dictionary where appropriate). When the voice recognition circuit 5 receives an audio signal from the A/D converter 2, it supplies the signal to the RAM 3 for storage. Once the RAM 3 holds, for example, one word's worth of audio signal, the circuit acoustically analyzes the signal's pitch (frequency), intensity (amplitude), speed (speaking rate), and so on. Based on the analysis result, it calculates, for each recognition target word stored in the recognition dictionary, the likelihood of that word with respect to the voice (word) input to the microphone 1. These likelihoods are supplied from the voice recognition circuit 5 to the CPU 10 together with their corresponding recognition target words.
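As a rough illustration of scoring every entry of a recognition dictionary, the following Python sketch assigns each word a likelihood-like score. The three-element feature vectors, the stored templates, and the exponential distance score are assumptions made for this example; the specification does not disclose the actual acoustic model.

```python
import math

def likelihoods(features, recognition_dictionary):
    """Return a (word, likelihood) pair for every word in the recognition
    dictionary. The score is a toy stand-in: exp(-squared distance) between
    the utterance features and a stored template."""
    scored = []
    for word, template in recognition_dictionary.items():
        dist2 = sum((f - t) ** 2 for f, t in zip(features, template))
        scored.append((word, math.exp(-dist2)))
    return scored

# Hypothetical templates: (pitch, intensity, speaking rate), each normalized.
recognition_dictionary = {
    "seeing": (0.9, 0.4, 0.6),
    "believing": (0.2, 0.7, 0.8),
}
utterance = (0.85, 0.45, 0.6)   # hypothetical analysis result for one word

scores = likelihoods(utterance, recognition_dictionary)
best_word = max(scores, key=lambda pair: pair[1])[0]
```

Because every dictionary entry is scored, a smaller recognition dictionary directly reduces the work per utterance, which is the effect the device relies on.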

[0017] The operation unit 6 consists of, for example, a word designation key 6A, a next candidate key 6B, an enter key 6C, and a scroll key 6D, and is operated to give predetermined instructions to the device. The word designation key 6A is operated to make the LCD 17, described later, display words that are recognition target words so that they can be distinguished from words that are not. The next candidate key 6B is operated to request the next candidate for the voice recognition result produced by the voice recognition circuit 5. That is, the LCD 17 first displays the word with the highest likelihood as the voice recognition result, but this result may be wrong. In that case, operating the next candidate key 6B causes the LCD 17 to display the word with the next highest likelihood. The enter key 6C is operated to confirm the voice recognition result displayed on the LCD 17 when that result is correct. The scroll key 6D is operated to scroll the display on the LCD 17.
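The interplay of the next candidate key 6B and the enter key 6C can be sketched as a small Python class that steps through candidates in descending order of likelihood. The class name, method names, and example likelihoods are hypothetical, and staying on the last candidate once the list is exhausted is an assumption, since the text does not say what happens then.

```python
class CandidateBrowser:
    """Steps through recognition candidates, most likely first."""

    def __init__(self, scored_words):
        # Sort (word, likelihood) pairs so the highest likelihood comes first.
        self._ranked = sorted(scored_words, key=lambda p: p[1], reverse=True)
        self._index = 0

    def shown(self):
        """Word currently shown on the display."""
        return self._ranked[self._index][0]

    def press_next_candidate(self):
        """Next candidate key: show the word with the next highest likelihood."""
        if self._index + 1 < len(self._ranked):
            self._index += 1
        return self.shown()

    def press_enter(self):
        """Enter key: confirm the word currently shown."""
        return self.shown()

browser = CandidateBrowser([("seen", 0.3), ("seeing", 0.6), ("see", 0.1)])
first = browser.shown()                  # highest-likelihood word
second = browser.press_next_candidate()  # user rejects it, sees the next one
confirmed = browser.press_enter()
```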

[0018] When any of the word designation key 6A, the next candidate key 6B, the enter key 6C, or the scroll key 6D is operated, the key input circuit 9 outputs an operation signal corresponding to that operation to the CPU 10. The CPU (Central Processing Unit) 10 (search means) controls the voice recognition circuit 5, the character display circuit 13, and other components in accordance with the operation signal from the key input circuit 9. The CPU 10 also retrieves the sentences stored in the sentence data ROM 14 and the commentary information of the words (dictionary words) stored in the word dictionary data ROM 16, both described later.

[0019] The ROM 11 stores a system program and application programs for performing predetermined processing, and the CPU 10 performs various kinds of processing by executing the programs stored in the ROM 11. The RAM 12 stores data required for the operation of the CPU 10.

[0020] When the character display circuit 13 receives information such as a word or commentary information from the CPU 10, it generates the bit patterns of the characters making up that information and supplies them to the LCD 17 for display. The character display circuit 13 also controls the display of the LCD 17 under the control of the CPU 10.

[0021] The sentence data ROM 14 (display word storage means), the voice recognition data ROM 15 (recognition target word storage means), and the word dictionary data ROM 16 (commentary information storage means) store the sentence data, the voice recognition data, and the word dictionary data, respectively, all of which are described later.

[0022] The LCD (liquid crystal display) 17 (display means) displays information under the control of the character display circuit 13.

[0023] Next, the sentence data, voice recognition data, and word dictionary data stored in the sentence data ROM 14, the voice recognition data ROM 15, and the word dictionary data ROM 16, respectively, are described with reference to FIG. 2.

[0024] First, FIG. 2(A) shows the sentence data stored in the sentence data ROM 14. The sentence data consists of a unique sentence number assigned to each of the various sentences to be displayed on the LCD 17 (English sentences in this embodiment), so that sentence numbers and sentences correspond one to one; the English words making up each sentence; and a voice recognition word number assigned to each of those words. Because a sentence is displayed by displaying its sequence of words, the words making up a sentence are hereinafter referred to as display words where appropriate.

[0025] In the example of FIG. 2(A), the sentence "Seeing is believing.", which consists of the display words "Seeing", "is", and "believing", is assigned sentence number 100, and the display words "Seeing", "is", and "believing" are assigned voice recognition word numbers 203, 0, and 222, respectively. Likewise, the sentence "You shall see.", which consists of the display words "You", "shall", and "see", is assigned sentence number 101, and the display words "You", "shall", and "see" are assigned voice recognition word numbers 0, 230, and 200, respectively.
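The sentence data of FIG. 2(A) can be pictured as a table keyed by sentence number. The Python sketch below uses the numbers given in the text; the dictionary layout, the names `SENTENCE_DATA` and `render_sentence`, and the rule of appending a period when rendering are illustrative assumptions, not the ROM format.

```python
# Sentence number -> ordered list of (display word, voice recognition word number).
SENTENCE_DATA = {
    100: [("Seeing", 203), ("is", 0), ("believing", 222)],
    101: [("You", 0), ("shall", 230), ("see", 200)],
}

def render_sentence(sentence_number):
    """Rebuild the displayed sentence from its sequence of display words."""
    words = [word for word, _number in SENTENCE_DATA[sentence_number]]
    return " ".join(words) + "."

def recognition_numbers(sentence_number):
    """Voice recognition word numbers attached to a sentence, in display order."""
    return [number for _word, number in SENTENCE_DATA[sentence_number]]
```

Keeping the voice recognition word number next to each display word means that displaying a sentence and determining its recognition vocabulary both need only a single read of the sentence entry.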

[0026] FIG. 2(B) shows the voice recognition data stored in the voice recognition data ROM 15. The voice recognition data consists of a unique voice recognition word number assigned to each recognition target word, so that voice recognition word numbers and recognition target words correspond one to one; the words serving as recognition target words (English words here); and a dictionary word number assigned to each recognition target word.

[0027] In the example of FIG. 2(B), in addition to the base forms of English words such as "see" and "believe", their inflected forms such as "saw", "seen", "seeing", "believed", and "believing" are also stored in the voice recognition data ROM 15 as recognition target words.

[0028] In this embodiment, the voice recognition word number attached to a display word in FIG. 2(A) is that of the recognition target word corresponding to the display word. In other words, the sentence data ROM 14 stores each display word in association with its corresponding recognition target word. Specifically, the display word "Seeing" has voice recognition word number 203 and is therefore associated with the recognition target word "seeing", which carries voice recognition word number 203. Similarly, the display word "believing" has voice recognition word number 222 and is therefore associated with the recognition target word "believing", which carries number 222. The display word "see" has voice recognition word number 200 and is therefore associated with the recognition target word "see", which carries number 200. In this way, each display word is linked to a recognition target word.

[0029] In this embodiment, integers of 0 or greater, for example, are used as voice recognition word numbers. The voice recognition word numbers assigned to recognition target words, however, exclude 0; that is, they are positive integers. Consequently, when a display word carries voice recognition word number 0, no recognition target word is associated with it, and the display word is not subject to voice recognition. In the example of FIG. 2(A), the display words "is" and "You" carry voice recognition word number 0, as described above, so "is" and "You" are not recognized by voice.
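The role of the number 0 as a "no link" marker can be sketched in Python as follows. The numbers 200, 203, 222, and 230 come from the text; that number 230 maps to the recognition target word "shall" is inferred from the display word it is attached to, and the names `RECOGNITION_TARGETS`, `NO_TARGET`, and `linked_target` are hypothetical.

```python
# Voice recognition word number -> recognition target word (subset of FIG. 2(B)).
RECOGNITION_TARGETS = {200: "see", 203: "seeing", 222: "believing", 230: "shall"}

NO_TARGET = 0  # number given to display words that are not to be recognized

# Invariant from the text: recognition target words only use positive numbers,
# so 0 can never collide with a real entry.
assert all(number > 0 for number in RECOGNITION_TARGETS)

def linked_target(voice_recognition_number):
    """Follow the link from a display word to its recognition target word.
    Returns None when the display word carries number 0."""
    if voice_recognition_number == NO_TARGET:
        return None
    return RECOGNITION_TARGETS[voice_recognition_number]
```

For instance, `linked_target(203)` yields "seeing", while `linked_target(0)`, the value attached to "is" and "You", yields no target.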

[0030] FIG. 2(C) shows the word dictionary data stored in the word dictionary data ROM 16. The word dictionary data consists of a unique dictionary word number assigned to each English word found in, for example, an ordinary English-Japanese dictionary (hereinafter referred to as a dictionary word where appropriate), so that dictionary word numbers and dictionary words correspond one to one; the dictionary words themselves; and the commentary information for each dictionary word.

[0031] In this embodiment, the commentary information for each dictionary word consists of dictionary content, such as the word's phonetic symbols, part of speech, inflected forms, and meaning, together with the sentence numbers of the sentences that serve as example sentences (usage examples) for that dictionary word.

[0032] In this embodiment, the dictionary word number attached to a recognition target word in FIG. 2(B) is that of the dictionary word corresponding to the recognition target word. In other words, the voice recognition data ROM 15 stores each recognition target word in association with its corresponding dictionary word, and therefore also in association with that dictionary word's commentary information. This can be described as a link from each recognition target word to a dictionary word.

[0033] In the example of FIG. 2, the dictionary word "see", a base form, is associated not only with the recognition target word "see" used to recognize the base form by voice but also with its inflected forms "saw", "seen", and "seeing". Likewise, the dictionary word "believe", also a base form, is associated not only with the recognition target word "believe" used to recognize the base form but also with its inflected forms "believed" and "believing".

[0034] Also in this embodiment, the sentence numbers in the commentary information of a dictionary word in FIG. 2(C) are those of sentences that use a display word corresponding to the dictionary word. In other words, the word dictionary data ROM 16 stores each dictionary word in association with its corresponding sentences. This can be described as a link from each dictionary word to sentences.

[0035] As described above, links are established between the sentence data and the voice recognition data, between the voice recognition data and the word dictionary data, and between the word dictionary data and the sentence data.
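The three links can be followed in sequence, as in the Python sketch below. The sentence numbers and voice recognition word numbers are those given in the text. The dictionary word numbers (1, 2, 3), the example-sentence lists, and all names are hypothetical, since the text does not state the values shown in FIG. 2(B) and FIG. 2(C).

```python
# Sentence data: sentence number -> [(display word, voice recognition word number)]
SENTENCES = {
    100: [("Seeing", 203), ("is", 0), ("believing", 222)],
    101: [("You", 0), ("shall", 230), ("see", 200)],
}
# Voice recognition data: number -> (recognition target word, dictionary word number)
RECOGNITION = {
    200: ("see", 1), 203: ("seeing", 1), 222: ("believing", 2), 230: ("shall", 3),
}
# Word dictionary data: dictionary word number -> dictionary word and example sentences
DICTIONARY = {
    1: {"word": "see", "sentences": [100, 101]},
    2: {"word": "believe", "sentences": [100]},
    3: {"word": "shall", "sentences": [101]},
}

def follow_links(sentence_number, position):
    """Display word -> recognition target word -> dictionary word -> sentences."""
    display_word, vr_number = SENTENCES[sentence_number][position]
    if vr_number == 0:                       # no link: word is not recognizable
        return None
    target, dict_number = RECOGNITION[vr_number]
    entry = DICTIONARY[dict_number]
    return display_word, target, entry["word"], entry["sentences"]
```

Under these assumed values, `follow_links(100, 0)` leads from the display word "Seeing" through the inflected recognition target word "seeing" to the base-form dictionary word "see", whose example sentences in turn point back into the sentence data.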

[0036] Next, the operation of the device is described with reference to the flowchart of FIG. 3 and to FIG. 4. First, in step S1, the CPU 10 reads from the sentence data ROM 14 the display words making up the sentence with a predetermined sentence number, together with the voice recognition word numbers attached to them, and supplies the display words to the LCD 17 via the character display circuit 13 for display. That is, in step S1 the sentence bearing the predetermined sentence number is displayed on the LCD 17.

[0037] The sentence display in step S1 is performed, for example, as follows. When the user utters a given English word, the voice is supplied to the voice recognition circuit 5 via the microphone 1 and the A/D converter 2. In this case, the voice recognition circuit 5 performs voice recognition against, for example, all of the recognition target words stored in the voice recognition data ROM 15 and outputs the voice recognition result to the CPU 10. On receiving an English word as the voice recognition result from the voice recognition circuit 5, the CPU 10 searches the word dictionary data ROM 16 for that word and reads the sentence number attached to it. The CPU 10 then searches the sentence data ROM 14 for that sentence number, reads from the sentence data ROM 14 the display words making up the sentence bearing that number, and supplies them to the LCD 17 via the character display circuit 13. In this way, step S1 displays, for example, a sentence (example sentence) that uses the English word the user uttered.
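The lookup chain of step S1 (recognized word, then dictionary entry, then sentence number, then display words) might look like this in Python. The dictionary contents, the choice of the first listed sentence number, and the names used are assumptions for illustration; the text does not say how one example sentence is selected when several exist.

```python
# Hypothetical word dictionary data: dictionary word -> example sentence numbers.
WORD_DICTIONARY = {
    "believe": {"sentence_numbers": [100]},
    "see": {"sentence_numbers": [101]},
}
# Sentence data as in FIG. 2(A).
SENTENCE_DATA = {
    100: [("Seeing", 203), ("is", 0), ("believing", 222)],
    101: [("You", 0), ("shall", 230), ("see", 200)],
}

def example_sentence(recognized_word):
    """Return (sentence number, sentence text) for a recognized word,
    or None when the word has no dictionary entry or no example sentence."""
    entry = WORD_DICTIONARY.get(recognized_word)
    if entry is None or not entry["sentence_numbers"]:
        return None
    number = entry["sentence_numbers"][0]   # assumption: take the first example
    text = " ".join(word for word, _n in SENTENCE_DATA[number]) + "."
    return number, text
```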

[0038] For example, when the sentence number read from the word dictionary data ROM 16 is 100, step S1 displays on the LCD 17 the sentence bearing sentence number 100, "Seeing is believing." (which, as shown in FIG. 2(A), consists of the display words "Seeing", "is", and "believing"), as shown in FIG. 4(A).

[0039] Once the sentence has been displayed in step S1, processing proceeds to step S2, where the CPU 10 determines whether the user has operated the word designation key 6A. If it is determined in step S2 that the word designation key 6A has not been operated, processing returns to step S2. If it is determined in step S2 that the word designation key 6A has been operated, that is, if the user has operated the word designation key 6A and the corresponding operation signal has been supplied from the key input circuit 9 to the CPU 10, processing proceeds to step S3. In step S3, the LCD 17 displays the display words making up the sentence displayed in step S1 so that those that are recognition target words are distinguished from those that are not.

[0040] Specifically, the CPU 10 controls the character display circuit 13 so that, for example, an underline is added to each display word whose voice recognition word number, as read from the sentence data ROM 14 in step S1, is not 0, that is, to each display word that is a recognition target word. In response, the character display circuit 13 controls the LCD 17 to underline those display words in the sentence displayed in step S1 whose voice recognition word number is not 0.

[0041] Suppose, for example, that the sentence "Seeing is believing." is currently displayed as shown in FIG. 4(A). The voice recognition word numbers attached to its display words "Seeing", "is", and "believing" are 203, 0, and 222, respectively, as shown in FIG. 2(A). In this case, therefore, step S3 underlines the display words "Seeing" and "believing", whose voice recognition word numbers are not 0, as shown in FIG. 4(B). Thus, in step S3, only those display words on the LCD 17 that are associated with a recognition target word are underlined.
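The marking performed in step S3 reduces to a filter on the voice recognition word number. In the Python sketch below, surrounding a word with underscores stands in for the underline drawn on the LCD; that notation and the function name are assumptions made for this plain-text illustration.

```python
def mark_recognizable(display_words):
    """Render a sentence, marking each display word whose voice recognition
    word number is not 0. Underscores stand in for the LCD underline."""
    parts = []
    for word, number in display_words:
        parts.append(f"_{word}_" if number != 0 else word)
    return " ".join(parts) + "."

sentence_100 = [("Seeing", 203), ("is", 0), ("believing", 222)]
marked = mark_recognizable(sentence_100)
```

Here `marked` corresponds to the state of FIG. 4(B), with "Seeing" and "believing" marked and "is" left plain.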

[0042] Processing then proceeds to step S4, where the CPU 10 supplies the voice recognition word numbers read from the sentence data ROM 14 in step S1 to the voice recognition circuit 5. On receiving the voice recognition word numbers from the CPU 10, the voice recognition circuit 5 searches the voice recognition data ROM 15 for those numbers and builds the recognition dictionary from the recognition target words that carry them. In other words, the voice recognition circuit 5 uses the voice recognition word numbers received from the CPU 10 as search keys to look up recognition target words, and builds the recognition dictionary from the words found.

[0043] Because the voice recognition word numbers assigned to the recognition target words stored in the voice recognition data ROM 15 are positive integers, as described above, the voice recognition circuit 5 ignores a voice recognition word number of 0 when it receives one from the CPU 10.

[0044] Accordingly, when the sentence "Seeing is believing." is displayed in step S1 as shown in FIG. 4(A), the voice recognition word numbers attached to its display words "Seeing", "is", and "believing" are 203, 0, and 222, respectively, as shown in FIG. 2(A). The voice recognition circuit 5 ignores the 0 and builds the recognition dictionary from the recognition target words "seeing" and "believing", whose voice recognition word numbers are the remaining 203 and 222.
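Building the recognition dictionary in step S4 amounts to looking up each received number and skipping 0. The following Python sketch shows this. The numbers 200, 203, 222, and 230 are taken from the text; the entries numbered 201 and 202 are hypothetical placeholders for "saw" and "seen", whose actual numbers the text does not give.

```python
# Voice recognition data: voice recognition word number -> recognition target word.
RECOGNITION_DATA = {
    200: "see",
    201: "saw",    # hypothetical number, not stated in the text
    202: "seen",   # hypothetical number, not stated in the text
    203: "seeing",
    222: "believing",
    230: "shall",
}

def build_recognition_dictionary(voice_recognition_numbers):
    """Use the received numbers as search keys, ignoring 0, and collect the
    matching recognition target words into a recognition dictionary."""
    return {
        number: RECOGNITION_DATA[number]
        for number in voice_recognition_numbers
        if number != 0
    }

# Numbers attached to "Seeing is believing." in FIG. 2(A).
recognition_dictionary = build_recognition_dictionary([203, 0, 222])
```

With the full table holding six entries in this sketch and a real device holding tens of thousands, recognition for the displayed sentence is carried out against only two words.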

[0045] Steps S3 and S4 may also be performed in the reverse order, with step S4 first and step S3 afterward, or they may be performed simultaneously.

[0046] After the display words that are recognition targets have been underlined and the recognition dictionary has been constructed from the corresponding recognition target words as described above, the user utters an underlined display word (English word) when he or she wishes to obtain its commentary information. The voice uttered by the user is converted into an analog voice signal by the microphone 1 and then into a digital voice signal by the A/D converter 2. In step S5, this digital voice signal is supplied via the voice recognition circuit 5 to the RAM 3 and stored (captured) there. Once storage of the voice signal in the RAM 3 has started, the voice recognition circuit 5 proceeds to step S6 and determines whether a voice signal for one word has been stored in the RAM 3. If it is determined in step S6 that a voice signal for one word has not yet been stored, the process returns to step S5, so that the voice signal continues to be stored in the RAM 3.

[0047] If it is determined in step S6 that a voice signal for one word has been stored in the RAM 3, the process proceeds to step S7, where the voice recognition circuit 5 recognizes the voice input to the microphone 1 on the basis of the voice signal stored in the RAM 3, targeting only the recognition target words registered in the recognition dictionary constructed in step S4. That is, the voice recognition circuit 5 calculates, for each recognition target word stored in the recognition dictionary, its likelihood with respect to the voice (word) input to the microphone 1, and supplies the likelihoods to the CPU 10 in association with the voice recognition word numbers of the respective recognition target words.

[0048] Accordingly, in step S7 the voice recognition circuit 5 performs voice recognition targeting only those recognition target words stored in the voice recognition data ROM 15 that are associated with the display words displayed on the LCD 17; that is, recognition is performed over a small number of words. Recognition accuracy and recognition processing speed can therefore be improved compared with the case where all the recognition target words stored in the voice recognition data ROM 15 are targeted.

[0049] Furthermore, in step S3 the words that the voice recognition circuit 5 treats as recognition targets are underlined, so that they are displayed distinguishably from words that are not recognition targets. The user can therefore easily see which words are currently targets of voice recognition (or which are not).

[0050] On receiving from the voice recognition circuit 5 the voice recognition word number and likelihood of each recognition target word registered in the recognition dictionary, the CPU 10 selects, in step S8, the voice recognition word number associated with the highest likelihood. The CPU 10 then controls the character display circuit 13 so that the display word that is associated with the recognition target word corresponding to the selected voice recognition word number (hereinafter referred to, as appropriate, as the selected voice recognition word number) and that is displayed on the LCD 17 is shown in a way that identifies it as the first candidate of the voice recognition result, for example by highlighting it in reverse video.

[0051] Thus, for example, when the display words "Seeing" and "believing" are recognition targets as described with reference to FIG. 4(B), and the likelihoods are highest in the order "Seeing", "believing", the display word "Seeing", which has the highest likelihood, is highlighted in step S8 as shown in FIG. 4(C).

[0052] The process then proceeds to step S9, where the CPU 10 determines whether the enter key 6C has been operated. If it is determined in step S9 that the enter key 6C has not been operated, the process proceeds to step S10, where the CPU 10 determines whether the next candidate key 6B has been operated. If it is determined in step S10 that the next candidate key 6B has not been operated, the process returns to step S9, and steps S9 and S10 are repeated until it is determined in step S9 that the enter key 6C has been operated or in step S10 that the next candidate key 6B has been operated.

[0053] If it is determined in step S10 that the next candidate key 6B has been operated, that is, if the display word highlighted on the LCD 17 is not the word the user uttered and the user has operated the next candidate key 6B in order to highlight the word with the next highest likelihood, the process proceeds to step S11. There the CPU 10 newly selects the voice recognition word number associated with the next highest likelihood, and controls the character display circuit 13 so as to highlight the display word that is associated with the recognition target word corresponding to the newly selected voice recognition word number (the selected voice recognition word number) and that is displayed on the LCD 17. As a result, the display word with the next highest likelihood is highlighted in place of the display word highlighted until then.

[0054] That is, for example, when "Seeing", the more likely of the display words "Seeing" and "believing", is highlighted as described with reference to FIG. 4(C) and the next candidate key 6B is operated, "believing", which has the next highest likelihood, is highlighted in place of "Seeing" as shown in FIG. 4(D).
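By way of illustration only, the candidate selection of steps S8 through S11 may be sketched in Python as follows. The likelihood values, the key names "next" and "enter" (standing in for the next candidate key 6B and the enter key 6C), and the behavior of staying on the last candidate when no further candidate exists are all assumptions of this sketch; the description does not specify them.

```python
def rank_candidates(likelihoods):
    """Order voice recognition word numbers from highest to lowest likelihood."""
    return sorted(likelihoods, key=likelihoods.get, reverse=True)


def select_candidate(likelihoods, key_presses):
    """Return the number confirmed by "enter", or None if never confirmed."""
    ranked = rank_candidates(likelihoods)
    index = 0  # step S8: start from the highest-likelihood candidate
    for key in key_presses:
        if key == "enter":  # step S9: the highlighted candidate is confirmed
            return ranked[index]
        if key == "next" and index + 1 < len(ranked):
            index += 1  # step S11: move to the next highest likelihood
    return None


# Hypothetical likelihoods for "Seeing" (203) and "believing" (222).
scores = {203: 0.8, 222: 0.6}
print(select_candidate(scores, ["enter"]))          # prints 203
print(select_candidate(scores, ["next", "enter"]))  # prints 222
```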

[0055] The process then returns to step S9, and the processing from step S9 onward is repeated. If it is determined in step S9 that the enter key 6C has been operated, that is, if the display word highlighted on the LCD 17 is the word the user uttered and the user has operated the enter key 6C to confirm it as the final voice recognition result, the process proceeds to step S12, where the CPU 10 searches the word dictionary data ROM 16 for the commentary information of that display word.

[0056] That is, the CPU 10 identifies the dictionary word number (FIG. 2(B)) associated with the voice recognition word number that was the selected voice recognition word number when the enter key 6C was operated, by referring to the voice recognition data ROM 15 via the voice recognition circuit 5. The CPU 10 then searches the word dictionary data ROM 16 for the identified dictionary word number.

[0057] The CPU 10 then proceeds to step S13, controls the character display circuit 13 so as to display the commentary information associated with the retrieved dictionary word number, and ends the process. The search is thus performed using the dictionary word number as the search key, and the LCD 17 displays the commentary information corresponding to the dictionary word associated with the confirmed voice recognition result.

[0058] That is, for example, when the enter key 6C is operated while the display word "Seeing" is highlighted as shown in FIG. 4(C), the commentary information of the dictionary word "see" (FIG. 2(C)), whose dictionary word number 300 is associated with the recognition target word "Seeing" associated with the display word "Seeing", is retrieved and displayed. Likewise, when the enter key 6C is operated while the display word "believing" is highlighted as shown in FIG. 4(D), the commentary information of the dictionary word "believe" (FIG. 2(C)), whose dictionary word number 302 is associated with the recognition target word "believing" associated with the display word "believing", is retrieved and displayed.
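The two-stage lookup of steps S12 and S13 may be sketched, purely as an illustration, in Python as follows. The table names and the placeholder commentary strings are assumptions of this sketch; only the numbers 203, 222, 300, and 302 and the dictionary words "see" and "believe" come from the description.

```python
# Hypothetical view of FIG. 2(B):
# voice recognition word number -> dictionary word number.
RECOGNITION_TO_DICTIONARY = {203: 300, 222: 302}

# Hypothetical view of FIG. 2(C):
# dictionary word number -> dictionary word and its commentary information.
WORD_DICTIONARY = {
    300: {"word": "see", "commentary": "(commentary information for 'see')"},
    302: {"word": "believe", "commentary": "(commentary information for 'believe')"},
}


def look_up_commentary(selected_number):
    """Resolve a confirmed voice recognition word number to a dictionary entry."""
    # Step S12: find the dictionary word number for the confirmed result.
    dictionary_word_number = RECOGNITION_TO_DICTIONARY[selected_number]
    # Step S13: fetch the entry that would be shown on the display.
    return WORD_DICTIONARY[dictionary_word_number]


print(look_up_commentary(203)["word"])  # prints see
print(look_up_commentary(222)["word"])  # prints believe
```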

[0059] As described above, the voice recognition data ROM 15 registers as recognition target words not only the base forms of English words, such as "see" and "believe", but also their inflected forms, such as "saw", "seen", and "seeing", and "believed" and "believing". Inflected forms can therefore be recognized as well as base forms.

[0060] Furthermore, the base form "see" and its inflected forms "saw", "seen", and "seeing" are all associated with the dictionary word "see", which is their base form. Similarly, "believe", "believed", and "believing" are all associated with the base-form dictionary word "believe". Consequently, whichever of the base form or an inflected form is uttered, the commentary information of the base form can be obtained from word dictionary data in which commentary information is registered only for base forms. In other words, there is no need to prepare commentary information separately for the base form and each inflected form; compared with preparing it separately, the amount of word dictionary data is reduced, and the word dictionary data can be organized efficiently, so to speak.

[0061] In addition, because display words are associated with recognition target words, and recognition target words are in turn associated with dictionary words, even display words that are spelled identically but differ in pronunciation or meaning (for example, a verb whose past tense and past participle are spelled the same but pronounced differently, or homonyms) can each be associated with the correct dictionary word via a recognition target word. As a result, a wrong dictionary word is never retrieved when the voice recognition result is correct. Specifically, suppose, for example, that the display word "scrap" meaning a fragment or small piece in a sentence A is associated, via a recognition target word, with the dictionary word "scrap" meaning a fragment or small piece, and that the display word "scrap" meaning a fight or quarrel in another sentence B is associated, via a recognition target word, with the dictionary word "scrap" meaning a fight or quarrel. Then, when the display word "scrap" is uttered while sentence A or B is displayed, the dictionary word with the correct meaning is retrieved from the word dictionary data ROM 16: the dictionary word "scrap" meaning a fragment or small piece for sentence A, and the dictionary word "scrap" meaning a fight or quarrel for sentence B.
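The handling of the identically spelled display word "scrap" may be sketched in Python as follows, as an illustration only. Every number and table name below is made up for this sketch, and the sketch assumes that the utterance has already been recognized correctly; it shows only how the sentence being displayed determines which dictionary word is reached.

```python
# Hypothetical sentence data: the same spelling carries a different
# voice recognition word number in each sentence.
SENTENCE_A = [("scrap", 501)]  # "scrap" in the sense of a fragment
SENTENCE_B = [("scrap", 502)]  # "scrap" in the sense of a fight

# Hypothetical links: recognition number -> dictionary word number -> entry.
RECOGNITION_TO_DICTIONARY = {501: 700, 502: 701}
WORD_DICTIONARY = {
    700: "scrap (fragment, small piece)",
    701: "scrap (fight, quarrel)",
}


def resolve(display_word, displayed_sentence):
    """Return the dictionary entry reached from a display word in one sentence."""
    for word, number in displayed_sentence:
        if word == display_word and number != 0:
            return WORD_DICTIONARY[RECOGNITION_TO_DICTIONARY[number]]
    return None  # the word is not a recognition target in this sentence


print(resolve("scrap", SENTENCE_A))  # prints scrap (fragment, small piece)
print(resolve("scrap", SENTENCE_B))  # prints scrap (fight, quarrel)
```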

[0062] Furthermore, by making the display words stored in the sentence data ROM 14 the targets of voice recognition and of the commentary information search, the voice recognition data ROM 15 and the word dictionary data ROM 16 can also include, for example, data on proper nouns. That is, given the number of proper nouns in existence, it is difficult to make proper nouns targets of voice recognition and commentary search without any restriction. Of that enormous number, however, those stored as display words in the sentence data ROM 14 are not very numerous, and they can therefore be made targets of voice recognition and commentary search.

[0063] In step S13, the dictionary content of the commentary information (the phonetic symbols, meanings, and so on of the dictionary word) is displayed exactly as stored in the word dictionary data ROM 16. For example sentences, however, the word dictionary data ROM 16 registers the sentence numbers corresponding to the example sentences, as shown in FIG. 2(C), so the sentence (display word string) corresponding to each sentence number is retrieved from the sentence data ROM 14 and displayed.

[0064] In this embodiment, therefore, when a word in a displayed sentence is uttered, sentences stored in the sentence data ROM 14 as display word strings are displayed as example sentences using that word, so the commentary information of the words constituting those sentences can in turn be searched.

[0065] Next, the process of constructing the recognition dictionary in step S4 of FIG. 3 is described in more detail with reference to FIG. 5.

[0066] In the example of FIG. 5(A), two sentences, "Second thoughts are best." and "Seeing is believing.", have been retrieved from the sentence data ROM 14, and both are displayed in full on the LCD 17. In this example the display words other than the be-verbs "are" and "is", namely "Second", "thoughts", "best", "Seeing", and "believing", are underlined. In this case, therefore, the recognition dictionary is constructed from these five words in step S4, and they become the targets of the voice recognition performed by the voice recognition circuit 5.

[0067] In the example of FIG. 5(B), on the other hand, four sentences, "Second thoughts are best.", "Seeing is believing.", "Slow but steady wins the race.", and "So many countries, so many customs.", have been retrieved from the sentence data ROM 14, and, as indicated by the solid line in FIG. 5(B), "Seeing is believing." and "Slow but steady wins the race." are displayed on the LCD 17. Of the sentence "Slow but steady wins the race.", however, only the part "Slow but steady wins the" is displayed; "race." is not.

[0068] In this example the display words other than the be-verb "is" and the article "the", namely "Seeing", "believing", "Slow", "but", "steady", and "wins", are underlined. In this case, therefore, the recognition dictionary is constructed from these six words in step S4, and they become the targets of the voice recognition performed by the voice recognition circuit 5.

[0069] That is, in this case the display word "race", which is not displayed on the LCD 17, and the display words constituting the sentences "Second thoughts are best." and "So many countries, so many customs." will not be uttered by the user. Since there is no need to make such display words targets of voice recognition, the recognition dictionary is constructed only from the display words actually displayed on the LCD 17 (here excluding be-verbs and articles).

[0070] When the user then operates the scroll key 6D so that, for example, the sentences "Second thoughts are best." and "Seeing is believing." come to be displayed on the LCD 17, as indicated by the dotted line in FIG. 5(B), the display words constituting these sentences (here excluding be-verbs), namely "Second", "thoughts", "best", "Seeing", and "believing", are underlined, and the recognition dictionary is constructed from them.

[0071] The CPU 10 keeps track of the display words currently displayed on the LCD 17, and the voice recognition circuit 5 creates the recognition dictionary on the basis of what the CPU 10 has identified as displayed.
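As an illustration only, restricting the recognition dictionary to the display words actually shown on the screen may be sketched in Python as follows. The representation of the visible region as (line index, number of visible leading words) pairs is an assumption of this sketch, as are all numbers other than 203 and 222.

```python
# Display words paired with voice recognition word numbers (0 = not a target).
# Only 203 and 222 are quoted in the description; the rest are made up.
LINES = [
    [("Second", 201), ("thoughts", 202), ("are", 0), ("best", 204)],
    [("Seeing", 203), ("is", 0), ("believing", 222)],
    [("Slow", 230), ("but", 231), ("steady", 232), ("wins", 233),
     ("the", 0), ("race", 234)],
]


def numbers_on_screen(lines, shown):
    """Collect the non-zero numbers of the words that are currently visible.

    `shown` lists (line index, number of leading words visible) pairs.
    """
    numbers = []
    for line_index, word_count in shown:
        for _word, number in lines[line_index][:word_count]:
            if number != 0:
                numbers.append(number)
    return numbers


# Solid frame: "Seeing is believing." and "Slow but steady wins the".
before_scroll = numbers_on_screen(LINES, [(1, 3), (2, 5)])
# Dotted frame: "Second thoughts are best." and "Seeing is believing.".
after_scroll = numbers_on_screen(LINES, [(0, 4), (1, 3)])
print(len(before_scroll), len(after_scroll))  # prints 6 5
```

The counts of six and five words match the numbers of underlined display words given for the two display states.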

[0072] However, the recognition dictionary need not be limited to the display words actually displayed on the LCD 17; it may instead be constructed from all the display words constituting the sentences retrieved from the sentence data ROM 14.

[0073] The present invention has been described above as applied to an electronic dictionary device, but it is also applicable to any device in which words or other phrases displayed on display means for displaying information are input by voice.

[0074] In this embodiment, a word is searched for in step S1 of FIG. 3 and sentences that are example sentences using that word are displayed. Alternatively, for example, the text of books, newspapers, or the like may be stored in the sentence data ROM 14, and that text may be displayed in step S1 when the device is powered on.

[0075] In this embodiment, the sentence data ROM 14, the voice recognition data ROM 15, and the word dictionary data ROM 16 are built into the device, but they may instead be built into, for example, an IC card that can be attached to and detached from the device.

[0076] Furthermore, in this embodiment the electronic dictionary device is given the function of an English-Japanese dictionary, but it may also be given the function of, for example, a Japanese-English dictionary or a dictionary for other languages.

[0077] In this embodiment, words are the targets of voice recognition, but idioms and the like, for example, may also be made targets of voice recognition in addition to words.

[0078] Furthermore, in this embodiment whether a display word is a target of voice recognition is determined by whether the voice recognition word number stored in the sentence data ROM 14 (FIG. 2(A)) is 0. Alternatively, information indicating whether a display word is a target (hereinafter referred to, as appropriate, as discrimination information) may be added to each display word stored in the sentence data ROM 14, and the determination may be made on the basis of this discrimination information. In this case, a particular display word can be made a target of voice recognition or not depending on a predetermined condition. For example, one of the values 0, 1, and 2 may be added to each display word as discrimination information and a predetermined button may be provided; while the button is not operated, only display words with discrimination information 0 are targets of voice recognition, and while the button is operated, display words with discrimination information 0 or 1 are targets.
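The use of discrimination information together with a predetermined button may be sketched in Python as follows, as an illustration only. The assignment of the values 0, 1, and 2 to the particular words below and the function names are assumptions of this sketch; the rule that 0 is always a target and 1 is a target only while the button is operated follows the description.

```python
# Hypothetical display words paired with discrimination information.
DISPLAY_WORDS = [("Seeing", 0), ("is", 2), ("believing", 1)]


def is_recognition_target(discrimination, button_operated):
    """Apply the rule: 0 is always a target, 1 only while the button is operated."""
    allowed = {0, 1} if button_operated else {0}
    return discrimination in allowed


def target_words(display_words, button_operated):
    """List the display words that are currently targets of voice recognition."""
    return [word for word, discrimination in display_words
            if is_recognition_target(discrimination, button_operated)]


print(target_words(DISPLAY_WORDS, button_operated=False))  # prints ['Seeing']
print(target_words(DISPLAY_WORDS, button_operated=True))   # prints ['Seeing', 'believing']
```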

[0079] In this embodiment, English sentences (display word strings) are stored in the sentence data ROM 14, but the sentence data ROM 14 may also store, for example, their Japanese translations in addition to the English sentences.

[0080] Furthermore, in this embodiment one voice recognition word number is associated with each display word in the sentence data ROM 14. However, when there is a display word that is expected to be uttered in more than one way (hereinafter referred to, as appropriate, as a multiple-utterance display word), those utterance patterns (which may include pronunciations of the display word that are incorrect) may be stored in the voice recognition data ROM 15 as recognition target words, and the voice recognition word numbers of those recognition target words may be associated with the multiple-utterance display word. In other words, a plurality of recognition target words can be associated with one display word. Specifically, when the display word is, for example, "ISO (International Organization for Standardization)", the recognition target words with the readings "iso", "aiso", and "ai-esu-ou" can be associated with it. If "iso", "aiso", and "ai-esu-ou" are then associated with the commentary information for "ISO", whichever of them is uttered is recognized as corresponding to the display word "ISO", and its commentary information can be retrieved.
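Associating several recognition target words with one display word may be sketched in Python as follows, as an illustration only. The numbers 410 through 412 and the table names are made up for this sketch, and the readings are written in romanized form.

```python
# Hypothetical sentence data: one display word, several recognition numbers.
DISPLAY_WORD_TO_NUMBERS = {"ISO": [410, 411, 412]}

# Hypothetical recognition data: number -> reading to be recognized.
RECOGNITION_ROM = {410: "iso", 411: "aiso", 412: "ai-esu-ou"}

# Reverse index built from the table above: number -> display word.
NUMBER_TO_DISPLAY_WORD = {
    number: word
    for word, numbers in DISPLAY_WORD_TO_NUMBERS.items()
    for number in numbers
}


def display_word_for(recognized_number):
    """Map whichever reading was recognized back to its display word."""
    return NUMBER_TO_DISPLAY_WORD[recognized_number]


for number, reading in RECOGNITION_ROM.items():
    print(reading, "->", display_word_for(number))
```

Each of the three readings resolves to the same display word, so a single commentary entry for "ISO" suffices in this sketch.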

[0081] In this embodiment, the voice recognition circuit 5 supplies the CPU 10 with the voice recognition word numbers together with the likelihoods of the corresponding recognition target words. Alternatively, the voice recognition circuit 5 may supply the CPU 10 with, for example, only the voice recognition word numbers, in descending order of the likelihood of the corresponding recognition target words. In this case, the CPU 10 can determine which recognition target words have high likelihood from the order of the voice recognition word numbers.

[0082] Furthermore, this embodiment has not specifically addressed the case where several display words in the sentence displayed on the LCD 17 are associated with the same voice recognition word number (other than 0). In such a case, the CPU 10, for example, controls the character display circuit 13 so that all such display words are underlined, and sends only one of those identical voice recognition word numbers to the voice recognition circuit 5 for constructing the recognition dictionary.
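The handling of display words that share one voice recognition word number may be sketched in Python as follows, as an illustration only. The sentence is taken from the description, but the numbers, and the assumption that "So" and "so" share a number, are made up for this sketch.

```python
# Hypothetical numbers; "So"/"so" and the two occurrences of "many" each share one.
SENTENCE = [("So", 240), ("many", 241), ("countries", 242),
            ("so", 240), ("many", 241), ("customs", 243)]


def plan(sentence):
    """Return (positions to underline, numbers to send for the dictionary)."""
    # Every display word with a non-zero number is underlined, duplicates included.
    underlined = [i for i, (_word, number) in enumerate(sentence) if number != 0]
    # Each distinct number is sent once; dict.fromkeys keeps first-seen order.
    numbers = list(dict.fromkeys(number for _word, number in sentence if number != 0))
    return underlined, numbers


underlined, numbers = plan(SENTENCE)
print(underlined)  # prints [0, 1, 2, 3, 4, 5]
print(numbers)     # prints [240, 241, 242, 243]
```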

[0083] In this embodiment, not only the recognition target word corresponding to the base form of a verb but also the recognition target words corresponding to its inflected forms are associated with the dictionary word corresponding to the base form. In addition, for example, recognition target words corresponding to nouns having the same meaning may all be associated with the dictionary word corresponding to any one of those nouns.

【００８４】[0084]

[Effects of the Invention] According to the voice recognition device of claim 1 and the voice recognition method of claim 4, while display words are displayed, a voice is recognized targeting only those predetermined recognition target words that are associated with the displayed display words. Voice recognition is therefore performed using the minimum necessary set of recognition target words that the user may utter, so that voice recognition accuracy and voice recognition processing speed can be improved.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of an embodiment of an electronic dictionary device to which the present invention is applied.

FIG. 2 is a diagram illustrating the sentence data, voice recognition data, and word dictionary data stored in the sentence data ROM 14, the voice recognition data ROM 15, and the word dictionary data ROM 16 of FIG. 1, respectively.

FIG. 3 is a flowchart for explaining the operation of the electronic dictionary device of FIG. 1.

FIG. 4 is a diagram showing display states of the LCD 17.

FIG. 5 is a diagram for explaining the process of step S4 of FIG. 3.

[Explanation of symbols]

1 microphone (input means), 5 voice recognition circuit (voice recognition means), 10 CPU (search means), 14 sentence data ROM (display word storage means), 15 voice recognition data ROM (recognition target word storage means), 16 word dictionary data ROM (commentary information storage means), 17 LCD (display means)

Claims

[Claims]

1. A voice recognition device comprising: recognition target word storage means for storing recognition target words, which are words or phrases targeted for voice recognition; input means for inputting a voice; and voice recognition means for recognizing the voice input to the input means, targeting the recognition target words stored in the recognition target word storage means, the voice recognition device further comprising: display means for displaying information; and display word storage means for storing display words, which are words or phrases displayed on the display means, in association with the corresponding recognition target words, wherein the voice recognition means recognizes the voice targeting only those of the recognition target words stored in the recognition target word storage means that are associated with the display words displayed on the display means.

2. The voice recognition device according to claim 1, wherein the display means displays those of the display words that are associated with the recognition target words so as to be distinguished from the other display words.

3. The voice recognition device according to claim 1, further comprising: commentary information storage means for storing commentary information explaining words or phrases; and search means for searching the commentary information storage means for the commentary information, wherein the recognition target word storage means stores the recognition target words in association with the corresponding commentary information, the search means searches for the commentary information associated with the recognition target word recognized by the voice recognition means, and the display means displays the commentary information retrieved by the search means.

4. A voice recognition method for a voice recognition device comprising: recognition target word storage means for storing recognition target words, which are words or phrases targeted for voice recognition; input means for inputting a voice; voice recognition means for recognizing the voice input to the input means, targeting the recognition target words stored in the recognition target word storage means; display means for displaying information; and display word storage means for storing display words, which are words or phrases displayed on the display means, in association with the corresponding recognition target words, the method comprising: causing the display means to display the display words; and causing the voice recognition means to recognize the voice targeting only those of the recognition target words stored in the recognition target word storage means that are associated with the display words displayed on the display means.