JP4622861B2

JP4622861B2 - Voice input system, voice input method, and voice input program

Info

Publication number: JP4622861B2
Application number: JP2005517688A
Authority: JP
Inventors: 健花沢; 誠也長田
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2004-02-10
Filing date: 2005-02-02
Publication date: 2011-02-02
Anticipated expiration: 2025-02-02
Also published as: JPWO2005076259A1; WO2005076259A1

Description

The present invention relates to a voice input system, a voice input method, and a voice input program, and more particularly to a voice input system, a voice input method, and a voice input program that use speech recognition.

In systems where words or phrases are entered, such as electronic dictionaries and station-name, address, or personal-name entry, speech recognition can be used as an input method to save the effort of key input.

An example of a conventional voice input system is described in Patent Document 1. This conventional system comprises a voice input device, speech recognition means, a key input device, key input control means, category-specific dictionary search means, a recognition dictionary, recognition dictionary search means, and character input means.

The conventional voice input system described above operates as follows.

That is, when speech is input from the voice input device while a key on the key input device is held down, the input speech data is recognized by the speech recognition means. The category-specific dictionary search means performs a category-specific dictionary search, which searches the recognition dictionary only among records of the category type assigned to the pressed key, and the recognition result corresponding to the matched record is output.

Patent Document 1: JP 2001-159896 A

The problem with the invention described in Patent Document 1 is that, even when the recognition targets are limited by key input to compensate for recognition accuracy, misrecognition can still occur, and as a result the user may be unable to select the intended word.

The reason is that there is no means of recovery after a misrecognition.

An object of the present invention is to provide a voice input system that has a means of recovery after a misrecognition, by displaying a list of the words to be searched.

The invention according to claim 1 is a voice input system comprising: a word input dictionary that stores words ordered according to a certain order relation; a speech recognition dictionary in which subsets of recognition words, each associated with a key and recognized on the basis of the order relation, are defined; speech recognition means that, for a key input through key input means and an input speech, recognizes the input speech using the subset of recognition words corresponding to the key and outputs recognition result candidates; recognition candidate display means that displays a word list of the recognition result candidates on display means; and search dictionary word display means that, when one of the recognition words among the recognition result candidates is selected through the key input means, displays on the display means the words in the word input dictionary that are near, in the order relation, the word corresponding to the selected recognition word, and, after displaying the nearby words, uses the key input means again to recursively narrow the displayed set of words to a subset and display it.

The invention according to claim 2 is the voice input system according to claim 1, wherein the order relation between words is phonetic-notation order or Japanese syllabary (gojūon) order, and one of, or a set of, the first one or more characters in the phonetic notation of the word to be input by voice can be specified by key input through the key input means.

The invention according to claim 3 is the voice input system according to claim 1 or 2, wherein the speech recognition means starts voice input in response to a key input.

The invention according to claim 4 is the voice input system according to any one of claims 1 to 3, wherein the search dictionary word display means fixes the common leading part of the one or more word sets that it has displayed, accepts the first one or more characters of the non-common part through the key input means again, and recursively narrows the displayed word set to a subset and displays it.

The invention according to claim 5 is a voice input method comprising: a word input dictionary that stores words ordered according to a certain order relation; a speech recognition dictionary in which subsets of recognition words, each associated with a key and recognized on the basis of the order relation, are defined; a step of recognizing, for a key input through key input means and an input speech, the input speech using the subset of recognition words corresponding to the key, and outputting recognition result candidates; a step of displaying a list of the recognition words of the recognition result candidates on display means; and a step of, when one of the recognition words among the recognition result candidates is selected through the key input means, displaying on the display means the words in the word input dictionary that are near, in the order relation, the word corresponding to the selected recognition word, and, after displaying the nearby words, using the key input means again to recursively narrow the displayed set of words to a subset and display it.

The invention according to claim 6 is the voice input method according to claim 5, wherein the order relation between words is phonetic-notation order or Japanese syllabary (gojūon) order, and the method includes a step of specifying, by key input through the key input means, one of, or a set of, the first one or more characters in the phonetic notation of the word to be input by voice.

The invention according to claim 7 is the voice input method according to claim 5 or 6, comprising a step of starting voice input in response to a key input.

The invention according to claim 8 is the voice input method according to any one of claims 5 to 7, comprising a step of fixing the common leading part of the one or more word sets selected from the word input dictionary and displayed, accepting the first one or more characters of the non-common part through the key input means again, and recursively narrowing the displayed word set to a subset and displaying it.

The invention according to claim 9 is a voice input program that, with a word input dictionary that stores words ordered according to a certain order relation and a speech recognition dictionary in which subsets of recognition words, each associated with a key and recognized on the basis of the order relation, are defined, causes a computer to execute: a step of recognizing, for a key input through key input means and an input speech, the input speech using the subset of recognition words corresponding to the key, and outputting recognition result candidates; a step of displaying a list of the recognition words of the recognition result candidates on display means; and a step of, when one of the recognition words among the recognition result candidates is selected through the key input means, displaying on the display means the words in the word input dictionary that are near, in the order relation, the word corresponding to the selected recognition word, and, after displaying the nearby words, using the key input means again to recursively narrow the displayed set of words to a subset and display it.

The invention according to claim 10 is the voice input program according to claim 9, wherein the order relation between words is phonetic-notation order or Japanese syllabary (gojūon) order, and the program causes the computer to execute a step of specifying, by key input through the key input means, one of, or a set of, the first one or more characters in the phonetic notation of the word to be input by voice.

The invention according to claim 11 is the voice input program according to claim 9 or 10, which causes the computer to execute a step of starting voice input in response to a key input.

The invention according to claim 12 is the voice input program according to any one of claims 9 to 11, which causes the computer to execute a step of fixing the common leading part of the one or more word sets selected from the word input dictionary and displayed, accepting the first one or more characters of the non-common part through the key input means again, and recursively narrowing the displayed word set to a subset and displaying it.

The effect of the present invention is that the intended word can be selected even if a misrecognition occurs.

The reason is that, when a search dictionary word is looked up from the recognition result candidates obtained by speech recognition, the words immediately before and after it in the search dictionary are presented at the same time.

Next, a first best mode for carrying out the present invention will be described in detail with reference to the drawings.

FIG. 1 is a block diagram showing the overall configuration of the voice input system according to the first best mode for carrying out the present invention.

Referring to FIG. 1, the voice input system according to the first best mode comprises: a search dictionary 109 in which words are registered, for example, in Japanese syllabary (gojūon) order (any order relation defined in advance between the words may be used instead); a recognition dictionary 105; a microphone 103 for voice input; a key input device 104 (for example, a keyboard) that accepts the key input for turning the microphone on, the key input for candidate selection, and the key input for word selection; a display device 111 such as a display; a speech recognition unit 106 that uses the recognition dictionary 105 to search the input speech for a plurality of candidates in descending order of likelihood; a recognition candidate display unit 107 that displays on the display device 111 the list of candidate words produced by the speech recognition unit 106 and lets the user select one of them by the key input for candidate selection; and a search dictionary word display unit 108 that, when a candidate is selected in the recognition candidate display unit 107, selects from the search dictionary 109 the selected candidate and the words before and after it in gojūon order and displays them on the display device 111 in gojūon order (or in whatever order relation is defined in advance between the words).

The search dictionary 109 and the recognition dictionary 105 are stored in a storage device such as a memory or a hard disk. The speech recognition unit 106, the recognition candidate display unit 107, and the search dictionary word display unit 108 can be implemented on a computer as hardware, software, or a combination of the two. Although not shown, the voice input system contains a main storage device, and the speech recognition unit 106, the recognition candidate display unit 107, and the search dictionary word display unit 108 use this main storage device as a hardware resource. For example, the speech recognition unit 106 stores the recognition result candidates in the main storage device, and the recognition candidate display unit 107 reads them from the main storage device and displays them on the display device 111. Because use of the main storage device is ordinary behavior for an information processing device, it is not described explicitly in what follows.

Next, the operation of the voice input system according to the first best mode for carrying out the present invention will be described with reference to the drawings.

FIG. 2 is a flowchart showing the operation of the voice input system according to the first best mode for carrying out the present invention.

Processing starts in response to the microphone-on key input, and the microphone 103 receives the input speech (step S02 in FIG. 2). The speech recognition unit 106 recognizes the input speech using the recognition dictionary 105 and outputs recognition result candidates (step S03). The recognition candidate display unit 107 displays the recognition result candidates obtained by speech recognition on the display device 111 (step S04). When the user selects one of the displayed candidates by the key input for candidate selection on the key input device 104 (step S05), the search dictionary word display unit 108 selects from the search dictionary 109 the selected candidate and the words before and after it in gojūon order, and displays them on the display device 111 as search dictionary words (search results) (step S06). The user then selects one of the displayed search dictionary words (search results) by the key input for word selection on the key input device 104 (step S07).

Next, a voice input system according to a second best mode for carrying out the present invention will be described with reference to the drawings.

FIG. 3 is a block diagram showing the overall configuration of the voice input system according to the second best mode for carrying out the present invention.

The second best mode of the present invention comprises: a search dictionary 109 in which words are registered, for example, in gojūon order; a recognition dictionary 301 in which subsets of the words in the dictionary are each associated with a key; a microphone 103 for voice input; a display device 111 such as a display; a key input device 104 that accepts the key input for turning the microphone on, the key input for dictionary selection, the key input for candidate selection, and the key input for word selection; a dictionary selection unit 302 that selects, according to the key input for dictionary selection, the subset of the recognition dictionary 301 to be used as the recognition target; a speech recognition unit 106 that uses the subset of the recognition dictionary 301 selected by the dictionary selection unit 302 to search the input speech for a plurality of candidates in descending order of likelihood; a recognition candidate display unit 107 that displays the list of candidate words produced by the speech recognition unit 106 and lets the user select one of them by the key input for candidate selection; and a search dictionary word display unit 108 that, when a candidate is selected in the recognition candidate display unit 107, selects from the search dictionary 109 the selected candidate and the words before and after it in gojūon order and displays them in gojūon order.

Next, the operation of the voice input system according to the second best mode for carrying out the present invention will be described with reference to the drawings.

FIG. 4 is a flowchart showing the operation of the voice input system according to the second best mode for carrying out the present invention.

Referring to FIG. 4, processing starts in response to the key input for dictionary selection, and the dictionary selection unit 302 selects a subset of the recognition dictionary 301 according to the key that was input (step A02 in FIG. 4). The key input device 104 accepts the key input for turning the microphone on (step A03). The microphone 103 receives the input speech 101 (step A04). The speech recognition unit 106 recognizes the input speech using the recognition dictionary 105 and outputs recognition result candidates (step A05). The recognition candidate display unit 107 displays the recognition result candidates obtained by speech recognition (step A06). When the user selects one of the displayed candidates by the key input for candidate selection on the key input device 104 (step A07), the search dictionary word display unit 108 selects from the search dictionary 109 the selected candidate and the words before and after it in gojūon order, and displays them on the display device 111 as search dictionary words (search results) (step A08). The user then selects one of the displayed search dictionary words (search results) by the key input for word selection on the key input device 104 (step A09).

Next, another operation of the voice input system according to the second best mode for carrying out the present invention will be described with reference to the drawings.

FIG. 5 is a flowchart showing another operation of the voice input system according to the second best mode for carrying out the present invention.

Referring to FIG. 5, processing starts in response to the key input 102 for dictionary selection, and the dictionary selection unit 302 selects a subset of the recognition dictionary 301 according to the key that was input (step B02 in FIG. 5). The key input device 104 accepts the key input for turning the microphone on (step B03). The microphone 103 receives the input speech (step B04). The speech recognition unit 106 recognizes the input speech (step B05). The recognition candidate display unit 107 displays the recognition result candidates obtained by speech recognition on the display device 111 (step B06). When the user selects one of the displayed candidates by the key input for candidate selection on the key input device 104 (step B07), the search dictionary word display unit 108 selects from the search dictionary 109 the selected candidate and the words before and after it in gojūon order, and displays them on the display device 111 as search dictionary words (search results) (step B08). To narrow the displayed search dictionary words (search results) further, the key input device 104 accepts a second or subsequent key input (step B09). When the redisplayed search results are not to be narrowed any further, the user selects one of them by the key input for word selection on the key input device 104 (step B10).
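The repeated narrowing of step B09 can be sketched as follows, using the scheme stated in claim 4: the common leading part of the displayed words is fixed, and the next key input picks the first character of the non-common part. This is only an illustration. The function names, the sample readings, and the assumption that each extra key input arrives as one or more kana characters are hypothetical and not taken from the patent.

```python
from os.path import commonprefix


def narrow(words, next_chars):
    """Keep the words whose first character after the shared prefix is in next_chars."""
    prefix = commonprefix(words)   # common leading part of the displayed set
    allowed = set(next_chars)      # characters accepted from the extra key input
    k = len(prefix)
    return [w for w in words if w[k:k + 1] in allowed]


def narrow_repeatedly(words, key_inputs):
    """Apply narrow() once per additional key input, stopping when one word is left."""
    for next_chars in key_inputs:
        if len(words) <= 1:
            break
        words = narrow(words, next_chars)
    return words


# Hypothetical readings currently shown to the user (already in gojuon order).
displayed = ["けいかい", "けいかく", "けいかん", "けいき", "けいさん"]

print(narrow(displayed, "か"))                     # the shared prefix is "けい"
print(narrow_repeatedly(displayed, ["か", "ん"]))  # two extra key inputs
```

Each pass recomputes the shared prefix, so the user only ever has to supply the next distinguishing character rather than retype what is already fixed.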

Next, an example of the first best mode for carrying out the present invention will be described, taking dictionary word search as an example.

FIG. 6 is an explanatory diagram showing the operation of the example of the first best mode for carrying out the present invention.

FIG. 10 is an explanatory diagram showing the search dictionary word list that the search dictionary word display unit 108 displays on the display device 111.

FIG. 11 is an explanatory diagram showing the recognition result candidates and the search dictionary word list displayed on the display device 111.

Referring to FIG. 6, when the user presses the microphone-on key and utters keikan (警官, "police officer"), the microphone 103 picks up the speech and the speech recognition unit 106 recognizes it. The recognition dictionary 105 registers words in hiragana. When the speech recognition unit 106 outputs recognition result candidates such as eikan (えいかん) and keikan (けいかん), for example together with their likelihood ranking, the recognition candidate display unit 107 displays them on the display device 111 in such a way that the user can tell which candidate is the most likely (for example, by underlining it). When the user selects one of the candidates (here, keikan), for example by clicking it, the search dictionary word display unit 108 displays on the display device 111, as shown in FIG. 10, a list consisting of the word in the search dictionary that corresponds to keikan and the words before and after it in gojūon order (another order may be used): 警戒 (keikai, "vigilance"), 計画 (keikaku, "plan"), 警官 (keikan, "police officer"), 景観 (keikan, "landscape"), 景気 (keiki, "business conditions"), and so on. A candidate that the user did not intend may be displayed as the most likely one (underlined), but if the user selects the intended candidate, the search dictionary word display unit 108 displays the list in the same way as in FIG. 10.
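The neighbor lookup performed by the search dictionary word display unit 108 can be sketched with a binary search over an ordered list. This is a minimal illustration, not the patent's implementation: the data layout, the `neighbors` function, and the `window` parameter are assumptions, the entries for えいかん and けいさん are made-up filler, and plain code-point ordering of hiragana is used as a stand-in for true gojūon order.

```python
from bisect import bisect_left

# Hypothetical search dictionary: (reading, headword) pairs ordered by reading.
# sorted() is stable, so words with the same reading keep their listed order.
SEARCH_DICTIONARY = sorted(
    [
        ("けいかい", "警戒"),
        ("けいかく", "計画"),
        ("けいかん", "警官"),
        ("けいかん", "景観"),
        ("けいき", "景気"),
        ("えいかん", "栄冠"),
        ("けいさん", "計算"),
    ],
    key=lambda entry: entry[0],
)
READINGS = [reading for reading, _ in SEARCH_DICTIONARY]


def neighbors(selected_reading, window=2):
    """Return the entries around the position of selected_reading in the ordering."""
    pos = bisect_left(READINGS, selected_reading)
    start = max(0, pos - window)
    end = min(len(SEARCH_DICTIONARY), pos + window + 1)
    return SEARCH_DICTIONARY[start:end]


# Correct candidate selected: the list is centred on けいかん.
print([word for _, word in neighbors("けいかん")])
# Nearby but wrong candidate selected: 警官 is still in the list.
print([word for _, word in neighbors("けいかく")])
```

Because the lookup only needs a position in the ordering, a candidate that is merely close to the intended word still produces a list that contains it, which is the recovery behavior the description relies on.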

As shown in FIG. 11, the recognition candidate display unit 107 can also cooperate with the search dictionary word display unit 108 to display the search dictionary word list corresponding to the most likely recognition candidate received from the speech recognition unit 106. The left side of FIG. 11 shows the recognition result candidates, and the right side shows the search dictionary word list.

Because the intended word 警官 (keikan, "police officer") is in the list, the user selects it and obtains, for example, the meaning of the word in the case of a Japanese dictionary, or its English translation in the case of a Japanese-English dictionary. This example assumes the word search function of a Japanese dictionary or a Japanese-English dictionary, so the input language is Japanese and the search target is a single word, but the same applies when the input language is not Japanese or when the search target is a combination of several words.

For example, assuming word search in an English-Japanese dictionary, suppose the user wants the Japanese translation of "police station" and utters porisu sutēshon (ぽりすすてーしょん). If speech recognition yields "police station" itself, or a candidate close to it in phonetic-notation order or alphabetical order, as a recognition result candidate, then selecting that candidate makes "police station" selectable, and the user finally obtains the Japanese translation of the intended word or phrase.

The recognition dictionary 301 may contain every word in the search dictionary 109, but it is also possible to register only representative words and to select the others from the search dictionary word list. For example, when the search dictionary 109 holds the names of stations in Tokyo, the recognition dictionary 301 may register only Shinjuku (しんじゅく), and Shinjuku-gyoemmae (新宿御苑前) and Shinjuku-sanchome (新宿三丁目) may be selected from the search dictionary word list. Likewise, the recognition dictionary 301 may contain words that are not in the search dictionary 109.
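The station-name arrangement can be sketched as a small recognition vocabulary that indexes into a larger ordered search dictionary. This is a hedged illustration under assumptions: the station list is a tiny sample chosen for the example, `listing_for` and its `before`/`after` parameters are hypothetical names, and code-point ordering of hiragana again stands in for gojūon order.

```python
from bisect import bisect_left

# Hypothetical search dictionary of Tokyo station names, ordered by reading.
SEARCH_DICTIONARY = sorted(
    [
        ("しながわ", "品川"),
        ("しんじゅく", "新宿"),
        ("しんじゅくぎょえんまえ", "新宿御苑前"),
        ("しんじゅくさんちょうめ", "新宿三丁目"),
        ("しんばし", "新橋"),
    ],
    key=lambda entry: entry[0],
)
READINGS = [reading for reading, _ in SEARCH_DICTIONARY]

# The recognition dictionary holds representative readings only.
RECOGNITION_DICTIONARY = ["しながわ", "しんじゅく", "しんばし"]


def listing_for(recognized, before=1, after=2):
    """List search dictionary entries around the recognized representative reading."""
    pos = bisect_left(READINGS, recognized)
    return SEARCH_DICTIONARY[max(0, pos - before):pos + after + 1]


names = [name for _, name in listing_for("しんじゅく")]
print(names)
```

Since `bisect_left` returns an insertion point even for a reading that is absent from the search dictionary, the same lookup also covers the case where the recognition dictionary contains words the search dictionary does not.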

Next, a first example of the second best mode for carrying out the present invention will be described.

FIG. 7 is an explanatory diagram showing the operation of the first example of the second best mode for carrying out the present invention.

The words in the recognition dictionary 301 are divided into subsets by their first character. Following the user's key input for dictionary selection, the speech recognition unit 106 performs recognition only against the subset whose first character matches the key. When the user wants to look up the word 警官 (keikan, "police officer"), the user enters its first letter "k" as the key input for dictionary selection, presses the microphone-on key, and utters keikan. The dictionary selection unit 302 then selects the subset of the recognition dictionary 301 consisting only of words that begin with a character from the ka row (か行) or the ga row (が行), and outputs it to the speech recognition unit 106. The speech recognition unit 106 recognizes the speech from the microphone 103 and, in accordance with the output of the dictionary selection unit 302, outputs recognition result candidates such as keikaku (けいかく) and keisan (けいさん). The recognition candidate display unit 107 displays the recognition result candidates on the display device 111.
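The selection made by the dictionary selection unit 302 can be sketched as a lookup from a key to a set of initial kana, followed by a prefix filter. This is only a sketch under stated assumptions: the text specifies only that "k" selects the ka and ga rows, so the other table entries are filled in by analogy, and the table, function name, and sample vocabulary are hypothetical.

```python
# Key -> initial kana allowed for recognition.
KEY_TO_INITIAL_KANA = {
    "k": "かきくけこがぎぐげご",  # from the text: "k" selects the ka and ga rows
    "s": "さしすせそざじずぜぞ",  # assumed by analogy
    "t": "たちつてとだぢづでど",  # assumed by analogy
}


def select_subset(recognition_words, key):
    """Return the recognition words whose first kana belongs to the rows for key."""
    initials = tuple(KEY_TO_INITIAL_KANA[key])
    # str.startswith accepts a tuple and is False for an empty string.
    return [word for word in recognition_words if word.startswith(initials)]


# Hypothetical recognition vocabulary, stored as hiragana readings.
vocabulary = ["えいかん", "けいかく", "けいかん", "けいさん", "げんかん", "せいかん"]

print(select_subset(vocabulary, "k"))
```

With the subset applied, a confusable reading such as えいかん can no longer be returned for an utterance of けいかん, because it never enters the recognizer's search space.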

ユーザにより、目的の単語に五十音順で近い「けいかく」が選択されると、検索辞書単語表示部１０８は、図１０に示すように「計画」「警官」「景観」などの検索辞書単語一覧を表示する。このとき、ユーザは、目的の単語である「警官」が一覧中に存在するのでそれを選択可能となる。 When the user selects “Keikaku” close to the target word in the order of the Japanese syllabary, the search dictionary word display unit 108 displays a search dictionary such as “plan”, “cop”, and “landscape” as shown in FIG. Display a word list. At this time, the user can select the target word “police officer” because it is in the list.

In this example, even if misrecognition occurs and "keikan" is not obtained as a recognition result candidate, the first character is restricted to the "ka" row or the "ga" row, so words close to "keikan" in Japanese syllabary order, such as "keikaku" and "keisan", are obtained as recognition result candidates. This makes it easy for the user to select the nearby candidate "keikaku". When the user selects "keikaku", the search dictionary word display unit 108 displays 計画 ("plan"), 警官 ("police officer"), 景観 ("landscape"), and so on as the search dictionary word list in Japanese syllabary order, so the user can easily select the target word 警官. In other words, restricting the first character of the word and displaying the list in Japanese syllabary order together allow words to be input by voice easily and with high accuracy.

In the search dictionary word list, because "k" has been specified, only words beginning with a character in the "ka" row or the "ga" row may be displayed. Alternatively, words beginning with characters that come just before or after the "ka" row or the "ga" row in Japanese syllabary order may also be included.

In the example of FIG. 7, only one leading character is specified, but the same applies when multiple leading characters are specified, or when a word type is specified instead of characters. For example, to look up "keikan", the user can key in "k" and "e" in succession so that the dictionary selection unit 302 selects, as the recognition target, the subset of the recognition dictionary 301 consisting only of words beginning with け ("ke"). This makes 警官 ("police officer") more likely to appear as a recognition result candidate.
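Specifying several leading characters can be sketched as a prefix filter. The romaji table below covers only the "ka" row for brevity, and the table, the sample readings, and the function name `subset_for_keys` are hypothetical.

```python
# Hypothetical romaji-to-kana table, limited to the "ka" row for brevity.
ROMAJI_TO_KANA = {"ka": "か", "ki": "き", "ku": "く", "ke": "け", "ko": "こ"}

# Hypothetical recognition dictionary, stored as hiragana readings.
RECOGNITION_DICTIONARY = ["かいぎ", "けいかく", "けいかん", "けいさん", "こうえん"]


def subset_for_keys(dictionary, keys):
    """Combine successive key presses into one kana and keep matching words."""
    kana = ROMAJI_TO_KANA.get("".join(keys))
    if kana is None:
        return []  # Unknown key sequence: nothing is selected in this sketch.
    return [word for word in dictionary if word.startswith(kana)]


if __name__ == "__main__":
    print(subset_for_keys(RECOGNITION_DICTIONARY, ["k", "e"]))
```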

As word types, semantic categories such as "food", "vehicle", and "occupation" can also be defined, with a key of the key input device 104 assigned to each semantic category. For example, when the user presses the key corresponding to the "occupation" category, the dictionary selection unit 302 selects, as the recognition target, the subset of the recognition dictionary 301 consisting only of occupation words. As a result, 警官 ("police officer") becomes more likely to appear as a recognition result candidate when the user utters "keikan".
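Selection by semantic category can be sketched in the same way as selection by first character. The key assignments, the category tags, and the function name `select_by_category` are assumptions for illustration only.

```python
# Hypothetical assignment of keys to semantic categories.
CATEGORY_KEYS = {"1": "food", "2": "vehicle", "3": "occupation"}

# Hypothetical recognition dictionary: (reading, written form, category).
RECOGNITION_DICTIONARY = [
    ("けいかん", "警官", "occupation"),
    ("いしゃ", "医者", "occupation"),
    ("けーき", "ケーキ", "food"),
    ("でんしゃ", "電車", "vehicle"),
]


def select_by_category(dictionary, key):
    """Return the (reading, written form) pairs tagged with the pressed category."""
    category = CATEGORY_KEYS.get(key)
    return [(reading, written) for reading, written, tag in dictionary if tag == category]


if __name__ == "__main__":
    print(select_by_category(RECOGNITION_DICTIONARY, "3"))
```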

Next, a second example of the second best mode for carrying out the present invention will be described.

FIG. 8 is an explanatory diagram showing the operation of the second example of the second best mode for carrying out the present invention.

Referring to FIG. 8, each word in the recognition dictionary 301 is assigned to a subset according to its first character, and speech recognition is performed only on the subset whose first character matches the key the user pressed to select a dictionary. To look up the word 警官 ("police officer"), the user utters "keikan" while holding down "k", its first character, which here also serves as the microphone-on key. The dictionary selection unit 302 then selects the subset of the recognition dictionary 301 consisting only of words beginning with a character in the "ka" row or the "ga" row, and outputs it to the speech recognition unit 106.

The speech recognition unit 106 recognizes the input speech from the microphone 103 and, in accordance with the output of the dictionary selection unit 302, outputs recognition result candidates such as "keikaku" and "keisan". The recognition candidate display unit 107 displays the recognition result candidates from the speech recognition unit 106 on the display device 111. When the user selects "keikaku", which is close to the target word in Japanese syllabary order, the search dictionary word display unit 108 displays a search dictionary word list containing 計画 ("plan"), 警官 ("police officer"), 景観 ("landscape"), and so on, as shown in FIG. 10. Because the target word 警官 appears in the list, the user can select it. Compared with the example of FIG. 7, this example requires one fewer key input from the user, which reduces the effort of key input.

Next, a third example of the second best mode for carrying out the present invention will be described.

FIG. 9 is an explanatory diagram showing the operation of the third example of the second best mode for carrying out the present invention.

Referring to FIG. 9, each word in the recognition dictionary 301 is assigned to a subset according to its first character, and speech recognition is performed only on the subset whose first character matches the key the user pressed to select a dictionary. To look up the word 警官 ("police officer"), the user keys in its first character, "k", to select the dictionary, then presses the microphone-on key and utters "keikan". The dictionary selection unit 302 selects the subset of the recognition dictionary 301 consisting only of words beginning with a character in the "ka" row or the "ga" row, and outputs it to the speech recognition unit 106. The speech recognition unit 106 recognizes the input speech from the microphone 103 and, in accordance with the output of the dictionary selection unit 302, outputs, for example, "keikaku" and "keisan" as recognition result candidates. The recognition candidate display unit 107 displays the recognition result candidates on the display device 111. When the user selects "keikaku", which is close to the target word in Japanese syllabary order, the search dictionary word display unit 108 displays 計画 ("plan"), 警官 ("police officer"), 景観 ("landscape"), and so on as the search dictionary word list, as shown in FIG. 10. At this point, the search dictionary word display unit 108 automatically fixes the input up to "keika" (けいか), the part common to the listed words. When the user then enters "n" as the key input for the next word selection, the search dictionary word display unit 108 displays a further narrowed search dictionary word list containing only 警官 and 景観. By repeating this procedure, the user can select the target word 警官.
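The narrowing step described for FIG. 9 can be sketched as two small functions: one that finds the part common to all displayed readings, and one that keeps only the entries whose next character matches the pressed key. The `KEY_TO_KANA` table (including the assumption that "n" covers the "na" row and ん), the sample entries, and the function names are hypothetical.

```python
# Hypothetical mapping from a romaji key to the kana it may stand for.
KEY_TO_KANA = {
    "k": set("かきくけこがぎぐげご"),
    "n": set("なにぬねのん"),
}

# Hypothetical displayed list: (reading in hiragana, written form).
DISPLAYED = [("けいかく", "計画"), ("けいかん", "警官"), ("けいかん", "景観")]


def common_prefix(readings):
    """Return the longest prefix shared by every reading."""
    prefix = readings[0] if readings else ""
    for reading in readings[1:]:
        # Shorten the prefix until this reading starts with it.
        while not reading.startswith(prefix):
            prefix = prefix[:-1]
    return prefix


def narrow(entries, key):
    """Keep entries whose first character after the common prefix matches the key."""
    position = len(common_prefix([reading for reading, _ in entries]))
    allowed = KEY_TO_KANA.get(key, set())
    return [
        (reading, written)
        for reading, written in entries
        if len(reading) > position and reading[position] in allowed
    ]


if __name__ == "__main__":
    print(common_prefix([reading for reading, _ in DISPLAYED]))
    print(narrow(DISPLAYED, "n"))
```

Calling `narrow` again on its own result repeats the procedure; in this sample the two remaining entries share their whole reading, so the final choice is between homophones.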

In the examples of FIGS. 7 to 9, the dictionary selection unit 302 selects a subset of the recognition dictionary 301. Equally, the dictionary selection unit 302 may select one or more of a plurality of recognition dictionaries 301 in response to the key input for dictionary selection.

FIG. 12 is an explanatory diagram showing a display example of the selection of the recognition dictionary 301.

Referring to FIG. 12, a configuration is possible in which Japanese place names are prepared as a separate recognition dictionary 301 for each prefecture. After selecting 東京 (Tokyo) by key input, the user inputs by voice "Kitami", a place name within Tokyo.
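Choosing one or more whole dictionaries, rather than a subset of a single dictionary, can be sketched as follows. The per-prefecture word lists are illustrative sample data, and the names `PREFECTURE_DICTIONARIES` and `select_dictionaries` are assumptions not found in the specification.

```python
# Hypothetical per-prefecture recognition dictionaries (hiragana readings).
PREFECTURE_DICTIONARIES = {
    "東京": ["きたみ", "しんじゅく", "うえの"],
    "北海道": ["きたみ", "さっぽろ"],
    "大阪": ["うめだ", "なんば"],
}


def select_dictionaries(prefectures):
    """Merge the selected dictionaries into one vocabulary without duplicates."""
    vocabulary = []
    for prefecture in prefectures:
        for word in PREFECTURE_DICTIONARIES.get(prefecture, []):
            if word not in vocabulary:
                vocabulary.append(word)
    return vocabulary


if __name__ == "__main__":
    # Selecting Tokyo alone limits recognition to Tokyo place names.
    print(select_dictionaries(["東京"]))
```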

Next, a third best mode for carrying out the present invention will be described with reference to the drawings.

The third best mode for carrying out the present invention is a method including the steps of FIG. 2, FIG. 4, or FIG. 5.

Next, a fourth best mode for carrying out the present invention will be described with reference to the drawings.

The fourth best mode for carrying out the present invention is a program that causes a computer (the speech recognition unit 106, the recognition candidate display unit 107, the search dictionary word display unit 108, and the dictionary selection unit 302) to execute each step of the third best mode.

In the above description, speech is input from the microphone 103. However, a configuration is also possible in which speech data is input via a network, converted into a speech waveform, and recognized by the speech recognition unit 106.

As described above, the present invention has the following effects.

The first effect is that the target word can be selected even if misrecognition occurs.

The reason is that, when a word in the search dictionary 109 is looked up from a recognition result candidate obtained by speech recognition, the words immediately before and after it in the search dictionary 109 are displayed at the same time.

The second effect is that misrecognition becomes less likely and the target word becomes easier to select.

The first reason is that dividing the words of the recognition dictionary 301 into subsets makes the search more efficient, which improves both processing speed and recognition accuracy. In addition, having the user specify the first character guarantees that the first character of the recognition result is correct.

The second reason is that requiring a key input to select the recognition dictionary 301 relevant to the utterance at the start of speech recognition makes the user conscious of the word to be spoken and encourages careful articulation.

The third effect is that, when only some of the words in a large search dictionary 109 are present in the recognition dictionary 301, words of the search dictionary 109 that are absent from the recognition dictionary 301 can still be selected.

The reason is that, even if such a word is misrecognized as a nearby word because it is absent from the recognition dictionary 301, the words before and after that nearby word can be selected from the search dictionary 109.

The fourth effect is that, when the user's target word does not exist in the search dictionary 109, the user can learn that it does not exist there.

The reason is that displaying, for example, the words that come just before and after the target word's position in Japanese syllabary order shows that the word itself is not in the search dictionary 109.

The present invention is applicable to various electronic devices. For example, it can be applied to an electronic dictionary used to look up the meaning or translation of a word. It can also be applied to the input interface of a mobile phone, a personal digital assistant, or a similar device.

FIG. 1 is a block diagram showing the overall configuration of the voice input system of the first best mode for carrying out the present invention.
FIG. 2 is a flowchart showing the operation of the voice input system of the first best mode for carrying out the present invention.
FIG. 3 is a block diagram showing the overall configuration of the voice input system of the second best mode for carrying out the present invention.
FIG. 4 is a flowchart showing the operation of the voice input system of the second best mode for carrying out the present invention.
FIG. 5 is a flowchart showing another operation of the voice input system of the second best mode for carrying out the present invention.
FIG. 6 is an explanatory diagram showing the operation of the example of the first best mode for carrying out the present invention.
FIG. 7 is an explanatory diagram showing the operation of the first example of the second best mode for carrying out the present invention.
FIG. 8 is an explanatory diagram showing the operation of the second example of the second best mode for carrying out the present invention.
FIG. 9 is an explanatory diagram showing the operation of the third example of the second best mode for carrying out the present invention.
FIG. 10 is an explanatory diagram showing a search dictionary word list.
FIG. 11 is an explanatory diagram showing recognition result candidates and a search dictionary word list.
FIG. 12 is an explanatory diagram showing a display example of the selection of the recognition dictionary.

Explanation of symbols

103 Microphone
104 Key input device
105 Recognition dictionary
106 Speech recognition unit
107 Recognition candidate display unit
108 Search dictionary word display unit
109 Search dictionary
111 Display device
301 Recognition dictionary
302 Dictionary selection unit

Claims

1. A voice input system comprising:
a word input dictionary that stores words arranged according to a fixed order relation;
a speech recognition dictionary in which subsets of recognition words, each associated with a key and recognized on the basis of the order relation, are defined;
speech recognition means for recognizing input speech using the subset of recognition words that corresponds to a key input via key input means, and outputting recognition result candidates;
recognition candidate display means for displaying a word list of the recognition result candidates on display means; and
search dictionary word display means for, when one of the recognition words among the recognition result candidates is selected via the key input means, displaying on the display means the words in the word input dictionary that are in the order relation with the selected recognition word, determining the common part from the beginning of the displayed set of one or more words, and, when the first one or more characters of the non-common part are specified again using the key input means, recursively narrowing the displayed word set to a subset and displaying it.

2. The voice input system according to claim 1, wherein the order relation between words is phonetic notation order or Japanese syllabary order, and one character, or a set of one or more characters, in the phonetic notation of the word to be input by voice can be specified via the key input means.

3. The voice input system according to claim 1, wherein the speech recognition means starts voice input in response to a key input.

4. A voice input method for a voice input system that includes a word input dictionary that stores words arranged according to a fixed order relation, and a speech recognition dictionary in which subsets of recognition words, each associated with a key and recognized on the basis of the order relation, are defined, the method comprising:
a procedure of recognizing input speech using the subset of recognition words that corresponds to a key input via key input means, and outputting recognition result candidates;
a procedure of displaying a recognition word list of the recognition result candidates on display means; and
a procedure of, when one of the recognition words among the recognition result candidates is selected via the key input means, displaying on the display means the words in the word input dictionary that are in the order relation with the selected recognition word, determining the common part from the beginning of the displayed set of one or more words, and, when the first one or more characters of the non-common part are specified again using the key input means, recursively narrowing the displayed word set to a subset and displaying it.

5. The voice input method according to claim 4, wherein the order relation between words is phonetic notation order or Japanese syllabary order, the method further comprising a procedure of specifying, via the key input means, one character, or a set of one or more characters, in the phonetic notation of the word to be input by voice.

6. The voice input method according to claim 4 or 5, further comprising a procedure of starting voice input in response to a key input.

7. A voice input program for a voice input system that includes a word input dictionary that stores words arranged according to a fixed order relation, and a speech recognition dictionary in which subsets of recognition words, each associated with a key and recognized on the basis of the order relation, are defined, the program causing a computer of the voice input system to execute:
a procedure of recognizing input speech using the subset of recognition words that corresponds to a key input via key input means, and outputting recognition result candidates;
a procedure of displaying a recognition word list of the recognition result candidates on display means; and
a procedure of, when one of the recognition words among the recognition result candidates is selected via the key input means, displaying on the display means the words in the word input dictionary that are in the order relation with the selected recognition word, determining the common part from the beginning of the displayed set of one or more words, and, when the first one or more characters of the non-common part are specified again using the key input means, recursively narrowing the displayed word set to a subset and displaying it.

8. The voice input program according to claim 7, wherein the order relation between words is phonetic notation order or Japanese syllabary order, the program causing the computer to execute a procedure of specifying, via the key input means, one character, or a set of one or more characters, in the phonetic notation of the word to be input by voice.

9. The voice input program according to claim 7, wherein the program causes the computer to execute a procedure of starting voice input in response to a key input.