JP2005242183A

JP2005242183A - Voice recognition device, display controller, recorder device, display method and program

Info

Publication number: JP2005242183A
Application number: JP2004054499A
Authority: JP
Inventors: Kazunori Imoto; 和範井本; Munehiko Sasajima; 宗彦笹島; Hiroshi Shimomori; 大志下森
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2004-02-27
Filing date: 2004-02-27
Publication date: 2005-09-08

Abstract

PROBLEM TO BE SOLVED: To notify a user of speech-recognizable words without impairing the information-conveying function or the design of the display screen. SOLUTION: A speech recognition part 110 can recognize the recognition target words registered in a recognition dictionary 160. When a user makes a request such as a search, a request processing part 120 generates the constituent elements of a screen for displaying the result of processing the request (a search result, etc.). When the words included in that screen include a word registered in the recognition dictionary 160, a display control part 130 determines the display style for that word in accordance with display rules prepared in advance for words registered in the recognition dictionary. Consequently, registered words can be displayed in a style different from that of words not registered in the dictionary, so that recognizable words are indicated while the result of the processing requested by the user is shown. COPYRIGHT: (C)2005, JPO&NCIPI

Description

The present invention relates to a display control device, a speech recognition device, a recorder device, a display method, and a program for displaying words that can be recognized by speech recognition.

In recent years, speech recognition technology has come into use in a variety of devices, which incorporate voice input interfaces. A voice input interface makes input to a device more convenient because no input device such as a keyboard is needed. However, when the user utters a word that is not a target of speech recognition, a word different from the one the user intended may be recognized by mistake, and the device may perform an operation the user did not intend.

That is, in typical speech recognition technology, the words targeted for recognition, i.e. the recognizable words, are stored in a recognition dictionary in advance, and the stored word closest to the input speech is output as the recognition result. Consequently, when the user utters a word that is not stored in the recognition dictionary, the utterance may be misrecognized as described above or may not be recognized at all.

In particular, users who have hardly ever used a device equipped with such a voice input interface are often at a loss because they do not know what to say to make the device operate. If they have to consult a manual to find out which words are recognized, the advantage of a voice input interface, namely simpler input, is lost.

To solve these problems, the recognition target words must be appropriately reported to the user through an output interface such as audio or a display. Proposed techniques include displaying a list of recognizable words in part of the display screen (see, for example, Patent Document 1) and displaying recognizable words on an auxiliary screen in response to a help utterance (see, for example, Patent Document 2).

A technique has also been proposed in which the words that can be input to a device are displayed in different display forms such as red and blue, and, when the user utters the type of display form of the word to be input, for example "aka" (red) or "ao" (blue), the word corresponding to the uttered type of display form is input (see, for example, Patent Document 3).

Patent Document 1: JP-A-6-332665
Patent Document 2: JP-A-11-65739
Patent Document 3: JP-A-2002-278587

However, with the technique disclosed in Patent Document 1, when the number of recognition target words becomes very large, it is difficult to list all recognizable words at once, and content that should be displayed cannot be shown in the part of the screen area used for the list. The list display also detracts from the design of the screen.

With the technique disclosed in Patent Document 2, a voice command such as a help utterance must be input to display and dismiss the auxiliary screen. This adds redundant exchanges with the device, so the advantage of a voice input interface, namely simpler input, is lost.

The technique disclosed in Patent Document 3 inputs a desired word by having the user utter the type of display form. Although a displayed input item can be input as a result, the technique does not notify the user of the speech-recognizable words themselves.

That is, in the technique disclosed in Patent Document 3, the display form types such as "aka" and "ao" are the speech-recognizable words, and the technique presupposes that the user can input these words by voice. Input items are assigned to the recognizable display form types, and the user selects an input item by uttering the corresponding word, such as "aka", that is already known to be recognizable. Therefore, the display cannot be used to notify the user of speech-recognizable words. Moreover, because the input is indirect, it is not intuitively connected to operating a device equipped with a voice input interface and is hard to understand.

The present invention has been made in view of the above, and its object is to provide a speech recognition device, a display control device, a recorder device, a speech recognition method, and a program that can notify a user of speech-recognizable words without impairing the information-conveying function or the design of the display screen.

To solve the problems described above and achieve the object, a speech recognition device according to one aspect of the present invention comprises: a recognition dictionary that stores a plurality of words targeted for speech recognition and the readings of those words; speech recognition means for performing speech recognition processing on input speech with reference to the recognition dictionary; display rule storage means for storing a display rule that defines the display mode of words stored in the recognition dictionary; and display control means for, when a request processing result for an input request is displayed on a display screen and the display screen contains a word stored in the recognition dictionary, determining the display mode of that word in accordance with the display rule stored in the display rule storage means.

A speech recognition device according to another aspect of the present invention comprises: a recognition dictionary that stores a plurality of words targeted for speech recognition and the readings of those words; speech recognition means for performing speech recognition processing on input speech with reference to the recognition dictionary; display rule storage means for storing a display rule that defines the display mode of words stored in the recognition dictionary; request processing means for processing a request based on the speech recognized by the speech recognition means; and display control means for, when the result of processing an input request by the request processing means is displayed on a display screen and the display screen contains a word stored in the recognition dictionary, determining the display mode of that word in accordance with the display rule stored in the display rule storage means.

A display control device according to another aspect of the present invention displays words recognizable by a speech recognition device that includes a recognition dictionary storing a plurality of words targeted for speech recognition and the readings of those words, and speech recognition means for performing speech recognition processing on input speech with reference to the recognition dictionary. The display control device comprises: display control means for controlling the display content when a request processing result for an input request is displayed on a display screen; and display rule storage means for storing a display rule that defines the display mode of words stored in the recognition dictionary. When the result to be displayed on the display screen contains a word stored in the recognition dictionary, the display control means determines the display mode of that word in accordance with the display rule stored in the display rule storage means.

A recorder device according to another aspect of the present invention performs processing for storing images and comprises: a recognition dictionary that stores a plurality of words targeted for speech recognition and the readings of those words; speech recognition means for performing speech recognition processing on input speech with reference to the recognition dictionary; request processing means for processing a request based on the speech recognized by the speech recognition means; display control means for controlling the display content when the result of processing the request by the request processing means is displayed on a display screen; and display rule storage means for storing a display rule that defines the display mode of words stored in the recognition dictionary. When the result to be displayed on the display screen contains a word stored in the recognition dictionary, the display control means determines the display mode of that word in accordance with the display rule stored in the display rule storage means.

A display method according to another aspect of the present invention displays words recognizable by a speech recognition device that includes a recognition dictionary storing a plurality of words targeted for speech recognition and the readings of those words, and speech recognition means for performing speech recognition processing on input speech with reference to the recognition dictionary. In the method, when a request processing result for an input request is displayed on a display screen and the result to be displayed on that screen contains a word stored in the recognition dictionary, the display mode of that word is determined in accordance with a predetermined display rule.

A program according to another aspect of the present invention causes a computer to function as display control means for, when a request processing result for an input request is displayed on a display screen and the result to be displayed on that screen contains a word stored in a recognition dictionary used for speech recognition processing, determining the display mode of that word in accordance with a predetermined display rule.

According to the present invention, a user can be notified of speech-recognizable words without impairing the information-conveying function or the design of the display screen.

Preferred embodiments of the speech recognition device, display control device, recorder device, speech recognition method, and program according to the present invention are described in detail below with reference to the accompanying drawings.

(First embodiment)
FIG. 1 is a block diagram showing the configuration of a speech recognition device according to the first embodiment of the present invention. As shown in the figure, the speech recognition device 10 includes a voice input unit 100, a speech recognition unit 110, a request processing unit 120, a display control unit 130, a display unit 140, an acoustic dictionary 150, a recognition dictionary 160, a rule application word table 170, a display rule storage unit 180, and a request processing information storage unit 190.

The speech recognition device 10 of this embodiment thus includes the request processing unit 120 and the request processing information storage unit 190, which accept a request from the user and perform processing according to that request. However, the request processing unit 120 and the request processing information storage unit 190 need not be built into the speech recognition device as a single unit; they may be provided separately.

The voice input unit 100 has a microphone or the like. It receives sound such as speech uttered by the user of the speech recognition device 10, converts the received sound into an acoustic signal that the speech recognition unit 110 can process, and outputs the signal.

The speech recognition unit 110 analyzes the acoustic signal supplied from the voice input unit 100 and, referring to the acoustic dictionary 150, outputs to the request processing unit 120, as the recognition result, the word stored in the recognition dictionary 160 that is acoustically most similar to the signal. Thus, when the user utters toward the voice input unit 100 a word (request content, etc.) to be input to the request processing unit 120, the speech recognition unit 110 recognizes it and the word is input as the recognition result. Here, a recognition target word may be not only a single word but also a word string made up of a plurality of words. The speech recognition unit 110 can use any of various known speech recognition methods, such as a method using an HMM (Hidden Markov Model).

The acoustic dictionary 150 stores the acoustic information used by the speech recognition unit 110 as described above. The recognition dictionary 160 stores the words that the speech recognition unit 110 can recognize, together with the readings that the user can utter to have those words recognized. FIG. 2 shows an example of the information stored in the recognition dictionary 160.

As shown in the figure, the recognition dictionary 160 stores the recognition target words "Johotsu" (情報通), "Shin" (しん), "Komon-sama" (黄門様), "Koko Yakyu" (高校野球, high school baseball), and so on, in association with their readings "じょうほうつう", "しん", "こうもんさま", "こうこうやきゅう", and so on. Therefore, when the user says "こうもんさま" (koumon-sama) toward the voice input unit 100, the speech recognition unit 110 recognizes the word "Komon-sama" (黄門様) corresponding to that reading and outputs it as the recognition result.
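The word-to-reading association can be sketched as a simple mapping. This is only an illustration that uses the four FIG. 2 entries quoted in the text; the function name `word_for_reading` is hypothetical, and the exact string comparison is a stand-in for the acoustic matching that an actual recognizer would perform.

```python
# Recognition dictionary sketch: registered word -> reading (entries quoted from FIG. 2).
RECOGNITION_DICTIONARY = {
    "情報通": "じょうほうつう",
    "しん": "しん",
    "黄門様": "こうもんさま",
    "高校野球": "こうこうやきゅう",
}


def word_for_reading(reading, dictionary=RECOGNITION_DICTIONARY):
    """Return the registered word whose reading equals `reading`, or None.

    Assumption: exact string equality replaces acoustic similarity here.
    """
    for word, registered_reading in dictionary.items():
        if registered_reading == reading:
            return word
    return None


print(word_for_reading("こうもんさま"))
```

A reading that is not registered returns `None`, which corresponds to the case in which an utterance has no recognition target.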

Returning to FIG. 1, the request processing unit 120 performs processing according to the user's request, i.e. the recognition result of the speech recognition unit 110 described above, and outputs the processing result to the display control unit 130 so that it can be displayed on the display unit 140. In this embodiment, the request processing unit 120 performs search processing on a word input by the user (a television program name, etc.), and the request processing information storage unit 190 stores the information that the request processing unit 120 needs in order to perform processing according to the request.

FIG. 3 shows an example of the information stored in the request processing information storage unit 190. In the example shown in the figure, the request processing information storage unit 190 stores the program database needed for the search processing performed by the request processing unit 120. With such a program database stored, the request processing unit 120 can search for television programs in response to a program search request from the user.

The program database associates items of program information such as "ID", "program name", "broadcast date and time", "broadcast station", "genre", and "performers". By referring to this program database, the request processing unit 120 can, when the user makes a search request using a performer name (Ryutaro Mine, etc.) or a genre (society/news) as a key, find the programs in which that performer appears or the programs of the requested genre.

In other words, in this embodiment, when the user utters toward the voice input unit 100 a word such as the program name or performer name to be searched for, the word is recognized by speech recognition and supplied to the request processing unit 120. The request processing unit 120 searches the information stored in the request processing information storage unit 190 for information related to the word input through speech recognition, and supplies the constituent elements of a screen for displaying the search result to the display control unit 130.

Returning to FIG. 1, the display control unit 130 generates display data for showing on the display unit 140 the result of the user's request processed by the request processing unit 120, i.e. the search result for the word the user asked to search for, and outputs the data to the display unit 140. The display control unit 130 of this embodiment does not simply display the processing result of the request processing unit 120. When displaying the result, it controls the display content so that the words stored in the recognition dictionary 160, i.e. the speech-recognizable words, are made known to the user.

To perform this display content control, the display control unit 130 has a rule application word determination unit 131 and a display mode determination unit 132. The rule application word determination unit 131 checks whether the processing result supplied from the request processing unit 120 for display on the display unit 140 (a television program search result, etc.) contains any word stored in the recognition dictionary 160 (such as "Johotsu" or "Komon-sama" in FIG. 2). If it does, the unit registers that word in the rule application word table 170 as a word to which the display rule described later applies.

FIG. 4 shows an example of the contents of the rule application word table 170. As shown in the figure, the rule application word table 170 registers each word contained in the processing result of the request processing unit 120 together with an application flag indicating whether the display rule applies to that word. An application flag of "1" indicates that the rule applies, and "0" indicates that it does not. As described above, the words "Komon-sama" (黄門様) and "next page" (次のページ), which are stored in the recognition dictionary 160, are given the application flag "1", and the word "2 items" (２件), which is not stored in the recognition dictionary, is given the application flag "0".
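How the application flags could be assigned is sketched below. The function name `build_rule_application_table` and the use of a plain set for the dictionary are assumptions made for illustration; the three screen words and their flags follow the FIG. 4 example described in the text.

```python
def build_rule_application_table(screen_words, recognition_dictionary):
    """Map each word on the screen to an application flag.

    Flag 1: the word is registered in the recognition dictionary (rule applies).
    Flag 0: the word is not registered (rule does not apply).
    """
    return {
        word: 1 if word in recognition_dictionary else 0
        for word in screen_words
    }


# Registered words assumed for this illustration (the text states that
# "黄門様" and "次のページ" are stored in the recognition dictionary).
registered = {"情報通", "しん", "黄門様", "高校野球", "次のページ"}
table = build_rule_application_table(["2件", "黄門様", "次のページ"], registered)
print(table)
```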

The display mode determination unit 132 refers to the rule application word table 170 registered by the rule application word determination unit 131 as described above. For each word whose application flag is "1", i.e. each word to which it has been decided that the rule applies, it controls the display content so that the word is displayed in the mode specified by the display rule stored in the display rule storage unit 180. That is, words to which the rule does not apply receive no special processing and are displayed according to the display settings of the device, whereas words registered as subject to the rule are displayed according to the display rule stored in the display rule storage unit 180, regardless of the display settings in effect at that time.

FIG. 5 shows an example of the display rules stored in the display rule storage unit 180. As shown in the figure, under these rules a word given the application flag "0", i.e. a word to which the rule does not apply, is displayed with "no change", i.e. according to the display settings of the device, whereas a word given the application flag "1", i.e. a word to which the rule applies, is to be displayed in the mode "Font+2, Bold". The display rule referred to in the claims is a rule applied to words stored in the recognition dictionary 160, so the rule stored in the display rule storage unit 180 for the application flag "1" corresponds to the display rule of the claims. However, another rule to be applied to words with the application flag "0" may also be prepared and stored in the display rule storage unit 180.

Referring to these rules, the display mode determination unit 132 performs control so that a word given the application flag "1" is displayed in a font two sizes larger than the normal setting and in bold, i.e. two sizes larger and bolder than words to which the rule does not apply.
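The effect of the "Font+2, Bold" rule can be sketched as a function from a base style and an application flag to the style actually used. The `Style` class, its field names, and the base font size of 12 are hypothetical; only the rule itself (flag 1: two sizes larger and bold, flag 0: no change) comes from the text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Style:
    font_size: int
    bold: bool = False


def apply_display_rule(base, flag):
    """Return the display style for a word, given the device's base style."""
    if flag == 1:
        # Registered word: font two sizes larger than the base setting, in bold.
        return replace(base, font_size=base.font_size + 2, bold=True)
    # Unregistered word: "no change", the device's display setting is used.
    return base


base = Style(font_size=12)  # assumed device setting, for illustration only
print(apply_display_rule(base, 1))
print(apply_display_rule(base, 0))
```

Because the rule is expressed relative to the base style, registered words stay distinguishable whatever display settings are in effect.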

The display unit 140 has a display screen such as an LCD (Liquid Crystal Display). This screen shows content that corresponds to the processing result of the request processing unit 120 and that has been controlled by the display control unit 130.

The above is the configuration of the speech recognition device 10 according to the first embodiment of the present invention. The operation of the speech recognition device 10 is described below with specific examples. The description takes as an example the case in which the request processing unit 120 processes a television program search request made by voice input from the user, for example a program search request specifying a genre or performer, and the resulting television program search result is displayed on the display unit 140.

First, the user utters toward the voice input unit 100 the information for the desired search, such as the genre, performers, channel, broadcast time, or name of the desired program. For example, the user utters a search condition such as "today's drama", "channel 10", or "programs in which Takeshi Matsudaira appears". The voice input unit 100 converts the speech into an acoustic signal, and the speech recognition unit 110 performs speech recognition processing on the acoustic signal.

When the user says "today's drama" (今日のドラマ), the speech recognition unit 110 recognizes "today's drama" and supplies it to the request processing unit 120 as the search condition phrase. The request processing unit 120 performs search processing according to this search condition phrase. The procedure of this search processing is described with reference to FIG. 6.

As shown in the figure, the request processing unit 120 determines whether the input "today's drama" is acceptable and, at the same time, extracts the words that correspond to search keys or operation commands in the program search processing, together with their attributes (step S101). For example, the request processing unit 120 holds the acceptable word strings in advance in the form of templates and judges acceptability by whether the input matches a template. Here it is assumed that the templates held by the request processing unit 120 include "“date” の “genre”" (that is, "“genre” of “date”").

In this case, "today" (今日) in the recognition result can be identified as a specific value of the date and "drama" (ドラマ) as a specific value of the genre, so "“today” の “drama”" matches a held template. "Today's drama" is therefore judged to be an acceptable word string, and the attribute values "today" and "drama" are extracted.
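The template check in step S101 can be sketched as follows. This is a minimal illustration, not the matching procedure of the embodiment: the names `ATTRIBUTE_VALUES`, `TEMPLATES`, and `extract_attributes` are hypothetical, and the vocabulary holds only values quoted in the text ("今日", "ドラマ", "社会・報道").

```python
# Known attribute values (illustrative vocabulary taken from the text).
ATTRIBUTE_VALUES = {
    "date": ["今日"],
    "genre": ["ドラマ", "社会・報道"],
}

# One template: an attribute slot, the literal particle "の", another slot.
TEMPLATES = [("date", "の", "genre")]


def _match(text, template):
    """Match `text` against `template`; return extracted attributes or None."""
    if not template:
        return {} if text == "" else None
    head, rest = template[0], template[1:]
    if head in ATTRIBUTE_VALUES:  # attribute slot
        for value in ATTRIBUTE_VALUES[head]:
            if text.startswith(value):
                result = _match(text[len(value):], rest)
                if result is not None:
                    return {head: value, **result}
        return None
    if text.startswith(head):  # literal text
        return _match(text[len(head):], rest)
    return None


def extract_attributes(utterance):
    """Return the attributes if the utterance matches a template, else None."""
    for template in TEMPLATES:
        attributes = _match(utterance, template)
        if attributes is not None:
            return attributes
    return None


print(extract_attributes("今日のドラマ"))
```

An utterance that matches no template returns `None`, which corresponds to the "not acceptable" judgment.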

After extracting the attributes as described above, the request processing unit 120 creates a search query from the extracted attribute values and searches the program database stored in the request processing information storage unit 190 (see FIG. 3) for programs that satisfy the conditions (step S102). For example, if the request is made on August 20, 2003, programs such as "Shin" (しん) with ID "02" and "Komon-sama" (黄門様) with ID "03" are found.
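The search in step S102 can be sketched as a filter over program records. FIG. 3 is not reproduced in the text, so the rows below are an assumption: IDs "02" and "03" and their titles come from the text, their date and genre are inferred from the search example, and the row with ID "01" is purely hypothetical. The function name `search_programs` is also hypothetical.

```python
from datetime import date

PROGRAMS = [
    # Hypothetical row, added only so that the filter has something to exclude.
    {"id": "01", "title": "情報通", "date": date(2003, 8, 20), "genre": "社会・報道"},
    # Rows named in the text; date and genre inferred from the search example.
    {"id": "02", "title": "しん", "date": date(2003, 8, 20), "genre": "ドラマ"},
    {"id": "03", "title": "黄門様", "date": date(2003, 8, 20), "genre": "ドラマ"},
]


def search_programs(attributes, today, programs=PROGRAMS):
    """Return the programs that satisfy the extracted attribute values."""
    # Only "今日" (today) is resolved to a concrete date in this sketch.
    wanted_date = today if attributes.get("date") == "今日" else None
    wanted_genre = attributes.get("genre")
    return [
        program
        for program in programs
        if (wanted_date is None or program["date"] == wanted_date)
        and (wanted_genre is None or program["genre"] == wanted_genre)
    ]


hits = search_programs({"date": "今日", "genre": "ドラマ"}, today=date(2003, 8, 20))
print([program["title"] for program in hits])
```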

After performing the program search as described above, the request processing unit 120 generates the content for displaying the search result on the display unit 140, that is, the components of the display screen (step S103). The request processing unit 120 generates these components using a display screen component template that it holds in advance. FIG. 7 shows an example of such a template.

The request processing unit 120 generates the components of the display screen by fitting the search results and related values into the slots written as [] in the template shown in FIG. 7. When the programs "しん" and "黄門様" are retrieved as described above, the display screen components shown in FIG. 8 are generated.

The above is the search processing by the request processing unit 120 and the processing that generates the components for displaying its result. The components of the search result display screen generated in this way are supplied to the display control unit 130. The display control unit 130 does not display the words contained in these components as they are; instead, when the search result is displayed, it controls the display content so that the user is informed of which words can be recognized by speech.

The processing of the rule application word determination unit 131 of the display control unit 130, which performs this control, is described taking as an example the case where the search result display screen components shown in FIG. 8 are supplied. As shown in FIG. 9, the rule application word determination unit 131 acquires the first word ("２件", meaning "2 results") from the components of the search result display screen (step S201).

Various known methods can be used to cut words out of the display screen components, such as treating each whitespace-delimited character string as one word, or morphological analysis. Once the first word has been cut out, the acquired word ("２件") is registered in the rule application word table 170 and the application flag corresponding to that word is initialized (flag = "0") (step S202).

Next, the first word ("情報通") stored in the recognition dictionary 160 (see FIG. 2) is acquired (step S203). The word acquired from the display screen components, that is, a word contained in the content to be displayed as the search result, is then compared with the word acquired from the recognition dictionary 160 to determine whether the two match (step S204).

If the two do not match (in the above example the acquired words are "２件" and "情報通", so they do not match), it is determined whether the word acquired from the display screen components has been compared with all the words stored in the recognition dictionary 160 (step S205). If it has not yet been compared with all of them, the next word is acquired from the recognition dictionary 160 (step S206), and the process returns to step S204 to determine whether the word acquired from the recognition dictionary 160 matches the word acquired from the display screen components. In other words, it is determined whether a word contained in the content to be displayed as the search result is a word stored in the recognition dictionary 160.

On the other hand, if it is determined in step S204 that the two words match, the application flag of that word in the rule application word table 170 is set to "1" (step S207), registering the word as a rule application word. It is then determined whether all the words contained in the display screen components have been compared with the words stored in the recognition dictionary 160 (step S208). If not all words have been processed, the next word is acquired from the display screen components (step S209), and the process returns to step S202, where the word is registered in the rule application word table 170 and its application flag is initialized.

If it is determined in step S205 that a given word contained in the display screen components has been compared with all the words in the recognition dictionary 160, the flag-setting process (step S207) is skipped and the process proceeds to step S208. That is, the application flag for that word remains "0".

If it is determined in step S208 that the comparison processing has been completed for all the words contained in the display screen components, the registration processing for the rule application word table 170 ends. That is, the processing ends once every word contained in the components of the search result display screen has been checked for a match against the words in the recognition dictionary 160.

Through the above processing, whether a rule should be applied is determined for every word in the content that displays the search result of the request processing unit 120, that is, for every word contained in the display screen components, and for each word to which a rule applies the flag "1" is registered in the rule application word table 170.
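The registration loop of steps S201 to S209 can be sketched as below. This is a hedged illustration rather than the specification's implementation: the table is modeled as a list of dictionaries, `build_rule_table` is a hypothetical name, and the words are assumed to have been cut out of the screen components already.

```python
def build_rule_table(screen_words, recognition_dictionary):
    """Register each screen word with an application flag (1 = in the dictionary)."""
    table = []
    for word in screen_words:                     # S201 / S209: next screen word
        entry = {"word": word, "flag": 0}         # S202: register, flag initialized to 0
        for dict_word in recognition_dictionary:  # S203 / S206: next dictionary word
            if word == dict_word:                 # S204: do the two words match?
                entry["flag"] = 1                 # S207: mark as rule application word
                break
        # Falling out of the inner loop without a match corresponds to S205 "all
        # dictionary words compared"; the flag then stays 0.
        table.append(entry)
    return table                                  # S208: all screen words processed
```

In practice the inner loop could be replaced by a set membership test; the nested form is kept here only because it follows the flowchart step by step.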

When the registration processing for the rule application word table 170 is complete, the display mode determination unit 132 refers to the table and determines the display mode of each word in the display screen components supplied from the request processing unit 120. The details of this processing are described with reference to FIG. 10.

As shown in FIG. 10, the first word registered in the rule application word table 170 is acquired (step S301). Then, the display rules stored in the display rule storage unit 180 (see FIG. 5) are consulted, and the rule corresponding to the application flag of the acquired word is extracted (step S302). That is, when the application flag is "1", the display rule for flag "1", namely the rule for words registered in the recognition dictionary 160, is extracted, and the display is modified according to that rule. Specifically, the display form is modified so that the word is shown in a font two sizes larger and in bold (step S303).

On the other hand, when the application flag corresponding to the acquired word is "0", the extracted rule is "no change", meaning that no special display rule applies and no special modification is made. In this embodiment "no change" is stored as a rule, but since application flag "0" simply means that no display rule is applied, this information need not be stored.

Once the modification according to the display rule has been made (which may be no modification at all), it is determined whether this processing has been performed for all the words registered in the rule application word table 170 (step S304). If processing has not been completed for all registered words, the next word is acquired from the rule application word table 170 (step S305), and the processing from step S302 onward is performed.

When processing has been completed for all registered words, the display mode determination processing ends. As a result, among the words contained in the content that displays the search result of the request processing unit 120, that is, in the display screen components, the words stored in the recognition dictionary 160 have their display mode modified according to the display rule (a font two sizes larger than when no rule applies, and bold).
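The display mode determination of steps S301 to S305 can be sketched as a lookup from application flag to display rule. The sketch makes several assumptions that are not in the text: "two sizes larger" is modeled as adding 2 to a numeric size, the base size of 12 is arbitrary, and `DISPLAY_RULES` and `decide_display_styles` are hypothetical names.

```python
# Hypothetical encoding of the display rules of FIG. 5: flag -> modification.
DISPLAY_RULES = {
    0: {"size_delta": 0, "bold": False},  # "no change"
    1: {"size_delta": 2, "bold": True},   # two sizes larger and bold
}


def decide_display_styles(rule_table, base_size=12):
    """Return one style record per registered word, following S301 to S305."""
    styled = []
    for entry in rule_table:                 # S301 / S305: next registered word
        rule = DISPLAY_RULES[entry["flag"]]  # S302: rule for this application flag
        styled.append({                      # S303: apply the modification
            "word": entry["word"],
            "size": base_size + rule["size_delta"],
            "bold": rule["bold"],
        })
    return styled                            # S304: all registered words processed
```

Keeping an explicit entry for flag 0 matches the embodiment, which stores "no change" as a rule; the entry could equally be omitted and treated as a default.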

As described above, the display control unit 130 controls the display content so that, among the words contained in the components of the search result display screen, the words that can be recognized by speech are displayed in a mode (font size, typeface) different from the other words. The display screen with the content controlled by the display control unit 130 is then displayed on the display unit 140. FIG. 11 shows an example of the display content of the display unit 140. As shown in FIG. 11, among the words on the TV program search result display screen, the words stored in the recognition dictionary 160, that is, the words that can be recognized by speech (broadcast station names such as "ＮＢＳ" and "ＳＢＳ", genre names such as "ドラマ", and program names such as "しん" and "黄門様"), are highlighted in bold and in a font two sizes larger.

As described above, in this embodiment, when the user makes a processing request, the request processing unit 120 processes it, and the processing result (in the above example, the TV program search result) is displayed on the display unit 140. Among the words contained in this display screen, the words that the speech recognition device 10 can recognize can be displayed in a mode different from the other words (see FIG. 11). A user who looks at the display can thus learn both the result of the request and which words can be recognized by speech.

Consequently, there is no need to display a special list or to bring up an auxiliary screen in response to a help utterance in order to inform the user of the recognizable words, and the user can be informed of them without hindering the presentation of the information that is actually needed. Moreover, since no special screen for announcing recognizable words is required, the screen design is kept from being significantly impaired.

In particular, users who are unfamiliar with speech recognition, or who have hardly ever used a device equipped with it, are often confused because they cannot tell from the screen what to say to make the device operate. In this embodiment, recognizable words are presented to the user in a mode different from the others (highlighted). Furthermore, whenever any display screen is shown, the recognizable words among the words it contains are highlighted; that is, recognizable words are highlighted consistently while the speech recognition device is in use, which reduces wasted effort such as re-reading the manual to look up recognizable words. In addition, significant benefits can be expected, such as a high likelihood of reducing malfunctions caused by the input of unknown words.

(Second Embodiment)
Next, a second embodiment of the present invention is described. FIG. 12 is a block diagram showing the configuration of a speech recognition device 20 according to the second embodiment. As shown in FIG. 12, in addition to the configuration of the speech recognition device 10 of the first embodiment, the speech recognition device 20 includes a reading frequency management unit 210, a display reading determination unit 220, and a reading history storage unit 230. It also differs from the first embodiment in that it has a recognition dictionary 260 in place of the recognition dictionary 160 and a rule application word table 270 in place of the rule application word table 170. In the second embodiment, components shared with the first embodiment are given the same reference numerals, and their description is omitted.

In the recognition dictionary 260 of this embodiment, a plurality of readings are associated with one recognition target word. FIG. 13 shows an example of the data stored in the recognition dictionary 260. As shown in FIG. 13, three readings, "げつようひすてりーげきじょう" (getsuyo hisuteri gekijo), "こくはつべんごにんしりーず" (kokuhatsu bengonin shirizu), and "いのくまふみあきふぁいぶ" (inokuma fumiaki faibu), are associated with the single recognition target word "月曜ヒステリー劇場「告発弁護人シリーズ・猪熊文明５」". By referring to the recognition dictionary 260, the speech recognition unit 110 can recognize the recognition target word "月曜ヒステリー劇場「告発弁護人シリーズ・猪熊文明５」" whichever of the three readings the user utters.

The result recognized by the speech recognition unit 110 is supplied to the reading frequency management unit 210. For each combination of a word contained in the recognition result (a recognition target word in FIG. 2) and the reading the user uttered when it was recognized, the reading frequency management unit 210 updates the reading history storage unit 230 by incrementing the count for that combination. In this embodiment a plurality of readings are associated with one recognition target word as described above, so managing the frequency with which each combination occurs makes it possible to know which reading the user has used most often to have a given recognition target word recognized.

The reading history storage unit 230 stores the frequency with which each combination of recognition target word and reading, as managed by the reading frequency management unit 210, has occurred. FIG. 14 shows an example of the content stored in the reading history storage unit 230. As shown in FIG. 14, in addition to the combination of recognition target word and reading and its frequency, the reading history storage unit 230 stores an associated item called the reading-corresponding character string. The reading-corresponding character string is the portion of the word's character string that corresponds to the reading.

When the speech recognition unit 110 performs speech recognition, the reading frequency management unit 210 updates the frequencies in the reading history storage unit 230 according to what was recognized. For example, in the stored state shown in FIG. 14, if the user utters "にほんまるみえ" and the recognition target word "日本まる見え！テレビ特派員" is recognized in response, the frequency corresponding to the reading "にほんまるみえ" is updated from "5" to "6".

The display reading determination unit 220 refers to the content stored in the reading history storage unit 230 and, for each word registered in the rule application word table 270 that has a plurality of readings, writes a rule application target character string chosen according to a predetermined criterion. The recognition target words and readings in the rule application word table 270 are registered by the rule application word determination unit 131 by the same procedure as in the first embodiment.

FIG. 15 shows an example of the registered content of the rule application word table 270 in this embodiment. As shown in FIG. 15, the rule application word table 270 of the second embodiment adds an item called the rule target character string to the table content of the first embodiment (see FIG. 4), and the display reading determination unit 220 writes into this item a rule target character string determined according to a predetermined criterion.

In this embodiment, for a word registered in the rule application word table 270 that has a plurality of readings (for example, "日本まる見え！テレビ特派員"), the reading history storage unit 230 is consulted, and the character string ("日本まる見え") corresponding to the most frequent reading ("にほんまるみえ") is written into the rule target character string.

The display mode determination unit 132 of the second embodiment refers to the rule application word table 270 as rewritten above and, as in the first embodiment, determines the display mode based on the display rules stored in the display rule storage unit 180.

As in the first embodiment, the request processing information storage unit 190 of the second embodiment stores a program database for searching TV programs; FIG. 16 shows an example of its content. As shown in FIG. 16, this program database contains, as in the first embodiment, information on TV program items such as ID, program name, broadcast date and time, broadcast station, genre, and performers, and it also contains recognition target words that have a plurality of readings, such as "月曜ヒステリー劇場「告発弁護人シリーズ・猪熊文明５」".

The above is the configuration of the speech recognition device 20 according to the second embodiment. Its operation is described below using a specific example, in which the request processing unit 120 processes a TV program search request made by the user's voice input, for example a program search request specifying a genre or a performer, and displays the resulting TV program search result on the display unit 140.

First, the user speaks the information for the desired search toward the voice input unit 100, that is, information such as the genre, performers, channel, broadcast time, or program name of the desired program. Here, it is assumed that the user utters "げつようひすてりーげきじょう" in order to have the recognition target word "月曜ヒステリー劇場告発弁護人シリーズ・猪熊文明５", which has a plurality of readings, recognized.

In this case, the speech recognition unit 110 recognizes "月曜ヒステリー劇場告発弁護人シリーズ・猪熊文明５" and supplies it to the request processing unit 120 as the search condition text. The request processing unit 120 performs search processing according to this search condition text.

When the speech recognition unit 110 performs speech recognition as described above, the reading frequency management unit 210 updates the content of the reading history storage unit 230 according to the recognition result. This processing of the reading frequency management unit 210 is described with reference to FIG. 17.

First, the recognition result for the first word from the speech recognition unit 110 is acquired, that is, the recognition target word and the reading uttered when it was recognized (step S401). Here, the recognition target word "月曜ヒステリー劇場告発弁護人シリーズ・猪熊文明５" and the reading used, "げつようひすてりーげきじょう", are acquired.

Next, information indicating how many times the acquired combination of recognition target word and reading, that is, the combination of "月曜ヒステリー劇場告発弁護人シリーズ・猪熊文明５" and "げつようひすてりーげきじょう", has occurred previously is acquired from the reading history storage unit 230 (see FIG. 14) (step S402).

Then, 1 is added to the acquired frequency, the frequency corresponding to the combination is overwritten with the new value, and the content of the reading history storage unit 230 is thereby updated (step S403). In the state shown in FIG. 14, the frequency "1" corresponding to the combination of the recognition target word and the reading "げつようひすてりーげきじょう" is acquired, 1 is added to it, and the frequency is updated to "2".

After this update, it is determined whether the above processing has been performed for all the words contained in the recognition result of the speech recognition unit 110 (step S404); if all words have been processed, the processing ends. If not, the next combination of recognition target word and reading contained in the recognition result is acquired (step S405), and the processing from step S402 onward is performed for that combination. In the above example the user uttered only "げつようひすてりーげきじょう", so there is a single recognition target word, the determination in step S404 is "Yes", and the processing of the reading frequency management unit 210 ends.
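The frequency update of steps S401 to S405 can be sketched as follows, assuming the reading history is held as a dictionary keyed by (recognition target word, reading). That data layout and the name `update_reading_history` are assumptions made for illustration; the counts in the usage note are the ones quoted in the text for FIG. 14.

```python
def update_reading_history(history, recognition_result):
    """Increment the count of each (word, reading) pair in the recognition result."""
    for word, reading in recognition_result:     # S401 / S405: next (word, reading)
        count = history.get((word, reading), 0)  # S402: how often has this pair occurred?
        history[(word, reading)] = count + 1     # S403: add 1 and overwrite
    return history                               # S404: every recognized word processed


# Usage with the counts quoted in the text for FIG. 14.
TITLE = "月曜ヒステリー劇場告発弁護人シリーズ・猪熊文明５"
history = {
    ("日本まる見え！テレビ特派員", "にほんまるみえ"): 5,
    (TITLE, "げつようひすてりーげきじょう"): 1,
}
update_reading_history(history, [(TITLE, "げつようひすてりーげきじょう")])
```

Treating an unseen pair as having count 0 is a choice made in this sketch; the text only describes the case where the pair is already stored.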

The recognition result of the speech recognition unit 110 is also supplied to the request processing unit 120, which, as in the first embodiment, outputs to the display control unit 130 the components for displaying the result of processing the request contained in the recognition result. On receiving them, the rule application word determination unit 131 registers the words contained in the display screen components in the rule application word table 270 together with their application flags, as in the first embodiment.

In the second embodiment, unlike the first, the display reading determination unit 220 writes a rule target character string for each word registered in the rule application word table 270 as described above. This processing is described with reference to FIG. 18, taking as an example the case where the request processing unit 120 has created the components for displaying a processing result as shown in FIG. 19 and, as a result, the rule application word table 270 has been populated as shown in FIG. 15.

First, the display reading determination unit 220 acquires "２件", the first word stored in the rule application word table 270 (step S501). Then the whole word ("２件") is provisionally registered as it is in the rule application target character string item corresponding to the acquired word in the rule application word table 270 (step S502).

Next, the first recognition target word stored in the reading history storage unit 230 ("日本まる見え！テレビ特派員" in the example of FIG. 14) is acquired (step S503) and compared with the word acquired from the rule application word table 270 ("２件") to determine whether the two match (step S504).

If the two do not match, as with "２件" and "日本まる見え！テレビ特派員", it is determined whether the comparison has been made against all the words stored in the reading history storage unit 230 (step S505). If not, the next word stored in the reading history storage unit 230 is acquired (step S506), and the processing from step S504 onward is performed.

On the other hand, if the two words match in step S504, the readings stored for that word in the reading history storage unit 230 and the frequency of each reading (and its reading-corresponding character string) are consulted, the reading-corresponding character string of the reading with the highest frequency is acquired, and it is stored in the rule application target character string of the rule application word table 270 (step S507). In the above example, among the readings of the word "日本まる見え！テレビ特派員", the reading-corresponding character string "日本まる見え" of the reading "にほんまるみえ", which has the highest frequency (5 times), is acquired and stored in the rule application target character string corresponding to the word "日本まる見え！テレビ特派員" in the rule application word table 270.

Once the character string for the most frequent reading has been stored as the rule application target character string in this way, or once step S505 finds that the word has been compared against all words in the reading history storage unit 230, it is determined whether every word registered in the rule application word table 270 has undergone this processing (step S508). If all words have been processed, the processing ends. Otherwise, the next word is acquired from the rule application word table 270 (step S509), and the processing from step S502 onward is repeated.

In this way, processing such as writing the rule application target character string is performed for every word registered in the rule application word table 270. FIG. 20 shows an example of the contents of the rule application word table 270 after processing by the display reading determination unit 220. As shown in the figure, for words such as "２件" that are not registered in the recognition dictionary 260, the rule application target character string remains the provisionally registered "２件". For the word "日本まる見え！テレビ特派員", which is registered in the recognition dictionary 260 with a plurality of readings, the character string "日本まる見え", corresponding to the most frequently used of the recognizable readings, is registered as the rule application target character string. "スーパーテレビ・情報最前列" also has a plurality of readings, but the illustrated example shows the case where its most frequent reading is "すーぱーてれびじょうほうさいぜんれつ", which corresponds to the whole word, so the character string for that reading, namely the entire word, is registered.
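The selection performed in steps S503 to S509 can be sketched in Python roughly as follows. This is a minimal illustration, not the patented implementation: the `Reading` class, the dictionary layouts, and the frequency of 2 for the full-title reading are assumptions made for the example (only the frequency of 5 for "にほんまるみえ" comes from the text), and a dictionary lookup stands in for the word-by-word comparison of steps S504 to S506.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    reading: str  # kana reading accepted by the recognizer
    surface: str  # reading-corresponding character string
    count: int    # how many times the user has used this reading


def fill_rule_targets(rule_table, reading_history):
    """Store, for each word in the rule application word table, the surface
    string of its most frequently used reading as the rule application
    target string. Words without a history entry keep their provisional value."""
    for word, row in rule_table.items():        # S502 / S509: iterate the table
        readings = reading_history.get(word)    # stands in for S503 to S506
        if not readings:
            continue                            # e.g. "２件": target unchanged
        best = max(readings, key=lambda r: r.count)
        row["target"] = best.surface            # S507
    return rule_table


# Hypothetical data for illustration.
history = {
    "日本まる見え！テレビ特派員": [
        Reading("にほんまるみえてれびとくはいん", "日本まる見え！テレビ特派員", 2),
        Reading("にほんまるみえ", "日本まる見え", 5),
    ],
}
table = {
    "２件": {"flag": 0, "target": "２件"},
    "日本まる見え！テレビ特派員": {"flag": 1, "target": "日本まる見え！テレビ特派員"},
}
fill_rule_targets(table, history)
```

After the call, the target string for the program title is the partial string "日本まる見え", while "２件" keeps its provisional value.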

Once the words, rule application target character strings, and application flags have been registered in the rule application word table 270 as described above, the display mode determination unit 132 refers to the table and determines the display mode of each word in the display screen components supplied from the request processing unit 120. This processing is described in detail with reference to FIG. 21.

The description here assumes that the display rule storage unit 180 stores rules such as those shown in FIG. 22. That is, in addition to the display rule applied to character strings whose application flag is "1", the display rule storage unit 180 stores a display rule to be applied to character strings whose application flag is "0". The display mode determination unit 132 determines the display mode according to the rules stored in the display rule storage unit 180.

The processing performed by the display mode determination unit 132 is basically the same as in the first embodiment (see FIG. 10): steps S601 to S602 correspond to steps S301 to S302, and steps S604 to S605 correspond to steps S304 to S305. Only the processing of step S603 differs from the first embodiment.

That is, the processing is the same up to the point where a word registered in the rule application word table 270 is acquired (steps S601 and S605) and the display rule corresponding to the application flag of that word is extracted (step S602). What differs is the character string to which the extracted display rule is applied. In the first embodiment, the display mode was determined according to the display rule for the whole of the word. In this embodiment, the display mode is determined and modified according to the display rule for the rule application target character string stored in the rule application word table 270 (step S603).

For example, as shown in FIG. 23, when the word taken from the rule application word table 270 is "日本まる見え！テレビ特派員", the application flag is "1" and the rule application target character string is "日本まる見え！". The display rule corresponding to application flag "1", that is, the rule applied to words registered in the recognition dictionary 260, is therefore applied, and only the "日本まる見え！" portion has its font size increased by two and is set in bold. When the word taken out is "２件", the application flag is "0" and the rule application target character string is the whole word "２件". The display rule for words registered in the recognition dictionary 260 is therefore not applied; the rule for all other words is applied instead, and the word is displayed with its font size reduced by two.

When the acquired word is "スーパーテレビ・情報最前列", the application flag is "1", so the display rule for words registered in the recognition dictionary 260 is applied. Because the rule application target character string is the whole word "スーパーテレビ・情報最前列", the result is the same as in the first embodiment: the entire word is displayed with its font size increased by two and in bold.
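The way step S603 restricts the display rule to the rule application target character string can be sketched as below. This is an illustrative Python sketch under stated assumptions: the function name, the `(text, size_delta, bold)` segment tuples, and the fallback when the target string is not found inside the word are hypothetical. The rule values (two sizes larger and bold for flag "1", two sizes smaller for flag "0") follow the example described for FIG. 22 and FIG. 23.

```python
def style_segments(word, target, flag):
    """Split a word into (text, size_delta, bold) segments.

    flag == 0: the word is not in the recognition dictionary, so the whole
               word is shown two sizes smaller.
    flag == 1: only the rule application target string is emphasized
               (two sizes larger, bold); the rest keeps the normal style.
    """
    if flag == 0:
        return [(word, -2, False)]
    start = word.find(target)
    if start < 0:
        # Assumption: leave the word unchanged if the target is not a substring.
        return [(word, 0, False)]
    end = start + len(target)
    segments = [
        (word[:start], 0, False),
        (target, 2, True),
        (word[end:], 0, False),
    ]
    return [seg for seg in segments if seg[0]]  # drop empty pieces


partial = style_segments("日本まる見え！テレビ特派員", "日本まる見え！", 1)
whole = style_segments("スーパーテレビ・情報最前列", "スーパーテレビ・情報最前列", 1)
unregistered = style_segments("２件", "２件", 0)
```

With these inputs, `partial` emphasizes only the leading "日本まる見え！" portion, `whole` emphasizes the entire title, and `unregistered` shrinks "２件".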

As described above, in this embodiment, when the user makes a processing request, the request processing unit 120 processes the request and the result (in the example above, the television program search result) is displayed on the display unit 140. Among the words included in such a display screen, for each word that the speech recognition device 20 can recognize, the character string that must be spoken for the word to be recognized is displayed in a mode different from the other characters (see FIG. 23). A user looking at the display can thus learn the result of the request, which words can be recognized by speech, and how to read each such word so that it is recognized.

Therefore, as in the first embodiment, the user can be notified of the recognizable words without hindering the provision of the information that is actually needed and without seriously impairing the screen design, and can additionally be notified of the reading needed to have each word recognized.

Furthermore, in this embodiment, the reading presented for a recognizable word is the one the user has used most often: the character string corresponding to that reading is displayed in a mode different from the rest, so the user is shown the reading that is likely to be easiest to use. In other words, when readings differ from user to user, emphasizing only the necessary portion rather than the whole word distinguishes recognizable words from words outside the recognition targets more clearly. In addition, when a word is long, as program names often are, uttering all of it is cumbersome, especially if the long word must be uttered repeatedly. Allowing the whole word to be recognized from an utterance of a partial character string, and notifying the user of that partial reading, as in this embodiment, is therefore particularly suitable when a long word must be recognized many times.

(Third Embodiment)
Next, a third embodiment of the present invention will be described. FIG. 24 is a block diagram showing the configuration of a speech recognition device 30 according to the third embodiment. As shown in the figure, the speech recognition device 30 includes, in addition to the configuration of the speech recognition device 10 of the first embodiment, a word importance determination unit 310, a word importance storage unit 320, and an importance determination rule storage unit 330, and it includes a display rule storage unit 380 in place of the display rule storage unit 180 of the first embodiment. Components of the third embodiment that are common to the first embodiment are given the same reference numerals, and their description is omitted.

The word importance determination unit 310 is supplied with the processing result produced by the request processing unit 120 in response to a request from the user. In accordance with the importance determination rules stored in the importance determination rule storage unit 330, the word importance determination unit 310 determines the importance of the words included in the components of the screen that displays the processing result, and stores it in the word importance storage unit 320. Where necessary for this determination, the word importance determination unit 310 refers to information stored in the request processing information storage unit 190 and to the processing that the request processing unit 120 has performed in the past.

The importance determination rule storage unit 330 stores the rules used to determine word importance as described above. FIG. 25 shows an example of the rules stored in the importance determination rule storage unit 330. As shown in the figure, each importance determination rule consists of an importance value and the corresponding rule content.

The illustrated example shows rules for the case where the request processing unit 120 searches for television programs using a genre or the like as a keyword. One rule sets the importance to 20 if the attribute of the word being evaluated is an attribute that has already been entered. Another sets the importance to 80 if the attribute of the word is the sub-genre attribute and a genre attribute was entered immediately before.

For example, suppose the word being evaluated is the television program search keyword "スポーツ" (sports) and the same word "スポーツ" has already been entered into the request processing unit 120. The same word is unlikely to be entered again, so such a word is given a low importance. Conversely, when a genre name such as "スポーツ" has just been entered as a search keyword, its sub-genres, the genres at the next lower level such as "野球" (baseball) and "サッカー" (soccer), are likely to be entered as search keywords. A word with such an attribute is therefore given a high importance.

As described above, the importance determination rules associate importance values with rule contents, such as word attributes categorized in advance according to how likely the user is to enter them while the request processing unit 120 is processing, taking past input into account. Determining word importance with such rules makes it possible to give high importance to words that are likely to be entered next in the current situation and low importance to words that are unlikely to be entered.

The importance determined according to these rules is stored in the word importance storage unit 320. FIG. 26 shows an example of the contents of the word importance storage unit 320. As shown in the figure, the word importance storage unit 320 stores, in association with each word, the importance that the word importance determination unit 310 determined for it by referring to the importance determination rules.

As in the first embodiment, the display mode determination unit 132 of this embodiment determines the display mode of the words registered in the rule application word table 170 by the rule application word determination unit 131. It does so based on the display rules stored in the display rule storage unit 380 and the word importance stored in the word importance storage unit 320.

FIG. 27 shows an example of the contents stored in the display rule storage unit 380 of this embodiment. As shown in the figure, the display rule storage unit 380 stores four rules. The first rule is used for words that are not registered in the recognition dictionary 160 (application flag "0") and specifies that the font of such words is made two sizes smaller.

The other three rules are display rules applied to words registered in the recognition dictionary 160 (application flag "1"), one for each of three importance ranges. If the importance is 50 or more and less than 80, the font is made two sizes larger and bold. If the importance is 80 or more, the font is made four sizes larger and bold. If the importance is less than 50, the rule is "none", meaning the normal display is left unchanged. In this embodiment, the rules are thus set so that the higher the importance, the stronger the emphasis.
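The four rules described for FIG. 27 amount to a lookup keyed on the application flag and the importance range. A minimal Python sketch follows; the function name and the dictionary representation of a rule are assumptions for illustration, while the thresholds (50 and 80) and the font changes are those stated above.

```python
def select_display_rule(flag, importance=0):
    """Return the display rule for a word as a dict with the font size
    change and whether the word is set in bold.

    flag == 0: word not in the recognition dictionary (importance is ignored).
    flag == 1: word in the recognition dictionary; emphasis grows with importance.
    """
    if flag == 0:
        return {"size_delta": -2, "bold": False}
    if importance >= 80:
        return {"size_delta": 4, "bold": True}
    if importance >= 50:
        return {"size_delta": 2, "bold": True}
    return {"size_delta": 0, "bold": False}  # "none": normal display


rule_unregistered = select_display_rule(0)
rule_program = select_display_rule(1, 50)
rule_subgenre = select_display_rule(1, 80)
rule_low = select_display_rule(1, 20)
```

Note that the boundaries are inclusive at the lower end: an importance of exactly 50 falls in the "two sizes larger" range and exactly 80 in the "four sizes larger" range.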

In accordance with the rules stored in the display rule storage unit 380, the display mode determination unit 132 of this embodiment modifies the display mode of the words included in the components of the screen that displays the processing result of the request processing unit 120. The display unit 140 then displays an image showing that processing result, in which the words registered in the recognition dictionary 160 have had their display mode modified according to their importance.

The above is the configuration of the speech recognition device 30 according to the third embodiment. Its operation is described below using a specific example. The example assumes that the request processing unit 120 processes a television program search request made by voice input from the user, for example a request specifying a genre or a performer, and that the resulting television program search result is displayed on the display unit 140.

First, the user speaks the information for the desired search toward the voice input unit 100, that is, information such as the genre, performers, channel, broadcast time, or name of the desired program. Here it is assumed that the user says "きょうのすぽーつ" so that the genre "今日のスポーツ" (today's sports) is recognized as the search keyword.

In this case, the speech recognition unit 110 recognizes "今日のスポーツ" and supplies it to the request processing unit 120 as the search keyword. The request processing unit 120 performs a search according to this keyword and, as a result, creates the components of a screen for displaying the processing result, for example as shown in FIG. 28.

Note that the request processing unit 120 of this embodiment can search using genres at multiple levels as search keywords: a higher-level genre such as "スポーツ" (sports), and lower-level sub-genres belonging to it, such as the sub-genre "野球" (baseball) of the genre "スポーツ". Accordingly, as shown in FIG. 29, the request processing unit 120 holds a template for display screen components that differs from the template of the first embodiment (see FIG. 7), and uses this template to generate the display screen components.

As shown in the figure, this template has, in addition to the items of the template of the first embodiment, an area in which the "sub-genre" is placed. Accordingly, as shown in FIG. 28, the display screen is composed by filling the "genre" item of the template with the higher-level genre "スポーツ" and the "sub-genre" item with its lower-level genres "野球" and "サッカー".

The processing result of the request processing unit 120 is also supplied to the word importance determination unit 310, together with information from the request processing unit 120 on past processing (for example, that the values of the genre attribute and the date attribute have already been fixed in the program search query). The word importance determination unit 310 determines the importance of the words included in the processing result. This processing is described with reference to FIG. 30.

First, the first word is acquired from among the words that are included in the components of the processing result display screen created by the request processing unit 120 and are also registered in the recognition dictionary 160 (step S701). For example, if the recognition dictionary 160 holds the contents shown in FIG. 31 and the display screen is as shown in FIG. 28, "Ｊリーグ鹿島×東京" is acquired.

Next, the acquired word is registered in the word importance storage unit 320 and the importance for that word is initialized (step S702). The first rule content is then acquired from the plurality of rule contents stored in the importance determination rule storage unit 330 (see FIG. 25) (step S703), and it is determined whether the acquired word satisfies the acquired rule content (step S704).

If the acquired word does not satisfy the acquired rule content, it is determined whether all rule contents stored in the importance determination rule storage unit 330 have been checked (step S705). If not, the next rule content stored in the importance determination rule storage unit 330 is acquired (step S706), and it is determined whether the word satisfies that rule content (step S704).

If, on the other hand, the acquired word satisfies the acquired rule content, the importance of the word is set to the importance value associated with that rule content and stored in the word importance storage unit 320 (step S707). For example, if the target word is an attribute value that has already been entered into the request processing unit 120, that is, if the same word has already been entered, it satisfies the rule content associated with importance 20, so its importance is determined to be "20" and the value "20" is stored in the word importance storage unit 320. If the target word is "Ｊリーグ鹿島×東京", its attribute is the program name, not the genre attribute or the date attribute, so it is judged not to satisfy the rule content.

Once the importance of the word has been determined and stored in the word importance storage unit 320 in this way, or once step S705 finds that the word has been checked against all rule contents, it is determined whether the importance determination processing has been completed for every word that is registered in the recognition dictionary 160 and included in the display screen components (step S708).

If processing has been completed for all words, the processing by the word importance determination unit 310 ends. Otherwise, an unprocessed word that is included in the display screen components and registered in the recognition dictionary 160 is acquired (step S709), and the processing from step S702 onward is performed for that word. In this way, the importance determination processing is performed for every word that is included in the display screen components and registered in the recognition dictionary 160, and the results are stored in the word importance storage unit 320.
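The flow of steps S701 to S709 can be sketched in Python as below. This is a hedged illustration rather than the actual implementation: the rule predicates, the `ctx` context dictionary, the word records, and the attribute names are hypothetical. The initial importance of 50 is also an assumption, chosen because the text later gives "Ｊリーグ鹿島×東京", which satisfies no rule, an importance of 50. The values 20 and 80 follow the two rules described for FIG. 25.

```python
DEFAULT_IMPORTANCE = 50  # assumption: value set by the initialization in S702


def already_entered(word, ctx):
    # Rule content for importance 20: the word's attribute was already entered.
    return word["attribute"] in ctx["entered_attributes"]


def subgenre_after_genre(word, ctx):
    # Rule content for importance 80: sub-genre right after a genre input.
    return word["attribute"] == "subgenre" and ctx["last_attribute"] == "genre"


RULES = [(already_entered, 20), (subgenre_after_genre, 80)]


def determine_importance(words, ctx, rules=RULES):
    """Return {word text: importance} for recognizable words on the screen."""
    store = {}
    for word in words:                           # S701 / S709
        store[word["text"]] = DEFAULT_IMPORTANCE  # S702
        for predicate, value in rules:           # S703, S705, S706
            if predicate(word, ctx):             # S704
                store[word["text"]] = value      # S707
                break
    return store


# Hypothetical context: genre and date attributes are already fixed in the
# query, and the most recent input was a genre.
ctx = {"entered_attributes": {"genre", "date"}, "last_attribute": "genre"}
words = [
    {"text": "スポーツ", "attribute": "genre"},
    {"text": "野球", "attribute": "subgenre"},
    {"text": "Ｊリーグ鹿島×東京", "attribute": "program_name"},
]
importance = determine_importance(words, ctx)
```

As in the flowchart, the first rule content a word satisfies decides its importance, and a word that satisfies none keeps the initialized value.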

Once the word importance has been determined and stored in the word importance storage unit 320 as described above, and the rule application word determination unit 131 has registered the words and application flags in the rule application word table 170 as in the first embodiment, the display mode determination unit 132 determines the display mode of each word in the display screen components supplied from the request processing unit 120. It does so by referring to the rule application word table 170 after registration, the word importance storage unit 320, and the display rule storage unit 380. This processing is described in detail with reference to FIG. 32. The description assumes that the rule application word determination unit 131 has registered words such as those shown in FIG. 33 in the rule application word table 170.

First, the first word registered in the rule application word table 170 is acquired (step S801), and a display rule is extracted from the display rule storage unit 380 (see FIG. 27) based on the application flag of that word and the importance stored for it in the word importance storage unit 320 (step S802). For example, the application flag of the first word "３件" is "0", so the display rule for application flag "0" is extracted. When the acquired word is "Ｊリーグ鹿島×東京", the application flag is "1" and the word importance is "50", so the display rule for application flag "1" and the importance range "50 or more and less than 80" is extracted.

Once the display rule has been extracted, the display mode of the word is modified according to that rule (step S803). When the target word is "３件", the display is modified so that the font is two sizes smaller. When the target word is "Ｊリーグ鹿島×東京", the display mode is modified so that the font is two sizes larger and bold.

After the modification according to the display rule has been made (which may mean no modification at all), it is determined whether the modification processing has been performed for every word registered in the rule application word table 170 (step S804). If not, the next word is acquired from the rule application word table 170 (step S805), and the processing from step S802 onward is performed.

If processing has been completed for all registered words, the display mode determination processing ends. Through this processing, among the words included in the display screen components, that is, in the content that presents the search result of the request processing unit 120, the words stored in the recognition dictionary 160 have their display mode modified according to their importance.

In this way, the display control unit 130 controls the display content so that, among the words included in the components of the search result display screen, the words that can be recognized by speech are displayed in a mode different from the other words, and in a mode (font size, typeface, and so on) that reflects their importance. The display screen controlled by the display control unit 130 is then displayed on the display unit 140. FIG. 34 shows an example of the display content of the display unit 140. As shown in the figure, among the words on the television program search result screen that are stored in the recognition dictionary 160, that is, the words that can be recognized by speech, the high-importance sub-genres such as "野球" and "サッカー" are displayed largest (font four sizes larger), followed by program names such as "Ｊリーグ鹿島×東京" (font two sizes larger).

As described above, in this embodiment, when the user makes a processing request, the request processing unit 120 processes the request and the processing result (in the above example, the TV program search result) is displayed on the display unit 140. Among the words included in such a display screen, the words that the speech recognition apparatus 30 can recognize can be displayed in a mode different from that of other words (see FIG. 34). A user viewing the display can thus learn both the result of the request and which words can be recognized by speech.

Furthermore, in this embodiment, the importance of each word is determined and the word is displayed in a mode that reflects that importance, that is, with more emphasis the higher the importance, so that recognizable words of high importance can be brought to the user's attention more reliably.

In addition, in this embodiment, the importance is set in advance for each of various cases, taking into account the content already input to the apparatus, so that words likely to be input next receive high importance. When a given process is executed, the words most likely to be needed as the next input can therefore be emphasized and brought to the user's attention. Conversely, words unlikely to be input are not emphasized, so the words more likely to be needed stand out to the user. In this embodiment, the likelihood of input during search processing (the level of importance) is judged according to, for example, the attribute of the keyword input immediately before, which allows the importance to be set so that search processing proceeds efficiently.

Taking the program search process above as an example, an attribute that was uttered previously and is already included in the search query can be recognized but does nothing to narrow down the programs, so it need not be highlighted; its importance is therefore low and its display mode is not emphasized. In contrast, a group of programs retrieved by a top-level genre such as sports can be narrowed down effectively by subgenre, so by setting a high importance for the words belonging to the subgenres of that top-level genre, the recognition target words that are efficient for achieving the processing goal are displayed with more emphasis. With such a display, the user can quickly find a recognition target word and search with it, achieving the requested processing goal efficiently.
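
The two importance rules in the program search example (low importance for keywords already in the query, high importance for subgenres of an already-selected top-level genre) can be sketched as below. The `Word` type, the three-level scale, and the function name are hypothetical; the actual importance determination rules of the embodiment are those stored in the importance determination rule storage unit 330.

```python
from typing import NamedTuple, Optional

LOW, NORMAL, HIGH = 0, 1, 2  # assumed three-level scale


class Word(NamedTuple):
    text: str
    attribute: str                 # e.g. "genre", "subgenre", "title"
    parent: Optional[str] = None   # top-level genre of a subgenre word


def decide_importance(word, query_keywords):
    """Return an importance level for a recognizable word on the result screen."""
    if word.text in query_keywords:
        # Already part of the query: saying it again does not narrow the results.
        return LOW
    if word.attribute == "subgenre" and word.parent in query_keywords:
        # Subgenre of the genre just searched: narrows the results effectively.
        return HIGH
    return NORMAL
```

For a query that already contains "sports", the word "sports" itself comes out low while "baseball" (a subgenre of sports) comes out high, matching the behavior described in the paragraph above.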

(Modifications)
The present invention is not limited to the embodiments described above, and various modifications such as those exemplified below are possible.

(Modification 1)
In the first embodiment described above, the display rule storage unit 180 holds only one type of display rule applied to the words registered in the recognition dictionary 160, but a plurality of types of rules applied to the registered words may be prepared. For example, a rule may be provided that displays operation command words and program attribute words in different forms, so that these words have different display modes.

(Modification 2)
In each of the embodiments described above, the display rules specify font size and bolding, but any rule is acceptable as long as the words are displayed in a different mode; for example, italic display or shaded display may be used. Furthermore, any display mode that distinguishes the words from others may be used, such as setting a color scheme or changing the size of the object (a button or the like) on which the recognizable word is displayed.

(Modification 3)
In the second embodiment described above, the display reading determination unit 220 selects the character string whose display mode is made different (that is, whose display mode is determined according to the display rule) on the basis of the reading the user has used most often in the past to have the recognition target word recognized (the reading with the highest frequency). However, the character string may be selected on some other basis.

For example, the character string at the front of the recognition target word may be selected as the reading. Specifically, when the recognition target word is "Monday Hysteria Theater 'Accusation Defense Attorney Series: Fumiaki Inokuma 5'", then of the three readings "getsuyou hisuterii gekijou", "kokuhatsu bengonin shiriizu", and "inokuma fumiaki faibu", the reading "getsuyou hisuterii gekijou", which consists only of the character string at the front, may be selected.

Alternatively, a partial character string rather than the whole word may be selected as the character string to be displayed differently only when the number of characters in the recognition target word exceeds a predetermined number (for example, 10 characters or more). In this case, the rule application target character string may be decided on a basis such as selecting the shortest character string among the plurality of readings of the recognition target word.
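
The selection criteria of Modification 3 can be sketched as a small function. This is only an illustration under stated assumptions: the function name, the `strategy` parameter, and the representation of each reading by the partial character string it covers are hypothetical, and the threshold of 10 is the example value from the text.

```python
def choose_target_string(word, partial_strings, threshold=10, strategy="front"):
    """Pick the character string whose display mode will be changed.

    word            -- the recognition target word as displayed
    partial_strings -- substrings of `word`, one per registered reading
    threshold       -- words shorter than this are emphasized as a whole
    strategy        -- "front": the substring nearest the start of the word
                       "shortest": the substring with the fewest characters
    """
    candidates = [s for s in partial_strings if s in word]
    if len(word) < threshold or not candidates:
        return word  # short word (or nothing usable): emphasize the whole word
    if strategy == "front":
        return min(candidates, key=word.find)
    if strategy == "shortest":
        return min(candidates, key=len)
    raise ValueError(f"unknown strategy: {strategy}")
```

With the program title from the example split into its three readable parts, the "front" strategy returns the leading part and the "shortest" strategy returns the part with the fewest characters.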

(Modification 4)
When a word has a plurality of readings, instead of narrowing them down to a single reading and changing the display mode only for the corresponding character string as in the second embodiment and Modification 3, the word may be displayed in a mode that informs the user of two or more of the readings.

For example, when displaying the word "Monday Hysteria Theater 'Accusation Defense Attorney Series: Fumiaki Inokuma 5'", which can be recognized by three readings as described above, the range of each possible reading may be indicated by a frame 401 above or below the character string of the recognition target word, as shown in FIG. 35. The three readings may also be displayed in different colors. For example, when words not registered in the recognition dictionary 160 are displayed in black, "Monday Hysteria Theater" may be displayed in red, "Accusation Defense Attorney Series" in green, and "Fumiaki Inokuma 5" in yellow.

In the second embodiment, the character string whose display mode is made different is decided based on the frequency of past readings by the user of the apparatus itself, but it may instead be decided using, for example, the frequency of past readings by users of other apparatuses. For example, if, of the three readings "getsuyou hisuterii gekijou", "kokuhatsu bengonin shiriizu", and "inokuma fumiaki faibu", the one used most often across a large number of users is "getsuyou hisuterii gekijou", the corresponding character string may be selected as the one whose display mode is made different. Information on how users of other apparatuses read the word may be supplied to the apparatus via a network such as the Internet.
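
Selecting the reading used most often across many users amounts to a frequency count over collected reading logs. The sketch below assumes the logs have already been gathered (the text only says they may be supplied over a network) and arrive as a flat list of reading strings; the function name is hypothetical.

```python
from collections import Counter


def most_used_reading(reading_logs):
    """Return the reading that appears most often in the logs, or None if empty."""
    counts = Counter(reading_logs)
    if not counts:
        return None
    reading, _count = counts.most_common(1)[0]
    return reading
```

The same function works for the single-user case of the second embodiment if it is given only the local reading history.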

(Modification 5)
In each of the embodiments described above, the readings used to recognize a recognition target word correspond to all or part of that word, but the recognition target word may also be recognized when the user utters an abbreviation of it. In this case, the reading of the abbreviation is added to the reading column of the recognition dictionary 160 (260) in addition to the reading of the whole word. For example, when the recognition target word "Monday Hysteria Theater 'Accusation Defense Attorney Series: Fumiaki Inokuma 5'" is commonly abbreviated as "getsugeki" or the like, "getsugeki" is added to the readings in the recognition dictionary 160 (260).

To inform the user of such a reading, a display rule may be provided that displays the abbreviation "getsugeki", which can be used to have the word recognized, above or below the recognition target word "Monday Hysteria Theater 'Accusation Defense Attorney Series: Fumiaki Inokuma 5'". The display mode determination unit 132 then modifies the display mode of the recognition target word according to this display rule, producing a display such as that shown in FIG. 36.
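
Registering an abbreviation as an additional reading can be pictured with a dictionary that maps each recognition target word to its list of readings. This is a simplified sketch: the data layout and the `find_word` helper are assumptions, and matching here is an exact string comparison standing in for the actual speech recognition processing.

```python
TITLE = "Monday Hysteria Theater 'Accusation Defense Attorney Series: Fumiaki Inokuma 5'"

# Hypothetical layout of the reading column: word -> list of accepted readings.
recognition_dictionary = {
    TITLE: [
        "getsuyou hisuterii gekijou",
        "kokuhatsu bengonin shiriizu",
        "inokuma fumiaki faibu",
        "getsugeki",  # abbreviation added as an extra reading
    ],
}


def find_word(utterance, dictionary):
    """Return the registered word whose readings include the utterance, else None."""
    for word, readings in dictionary.items():
        if utterance in readings:
            return word
    return None
```

Because the abbreviation sits in the same reading list as the full and partial readings, no separate lookup path is needed for it.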

(Modification 6)
In each of the embodiments described above, the request processing unit 120 performs search processing based on a search keyword or the like input by the user, but the present invention can also be applied to an apparatus in which the request processing unit 120 performs other processing in response to a user request. For example, the present invention can be applied to a speech recognition apparatus mounted on an air conditioner.

More specifically, words such as "start operation", "stop operation", "increase airflow", "raise set temperature", and "lower set temperature" are registered in the recognition dictionary 160. The air conditioner performs an action (operation, etc.) in response to the user's request and displays the processing result for the request on a display panel or the like (for example, when the start of operation is requested, a message such as "Operation has started" is displayed). At that time, among the words included in the screen, the recognizable words are displayed in a mode different from that of other words according to the display rule.

When the present invention is applied to an air conditioner or the like as described above and the function of determining word importance is provided as in the third embodiment, the operating state of the air conditioner at that time may be used as one factor in determining word importance. For example, while the air conditioner is running, the word "start operation" is unlikely to be input again, so its importance is set low, whereas words related to airflow adjustment, temperature adjustment, and the like are set to high importance. In this way, the operating state may be reflected in the determination of importance.
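
Reflecting the operating state in the importance can be sketched as below. The word list follows the example in the text; the three-level scale, the function name, and the choice to return a neutral level for every case the text does not mention (including all words while the unit is stopped) are assumptions of this sketch.

```python
LOW, NORMAL, HIGH = 0, 1, 2  # assumed three-level scale

# Words the text groups under airflow and temperature adjustment.
ADJUSTMENT_WORDS = {
    "increase airflow",
    "raise set temperature",
    "lower set temperature",
}


def importance_for_state(word, running):
    """Return an importance level for `word` given whether the unit is running."""
    if running and word == "start operation":
        return LOW   # already running: unlikely to be said again
    if running and word in ADJUSTMENT_WORDS:
        return HIGH  # adjustments are the likely next input while running
    return NORMAL    # cases not covered by the text are left neutral here
```

The returned level can then feed the same importance-dependent display rule used for the program search example.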

(Modification 7)
The speech recognition apparatus according to each of the above embodiments may also be mounted on a recorder device that records moving images such as television programs. FIG. 37 shows a schematic configuration example of a recorder device 400 incorporating the speech recognition apparatus 10 according to the first embodiment. As shown in the figure, the recorder device 400 includes the speech recognition apparatus 10 configured as described above, a recorder unit 410, and a tuner unit 420.

The tuner unit 420 selects a given television channel according to a user instruction or the like and receives the moving images of the selected channel. When a broadcast television program is viewed, the moving images received by the tuner unit 420 are supplied to the display unit 140, and the television program or the like is displayed.

The recorder unit 410 records the moving images (television programs, etc.) received by the tuner unit 420 on a recording medium. For example, rather than recording the received moving images as they are, it compresses them using a compression method such as MPEG (Moving Picture Experts Group)-2 and records the compressed moving image data. The recording medium may be an HD (Hard Disk) or the like, or a portable recording medium such as a DVD-RAM (Digital Versatile Disc-RAM).

In the recorder device 400 configured in this way, the user can search for programs to record. To perform such a search, the user utters words relating to the attributes of the desired program (genre, name, broadcast date and time, performers); the utterance is recognized by the speech recognition apparatus 10, and, as described in the above embodiments, the request processing unit 120 performs a search using the uttered content as a keyword.

The search result is then displayed on the display unit 140. At that time, the display control unit 130 causes the speech-recognizable words to be displayed in a mode different from that of other characters (an emphasized mode), so that when the user is about to input the next search term by voice, the user can see which words can be input. The user has the apparatus search for television programs by uttering search terms, and when the desired program is found, utters an instruction to record it. The request processing unit 120 then sends a recording instruction, including information identifying the retrieved program, to the recorder unit 410, and the recorder unit 410 performs the recording according to the instruction from the request processing unit 120.

(Modification 8)
In each of the embodiments described above, the present invention is applied to a speech recognition apparatus. However, a display control device including the display control unit 130, the rule application word table 170, and the display rule storage unit 180 may be manufactured, sold, and otherwise distributed separately from a speech recognition apparatus including the speech input unit 100, the speech recognition unit 110, the dictionaries, and so on.

(Modification 9)
The display mode control processing performed in each of the embodiments described above may be performed by a dedicated hardware circuit, or may be performed by a CPU operating according to a program. A program for causing a computer to execute such processing may be provided to users via a communication line such as the Internet, or may be recorded on a computer-readable recording medium such as a CD-ROM (Compact Disc-Read Only Memory) and provided to users.

As described above, the speech recognition apparatus, display control device, recorder device, display method, and program according to the present invention are particularly suitable for apparatuses having functions such as performing search processing on keywords input by speech recognition.

FIG. 1 is a block diagram showing the configuration of the speech recognition apparatus according to the first embodiment of the present invention.
FIG. 2 is a diagram showing an example of the registered content of the recognition dictionary, which is a component of the speech recognition apparatus.
FIG. 3 is a diagram showing an example of the program database stored in the request processing information storage unit, which is a component of the speech recognition apparatus.
FIG. 4 is a diagram showing an example of the registered content of the rule application word table, which is a component of the speech recognition apparatus.
FIG. 5 is a diagram showing an example of the display rules stored in the display rule storage unit, which is a component of the speech recognition apparatus.
FIG. 6 is a flowchart showing an example of the processing procedure of the request processing unit, which is a component of the speech recognition apparatus.
FIG. 7 is a diagram showing an example of the template used by the request processing unit when creating a screen for displaying a processing result.
FIG. 8 is a diagram showing an example of a display screen showing the result of the request processing.
FIG. 9 is a flowchart showing the procedure of the processing performed by the rule application word determination unit, which is a component of the speech recognition apparatus.
FIG. 10 is a flowchart showing the procedure of the processing performed by the display mode determination unit, which is a component of the speech recognition apparatus.
FIG. 11 is a diagram showing an example of the display screen modified by the display mode determination unit.
FIG. 12 is a block diagram showing the configuration of the speech recognition apparatus according to the second embodiment of the present invention.
FIG. 13 is a diagram showing an example of the registered content of the recognition dictionary of the speech recognition apparatus in the second embodiment.
FIG. 14 is a diagram showing an example of the stored content of the reading history storage unit of the speech recognition apparatus in the second embodiment.
FIG. 15 is a diagram showing an example of the registered content of the rule application word table in the second embodiment.
FIG. 16 is a diagram showing an example of the program database stored in the request processing information storage unit in the second embodiment.
FIG. 17 is a flowchart showing the procedure of the processing performed by the reading frequency management unit, which is a component of the speech recognition apparatus in the second embodiment.
FIG. 18 is a flowchart showing the procedure of the processing performed by the display reading determination unit, which is a component of the speech recognition apparatus in the second embodiment.
FIG. 19 is a diagram showing an example of the display screen created by the request processing unit of the speech recognition apparatus in the second embodiment.
FIG. 20 is a diagram showing an example of the registered content of the rule application word table in which the rule application target character string has been written by the display reading determination unit.
FIG. 21 is a flowchart showing the procedure of the processing performed by the display mode determination unit in the second embodiment.
FIG. 22 is a diagram showing an example of the display rules stored in the display rule storage unit in the second embodiment.
FIG. 23 is a diagram showing an example of the display screen modified by the display mode determination unit in the second embodiment.
FIG. 24 is a block diagram showing the configuration of the speech recognition apparatus according to the third embodiment of the present invention.
FIG. 25 is a diagram showing an example of the importance determination rules stored in the importance determination rule storage unit, which is a component of the speech recognition apparatus in the third embodiment.
FIG. 26 is a diagram showing an example of the stored content of the word importance storage unit, which is a component of the speech recognition apparatus in the third embodiment.
FIG. 27 is a diagram showing an example of the display rules stored in the display rule storage unit in the third embodiment.
FIG. 28 is a diagram showing an example of the screen for displaying the processing result created by the request processing unit in the third embodiment.
FIG. 29 is a diagram showing an example of the template used by the request processing unit when creating a display screen.
FIG. 30 is a flowchart showing the procedure of the processing performed by the word importance determination unit, which is a component of the speech recognition apparatus in the third embodiment.
FIG. 31 is a diagram showing an example of the registered content of the recognition dictionary in the third embodiment.
FIG. 32 is a flowchart showing the procedure of the processing performed by the display mode determination unit in the third embodiment.
FIG. 33 is a diagram showing an example of the registered content of the rule application word table in the third embodiment.
FIG. 34 is a diagram showing an example of the display screen modified by the display mode determination unit in the third embodiment.
FIG. 35 is a diagram showing an example of the display mode of a word registered in the recognition dictionary as displayed on the display unit in a modification of the second embodiment.
FIG. 36 is a diagram showing an example of the display mode of a word registered in the recognition dictionary as displayed on the display unit in a modification of each embodiment.
FIG. 37 is a block diagram showing the configuration of a recorder device including the speech recognition apparatus according to the first embodiment.

Explanation of symbols

10 Speech recognition apparatus
20 Speech recognition apparatus
30 Speech recognition apparatus
100 Speech input unit
110 Speech recognition unit
120 Request processing unit
130 Display control unit
131 Rule application word determination unit
132 Display mode determination unit
140 Display unit
150 Acoustic dictionary
160 Recognition dictionary
170 Rule application word table
180 Display rule storage unit
190 Request processing information storage unit
210 Reading frequency management unit
220 Display reading determination unit
230 Reading history storage unit
260 Recognition dictionary
270 Rule application word table
310 Word importance determination unit
320 Word importance storage unit
330 Importance determination rule storage unit
380 Display rule storage unit
400 Recorder device
401 Frame
410 Recorder unit
420 Tuner unit

Claims

A speech recognition apparatus comprising:
a recognition dictionary that stores a plurality of words subject to speech recognition and readings of those words;
speech recognition means for performing speech recognition processing on input speech with reference to the recognition dictionary;
display rule storage means for storing a display rule defining a display mode of the words stored in the recognition dictionary; and
display control means for, when a result of request processing for an input request is displayed on a display screen and the display screen includes a word stored in the recognition dictionary, determining a display mode of the included word according to the display rule stored in the display rule storage means.

A speech recognition apparatus comprising:
a recognition dictionary that stores a plurality of words subject to speech recognition and readings of those words;
speech recognition means for performing speech recognition processing on input speech with reference to the recognition dictionary;
display rule storage means for storing a display rule defining a display mode of the words stored in the recognition dictionary;
request processing means for processing a request based on the speech recognized by the speech recognition means; and
display control means for, when a result of processing by the request processing means for an input request is displayed on a display screen and the display screen includes a word stored in the recognition dictionary, determining a display mode of the included word according to the display rule stored in the display rule storage means.

The speech recognition apparatus according to claim 1, further comprising display reading character string determination means for determining, based on a predetermined criterion, a character string for a word for which a plurality of types of readings are stored in the recognition dictionary,
wherein the display control means determines, according to the display rule, a display mode of the character string determined by the display reading character string determination means.

The speech recognition apparatus according to claim 3, further comprising reading frequency management means for recording, for a word for which a plurality of types of readings are stored in the recognition dictionary, the frequency of each reading input when the word is recognized by the speech recognition means,
wherein, for a word for which a plurality of types of readings are stored in the recognition dictionary, the display reading determination means determines the character string corresponding to the reading with the highest frequency recorded by the reading frequency management means as the character string whose display mode is to be determined according to the display rule.

The speech recognition apparatus according to claim 3 or 4, wherein, for a word stored in the recognition dictionary whose number of constituent characters is smaller than a predetermined value, the display reading determination means determines the reading of the whole word as the character string whose display mode is to be determined according to the display rule.

The speech recognition apparatus according to claim 1, further comprising importance determination means for determining the importance of the words stored in the recognition dictionary,
wherein the display control means determines the display mode based on the display rule and the importance determined by the importance determination means.

The speech recognition apparatus according to claim 6, wherein the importance determination means determines the importance of words included in the result of processing by the request processing means for the input request to be displayed on the display screen, based on the content of requests already input.

The speech recognition apparatus according to claim 6, wherein the importance level determination unit determines the importance level based on an operation state of the request processing unit.

A display control device for displaying words that can be recognized by a speech recognition device, the speech recognition device comprising a recognition dictionary for storing a plurality of words to be speech-recognized and how to read these words, and speech recognition means for performing speech recognition processing on input speech by referring to the recognition dictionary, the display control device comprising:
Display control means for controlling the display contents when displaying the request processing result for the input request on the display screen;
A display rule storage means for storing a display rule for defining a display mode of words stored in the recognition dictionary;
Wherein, when a word stored in the recognition dictionary is included in the result to be displayed on the display screen, the display control means determines a display mode for the included word according to the display rule stored in the display rule storage means.
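
The display control described above can be sketched in Python as a pass over the result text that marks every word found in the recognition dictionary. This is an illustration under stated assumptions, not the claimed implementation: the function name `mark_recognizable`, the use of bracket markers as the "display mode", and the longest-match-first policy are all hypothetical.

```python
import re


def mark_recognizable(text, dictionary_words, rule=("[", "]")):
    """Wrap each dictionary word found in `text` with the markers in `rule`.

    `rule` is a hypothetical display rule given as (opening, closing) markers;
    a real display could instead change color, font, or underline.
    """
    if not dictionary_words:
        return text
    # Try longer entries first so a long entry wins over its own prefix.
    ordered = sorted(dictionary_words, key=len, reverse=True)
    pattern = "|".join(re.escape(word) for word in ordered)
    open_mark, close_mark = rule
    return re.sub(pattern, lambda m: f"{open_mark}{m.group(0)}{close_mark}", text)


# Hypothetical search result and dictionary contents.
result_text = "Tonight: News 10 and Sports"
marked = mark_recognizable(result_text, {"News", "News 10", "Sports"})
```

Words outside the dictionary, such as "Tonight" here, are left in their normal style, so the result is still shown while the recognizable words stand out.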

A recorder device that performs processing for storing an image,
A recognition dictionary that stores multiple words that are subject to speech recognition and how to read these words;
Speech recognition means for performing speech recognition processing with reference to the recognition dictionary for input speech;
Request processing means for processing a request based on the voice recognized by the voice recognition means;
Display control means for controlling the display contents when the result of the request processing means for the request is displayed on a display screen;
A display rule storage means for storing a display rule for defining a display mode of words stored in the recognition dictionary;
Wherein, when a word stored in the recognition dictionary is included in the result to be displayed on the display screen, the display control means determines a display mode for the included word according to the display rule stored in the display rule storage means.

A display method for displaying words that can be recognized by a speech recognition device, the speech recognition device comprising a recognition dictionary for storing a plurality of words to be speech-recognized and how to read these words, and speech recognition means for performing speech recognition processing on input speech by referring to the recognition dictionary,
Wherein, when the request processing result for an input request is displayed on the display screen and a word stored in the recognition dictionary is included in the result to be displayed, a display mode is determined for the included word according to a predetermined display rule.

A program that causes a computer to function as display control means for determining, when the request processing result for an input request is displayed on the display screen and the result to be displayed includes a word stored in the recognition dictionary used for speech recognition processing, a display mode for the included word according to a display rule determined in advance.