JP2002297374A - Voice retrieving device

Info

Publication number: JP2002297374A
Application number: JP2001100615A
Authority: JP
Inventors: Mitsuaki Watanabe; 光章渡邉; Katsunori Takahashi; 克典高橋; Nozomi Saito; 望斉藤
Original assignee: Alpine Electronics Inc
Current assignee: Alpine Electronics Inc
Priority date: 2001-03-30
Filing date: 2001-03-30
Publication date: 2002-10-11
Anticipated expiration: 2021-03-30
Also published as: JP4137399B2

Abstract

PROBLEM TO BE SOLVED: To provide a voice search device that improves the accuracy of voice recognition. SOLUTION: When a dialogue start button 14 is pressed, an option setting unit 16 sets a plurality of options based on data stored in a candidate set DB 26 and reports them to a voice recognition processing unit 12 and a guidance sentence generation unit 18. The guidance sentence generation unit 18 generates a guidance sentence presenting the options, which is output from a speaker 22 via a voice synthesis unit 20. The voice recognition processing unit 12 performs voice recognition processing on the user's input speech and identifies which of the options reported by the option setting unit 16 was selected. This processing is repeated while further options exist for the identified option. When the final option has been identified, its content is reported to an operation instruction output unit 30, which then outputs a predetermined operation instruction.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a voice search device that searches for various kinds of information using voice recognition.

[0002]

[Prior Art] Voice search devices that perform voice recognition processing on speech uttered by a user and search for various kinds of information based on the recognition result are conventionally known. Such voice search devices are used in combination with in-vehicle navigation devices and the like. For example, consider a navigation device executing a function that searches for facilities to set as the destination of a route search. In response to the user's voice, the voice search device searches for information by (1) identifying the function to be executed, (2) searching for facilities belonging to the facility type specified by the user (for example, places to eat or gas stations), (3) further searching for facilities identified by a franchise name or the like specified by the user, and (4) finally extracting the one facility specified by the user.

[0003]

[Problems to Be Solved by the Invention] With a conventional voice search device, however, each user must remember which words can be input by voice. In the example above, accurate voice input is not possible unless the user knows in advance under what names the various functions, facility types, and so on are registered as recognition targets. Most users cannot keep track of every recognizable word, so they end up trying whatever word comes to mind. Because they cannot speak the exact registered words, the accuracy of voice recognition may deteriorate.

[0004] The present invention was made in view of this point, and its object is to provide a voice search device capable of improving the accuracy of voice recognition.

[0005]

[Means for Solving the Problems] To solve the problem described above, in the voice search device of the present invention, a search key is associated with each of a plurality of search target items, and the matching item is extracted from the search target items by comparing the content of the user's input speech with the search keys. In doing so, a maximum number of character strings that can serve as search keys is set, and recognition target character string output means outputs by voice the readings of a plurality of character strings, not exceeding that number. Voice recognition processing means then performs predetermined voice recognition processing on the user's speech collected by a microphone and selects the character string corresponding to that speech from among the character strings that were output by voice by the recognition target character string output means. Item extraction means extracts the search target item corresponding to the search key identified by the selected character string.

[0006] Outputting by voice the readings of the character strings that can serve as search keys presents the recognition targets to the user in advance. The character string corresponding to the speech the user then inputs is selected from among the character strings that were output by voice, and the search key is identified from it, so the accuracy of voice recognition can be improved.

[0007] It is desirable to further provide a switch that the user operates before speaking, and to start the voice recognition processing by the voice recognition processing means when this switch is operated. Because voice recognition processing only needs to start when the switch is operated, the timing for starting it is unambiguous and the processing can be simplified.

[0008] It is also desirable to further provide selected character string confirmation means that outputs by voice the reading of the character string selected by the voice recognition processing means. Hearing the reading of the selected character string lets the user easily confirm the recognition result for his or her own speech.

[0009] It is also desirable to further provide reselection instruction means that, when the user expresses a negative response to the character string selected by the voice recognition processing means, instructs the recognition target character string output means to output again by voice the readings of the plurality of character strings used to obtain that selection result. When a character string other than the desired one is obtained as the selection result, the user can thus re-enter the search key by expressing a negative response.

[0010] When the total number of recognition target character strings exceeds the predetermined maximum number described above, it is desirable that the recognition target character string output means output the readings by voice in several batches, each containing no more than the maximum number of character strings, and that the voice recognition processing means make a selection judgment after each batch. Even when there are many recognition target character strings, they are output a predetermined number at a time, so the user only has to attend to that batch when choosing and can reliably select the desired character string.
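The batching described in paragraph [0010] can be illustrated with a minimal Python sketch. The function name `batches` and the constant `MAX_CANDIDATES` are hypothetical names chosen for illustration, and the value 5 is only an assumed setting; the specification does not prescribe an implementation.

```python
from typing import Iterator, List

# Assumed maximum number of character strings announced at once.
MAX_CANDIDATES = 5


def batches(candidates: List[str], max_count: int = MAX_CANDIDATES) -> Iterator[List[str]]:
    """Yield the candidates in order, at most max_count per batch."""
    if max_count < 1:
        raise ValueError("max_count must be at least 1")
    for start in range(0, len(candidates), max_count):
        yield candidates[start:start + max_count]
```

Each yielded batch would be announced once, followed by one selection judgment, before the next batch is presented.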

[0011] It is also desirable to further provide voice output instruction means that instructs the recognition target character string output means to perform the second and subsequent voice outputs when the user requests that other selection candidates be output by voice. This lets the user easily obtain other selection candidates.

[0012] It is also desirable to further provide repeat voice output instruction means that, when the user requests that the voice output be repeated, instructs the recognition target character string output means to output again the readings of the character strings it output immediately before. If the user misses part of the voice output, he or she can have it repeated and check its content.

[0013] It is also desirable to further provide character string selection means that selects a character string without using the result of the voice recognition processing when the user gives an instruction to leave the selection to the device. When the character string selection means has made a selection, the item extraction means extracts the search target item using that character string instead of a character string selected by the voice recognition processing means. By giving a "leave it to you" instruction, the user can entrust the choice of character string to the voice search device, which simplifies operation when any of the character strings would be acceptable.

[0014] It is also desirable that a plurality of search keys be associated with each search target item and that, when the item extraction means cannot narrow the results down to a single search target item using one search key, the processing by the recognition target character string output means, the voice recognition processing means, and the item extraction means be repeated with other search keys until a single search target item is obtained. This ensures that the search is narrowed down to one search target item.

[0015] It is also desirable that a different priority be associated with each of the plurality of search keys, that table information associating each search target item with the character strings corresponding to the plurality of search keys be stored in table storage means, and that, based on this table information, the recognition target character string output means output by voice the readings of the corresponding character strings in order starting from the search key with the highest priority. Because content can be added or changed separately for each prioritized search key, the data is easy to update.
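Paragraphs [0014] and [0015] together describe narrowing a table of items by asking about one search key at a time, highest priority first, until a single item remains. The sketch below is one possible reading under stated assumptions: the table is a list of dictionaries, `key_priority` lists the search keys from highest to lowest priority, and `choose` stands in for the announce-and-recognize step. All of these names are hypothetical.

```python
from typing import Callable, Dict, List

Item = Dict[str, str]  # one row of the table: search key -> character string


def narrow(items: List[Item], key_priority: List[str],
           choose: Callable[[str, List[str]], str]) -> List[Item]:
    """Filter items key by key until at most one item remains."""
    remaining = list(items)
    for key in key_priority:
        if len(remaining) <= 1:
            break
        # Distinct character strings for this key, in first-seen order.
        options = list(dict.fromkeys(item[key] for item in remaining))
        picked = choose(key, options)
        remaining = [item for item in remaining if item[key] == picked]
    return remaining
```

In a real device `choose` would announce `options` by voice and return the recognized string; here it is left as a callback so the narrowing logic can be exercised on its own.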

[0016] Alternatively, multi-level tree structure information indicating, for each selection of a character string corresponding to one search key, the search key to be selected next and the character strings corresponding to it may be stored in tree structure storage means. Based on this tree structure information, the recognition target character string output means extracts the character strings corresponding to the search key that is next to be output by voice and outputs their readings by voice. The character strings to output next can be extracted simply by following the tree structure downward from the top level, which simplifies the processing.

[0017] It is also desirable to further provide selection history storage means that stores information on past selections made by the voice recognition processing means, and for the recognition target character string output means to determine, based on this selection history information, which character strings are selected frequently and to output their readings by voice preferentially. Giving priority in the voice output to more frequently selected character strings lets them be selected with fewer voice inputs, which improves operability.
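As a rough illustration of the history-based ordering in paragraph [0017], the following sketch counts past selections and moves frequently chosen character strings to the front. The function name `order_by_history` and the representation of the history as a flat list of past selections are assumptions made for this example.

```python
from collections import Counter
from typing import List


def order_by_history(candidates: List[str], history: List[str]) -> List[str]:
    """Return candidates sorted by how often each was selected before."""
    counts = Counter(history)
    # sorted() is stable, so candidates with equal counts keep their
    # original (priority) order.
    return sorted(candidates, key=lambda c: -counts[c])
```

Candidates that were never selected keep their original relative order at the end of the list.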

[0018] When each of the plurality of character strings consists of a single sound of the Japanese syllabary (the 50 sounds), it is desirable that the item extraction means extract the search keys whose first character matches the sound selected by the voice recognition processing means. Even when there are many candidate character strings, the candidates can be narrowed down easily.
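The first-sound filtering of paragraph [0018] amounts to a prefix match on the kana reading of each search key. A minimal sketch, assuming the readings are available as kana strings and using the hypothetical name `keys_starting_with`:

```python
from typing import List


def keys_starting_with(sound: str, readings: List[str]) -> List[str]:
    """Keep only the readings whose first kana matches the selected sound."""
    return [reading for reading in readings if reading.startswith(sound)]
```

For example, selecting the sound "あ" would keep readings such as "あさひ" and drop "いけだ".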

[0019] It is also desirable that the voice recognition processing means select a character string by comparing all of the characters in the character string with the entire voice recognition result. Only character strings that match the recognition result exactly need to be considered, so the comparison is easy and the processing can be simplified.

[0020] Alternatively, the voice recognition processing means may select a character string by comparing the characters that make up part of the character string with the entire voice recognition result. When part of a character string is distinctive, for example, a comparison that takes partial strings into account lets the user speak only that distinctive, easy-to-remember part, which improves operability.
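The difference between the whole-string comparison of paragraph [0019] and the partial comparison of paragraph [0020] can be sketched as follows. Both functions take the recognized text and the currently announced candidates; the names `select_exact` and `select_partial` are hypothetical, and treating "part of the character string" as a plain substring is an assumption.

```python
from typing import List


def select_exact(recognized: str, candidates: List[str]) -> List[str]:
    """Whole-string comparison: only identical candidates match."""
    return [c for c in candidates if c == recognized]


def select_partial(recognized: str, candidates: List[str]) -> List[str]:
    """Partial comparison: the recognized text may be any part of a candidate."""
    if not recognized:
        return []
    return [c for c in candidates if recognized in c]
```

The partial variant can return more than one candidate, in which case some further disambiguation would be needed.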

[0021] It is also desirable that, when the user's speech is collected by the microphone before the voice output by the recognition target character string output means has finished, the voice recognition processing means start the character string selection operation from that moment. When the user wants a character string announced early in the voice output, for example, he or she can speak it without waiting for the whole output to finish, which further improves operability.

[0022] It is also desirable that the maximum number of character strings that can serve as search keys be set within the range of 7 ± 2. According to the theory of short-term memory in cognitive psychology, if a unit of information that has some coherence is defined as a "chunk", the amount of information a person can hold at one time is said to be about 7 ± 2 chunks. When memorizing a telephone number, for example, each digit of the number basically corresponds to one chunk. If the digit string "2983" is memorized through a mnemonic wordplay such as "nikuya-san" (butcher), the information "butcher" corresponds to one chunk. Setting the maximum number of character strings that can serve as search keys within the range of 7 ± 2 on the basis of this chunk concept therefore allows the user to reliably remember them. Details of chunks are given, for example, on page 75 of "Cognitive Psychology 2: Memory", edited by Yotaro Takano, University of Tokyo Press, 1995.

[0023] The voice search device may also be configured by distributing its functions between a server and a terminal device connected via a network. Specifically, it is preferable to place on the server the function of storing information on the search target items and their corresponding search keys, to place on the terminal device the functions corresponding to the recognition target character string output means, the microphone, the voice recognition processing means, and the item extraction means, and to have the terminal device acquire the necessary information from the server before each process. Because the terminal device acquires the information needed for processing from the server, it can obtain updated information from the server by communication and reflect it in its processing.

[0024] It is also desirable that the information sent from the server to the terminal device be difference information containing the changes relative to the information sent previously. When the content changes, only the difference information containing the changes needs to be acquired, which reduces communication costs.
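One way a terminal might apply such difference information is sketched below. The specification does not define a format for the difference information, so the dictionary layout with `"changed"` and `"removed"` entries, and the function name `apply_diff`, are purely illustrative assumptions.

```python
from typing import Any, Dict


def apply_diff(local: Dict[str, Any], diff: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the local data with the server's changes applied."""
    updated = dict(local)
    # Assumed format: "removed" lists keys to delete,
    # "changed" maps keys to their new or added values.
    for key in diff.get("removed", []):
        updated.pop(key, None)
    updated.update(diff.get("changed", {}))
    return updated
```

Only the entries named in the difference information are touched; everything else in the terminal's local copy is kept as is.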

[0025] Alternatively, when the voice search device is configured by distributing its functions between a server and a terminal device connected via a network, the server may store the information on the search target items and their corresponding search keys and also perform the extraction of the character strings to be output by voice by the recognition target character string output means and the extraction of search target items by the item extraction means, while the terminal device is given the functions corresponding to the recognition target character string output means, the microphone, and the voice recognition processing means and acquires the information needed for these processes from the server. Placing many functions on the server reduces the processing load on the terminal device and simplifies its configuration, so the cost of the terminal device can be reduced.

[0026]

[Embodiments of the Invention] A voice search device according to an embodiment of the present invention is described below with reference to the drawings.

[First Embodiment] FIG. 1 shows the configuration of an in-vehicle system that includes a voice search device 1 of the first embodiment. The in-vehicle system shown in FIG. 1 comprises: the voice search device 1, which determines and outputs various operation instructions interactively in response to speech uttered by the user; a navigation device 2, which detects the vehicle position, displays a map of its surroundings, and performs route search and route guidance to a destination selected by the user; and an audio device 3, which plays music recorded on recording media such as compact discs and MiniDiscs.

[0027] Next, the detailed configuration of the voice search device 1 is described. The voice search device 1 shown in FIG. 1 comprises a microphone 10, a voice recognition processing unit 12, a dialogue start button 14, a re-request button 15, an option setting unit 16, a guidance sentence generation unit 18, a voice synthesis unit 20, a speaker 22, a selected item determination unit 24, a candidate set database (DB) 26, a DB update unit 28, and an operation instruction output unit 30.

[0028] The microphone 10 collects the speech uttered by the user and converts it into a voice signal. The voice recognition processing unit 12 analyzes the voice signal output from the microphone 10, performs predetermined voice recognition processing, and identifies the character string corresponding to the user's speech. In this embodiment, the voice recognition processing unit 12 performs the recognition processing with only the character strings corresponding to the predetermined number of options set by the option setting unit 16 as recognition targets.

[0029] The dialogue start button 14 is a push-button switch that the user presses to start a dialogue with the voice search device 1. The re-request button 15 is a push-button switch that the user presses to hear the voice output from the voice search device 1 again.

[0030] The option setting unit 16 sets, based on the data stored in the candidate set DB 26, the predetermined number of options presented as candidates for voice input. This number is preferably set within the range of 7 ± 2 per presentation; in this embodiment, five options are set. The processing performed by the option setting unit 16 is described in detail later.

[0031] The guidance sentence generation unit 18 generates the content of the guidance voice output to the user, that is, the guidance sentence, based on the predetermined number of options set by the option setting unit 16. The voice synthesis unit 20 generates a voice signal for outputting speech corresponding to the guidance sentence and outputs it to the speaker 22. The speaker 22 outputs the guidance voice based on the input voice signal. The selected item determination unit 24 determines, based on the recognition result character string output from the voice recognition processing unit 12, which of the predetermined number of options the user has selected.

[0032] The candidate set DB 26 stores the data the option setting unit 16 needs to set the options. FIG. 2 shows the structure of the data stored in the candidate set DB 26. As shown in FIG. 2, the candidate set DB 26 stores predetermined candidate sets organized in a hierarchical structure (tree structure information). Each candidate set contains a predetermined number of options. The candidate set at the top level contains, as options, the functions that the navigation device 2 and other devices can be made to execute. Each candidate set at the second and lower levels contains options associated with one of the options in a candidate set at the level above.

[0033] FIG. 3 shows the correspondence between upper-level and lower-level candidate sets in the data structure of FIG. 2. For example, the top-level candidate set 100 contains the options "meal place search", "gas station search", "facility search", "parking lot search", "audio operation", and "other". The options are arranged according to a predetermined priority, and the guidance voice announces them in order from the highest priority. In the candidate set 100 shown in FIG. 3, "meal place search" has the highest priority, so the guidance voice generated from this candidate set announces the options in the order "meal place search", "gas station search", ..., "other". A specific example of how the options are announced is given later. The same applies to the other candidate sets.

[0034] Another candidate set 100a exists at the same level, associated with the option "other". This candidate set 100a contains the options "traffic information", "map display", ..., "other". If still more options exist, a new candidate set is provided and associated with the "other" option of candidate set 100a.

[0035] For each option other than "other" in the candidate set 100 and similar sets, a candidate set containing a plurality of options is provided at the next lower level and associated with that option. For example, a candidate set 102 exists as the lower-level candidate set associated with "meal place search" in the candidate set 100, and it contains a plurality of franchise names and the like, such as "Restaurant a", as options for choosing a place to eat. Similarly, a candidate set 104 exists as the lower-level candidate set associated with "gas station search" in the candidate set 100, and it contains a plurality of franchise names and the like, such as "A Oil", as options for choosing a gas station.

[0036] In this embodiment, then, starting from the top-level candidate set, the process of selecting one option and moving to the lower-level candidate set associated with it is repeated, and the content of the operation instruction is finally determined by selecting one of the options in a lowest-level candidate set. Here, the options in the upper-level candidate sets correspond to "search keys", and the options in the lowest-level candidate sets correspond to "search target items". FIG. 2 shows candidate sets with a four-level hierarchy, but this is only an example; the number of levels varies with the content of the operation instruction.
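The hierarchy of candidate sets and the repeated select-and-descend process can be sketched in Python as below. This is a simplified illustration rather than the embodiment's actual data format: the class names `Option` and `CandidateSet`, the field `next_set`, and the `choose` callback (standing in for the announce-and-recognize step) are all hypothetical. An "other" option is modeled here simply as an option whose `next_set` is the continuation set.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Option:
    label: str
    # Candidate set presented after this option is chosen; None at the lowest level.
    next_set: Optional["CandidateSet"] = None


@dataclass
class CandidateSet:
    options: List[Option] = field(default_factory=list)  # in priority order


def run_dialogue(top: CandidateSet, choose: Callable[[List[str]], str]) -> List[str]:
    """Follow the tree from the top set down, returning the chosen labels."""
    path: List[str] = []
    current: Optional[CandidateSet] = top
    while current is not None:
        by_label = {option.label: option for option in current.options}
        picked = choose(list(by_label))  # labels are offered in priority order
        path.append(picked)
        current = by_label[picked].next_set
    return path
```

The last label in the returned path plays the role of the search target item, and the earlier labels play the role of search keys.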

[0037] The DB update unit 28 obtains the vehicle position detection result from the navigation device 2 and, based on it, updates the data on the positions of facilities such as meal places, gas stations, and parking lots (position data) stored in the candidate set DB 26. For example, when a gas station search is in progress and the franchise name of the facility to be searched has been selected, the DB update unit 28 extracts, from the stores belonging to that franchise, the stores located within a predetermined range centered on the vehicle position at that time, calculates the position data of the extracted stores, and updates the contents of the candidate set DB 26.
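As a rough illustration of the extraction step, the sketch below keeps only the stores within a given radius of the vehicle. The patent does not say how positions or distances are represented, so the planar kilometre coordinates, the function name `stores_in_range`, and the radius value are all assumptions.

```python
import math
from typing import Dict, List, Tuple


def stores_in_range(
    stores: Dict[str, Tuple[float, float]],
    vehicle_xy: Tuple[float, float],
    radius_km: float,
) -> List[Tuple[str, float]]:
    """Return (store name, distance in km) for stores inside the radius, nearest first.

    Coordinates are hypothetical planar (x, y) values in kilometres.
    """
    vx, vy = vehicle_xy
    nearby = []
    for name, (x, y) in stores.items():
        distance = math.hypot(x - vx, y - vy)
        if distance <= radius_km:
            nearby.append((name, round(distance, 1)))
    # Sort by distance so the closest store is listed first.
    return sorted(nearby, key=lambda item: item[1])
```

A real device would presumably also derive the "ahead on the right" style of description from the heading of the vehicle, which this sketch leaves out.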

[0038] The operation instruction output unit 30 outputs a predetermined operation instruction to the navigation device 2 or the audio device 3 according to the content of the item finally selected through the repeated process of selecting one of a plurality of options. The option setting unit 16, guidance sentence generation unit 18, voice synthesis unit 20, and speaker 22 described above correspond to the recognition target character string output means and the selected character string confirmation means; the voice recognition processing unit 12 corresponds to the voice recognition processing means; the option setting unit 16 and the selection item determination unit 24 correspond to the item extraction means; the dialogue start button 14 corresponds to the switch; the re-request button 15 corresponds to the re-voice output instruction means; the candidate set DB 26 corresponds to the tree structure storage means; and the selection item determination unit 24 corresponds to the voice output instruction means.

[0039] The voice search device 1 of this embodiment is configured as described above; its operation is described next. FIG. 4 is a flowchart showing the operation procedure of the voice search device 1 of the first embodiment, namely the procedure for outputting an operation instruction to the navigation device 2 in response to the voice uttered by the user.

[0040] The option setting unit 16 determines whether the user has pressed the dialogue start button 14 (step 100). If the dialogue start button 14 has not been pressed, the determination is negative and step 100 is repeated. If the user has pressed the dialogue start button 14, the determination is affirmative, and the option setting unit 16 uses the data stored in the candidate set DB 26 to set the candidate set of the highest level as the first candidate set (step 101).

[0041] Next, the option setting unit 16 notifies the voice recognition processing unit 12 and the guidance sentence generation unit 18 of the character strings corresponding to the options contained in the candidate set (step 102). The guidance sentence generation unit 18 generates a predetermined guidance sentence that announces the options contained in the candidate set and outputs it to the voice synthesis unit 20. The voice synthesis unit 20 generates a voice signal corresponding to the guidance sentence and outputs it to the speaker 22, and the speaker 22 outputs guidance voice presenting the options (step 103).

[0042] The option setting unit 16 also determines whether the user has pressed the re-request button 15 (step 104). If the re-request button 15 has been pressed, the determination in step 104 is affirmative, the process returns to step 103, and the subsequent processing is repeated. Specifically, the option setting unit 16 notifies the guidance sentence generation unit 18 that the guidance sentence has been requested again. In response, the guidance sentence generation unit 18 outputs the guidance sentence generated in the previous pass to the voice synthesis unit 20 again, so that the guidance voice is output again.

[0043] If the re-request button 15 has not been pressed, the determination in step 104 is negative, and the voice recognition processing unit 12 determines whether the user has made a voice input, based on the presence or absence of a voice signal output from the microphone 10 (step 105). If no voice input has been made, the determination in step 105 is negative, the process returns to step 104, and the subsequent processing is repeated.

[0044] If a voice input has been made, the determination in step 105 is affirmative, and the voice recognition processing unit 12 performs predetermined voice recognition processing in which only the character strings corresponding to the options notified from the option setting unit 16 are recognition targets, and identifies the one option selected by the user (step 106). In this embodiment, "Other" is also treated as a recognition target, as one more option in addition to the options notified from the option setting unit 16.
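The effect of restricting recognition to the announced options can be illustrated with text in place of audio. The sketch below is not the recognition method of the patent, which works on the voice signal from the microphone 10; it substitutes string similarity from Python's standard `difflib` module, and the function name `recognize` and the cutoff value are assumptions.

```python
import difflib
from typing import List, Optional


def recognize(heard: str, options: List[str], cutoff: float = 0.6) -> Optional[str]:
    """Pick the announced option closest to the input, or None if nothing is close.

    Only the announced options plus "Other" form the active vocabulary, so an
    input can never be matched to a string that was not presented.
    """
    vocabulary = list(options) + ["Other"]
    matches = difflib.get_close_matches(heard, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

Because the vocabulary holds only a handful of strings at any time, each decision is a choice among a few alternatives, which is the source of the accuracy gain the patent claims.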

[0045] The selection item determination unit 24 determines, based on the voice recognition result output from the voice recognition processing unit 12, whether "Other" was selected from the options (step 107). If "Other" was not selected, the determination in step 107 is negative, and the selection item determination unit 24 sends an instruction to the option setting unit 16 and determines whether a next candidate set (a lower-level candidate set) corresponding to the option selected by the user exists (step 108).

[0046] If a next candidate set exists, the determination in step 108 is affirmative, and the selection item determination unit 24 instructs the option setting unit 16 to set the next candidate set. On receiving the instruction, the option setting unit 16 sets the next candidate set (step 109). The process then returns to step 102, and the subsequent processing is performed.

[0047] If "Other" was selected from the options, the determination in step 107 is affirmative, and the selection item determination unit 24 notifies the option setting unit 16 so that it sets the next options. On receiving the notification, the option setting unit 16 determines, based on the data stored in the candidate set DB 26, whether next options exist (step 110); if they exist, the determination is affirmative and the option setting unit 16 sets the next options (step 111). The process then returns to step 102, the next options are notified to the voice recognition processing unit 12 and the guidance sentence generation unit 18, and the subsequent processing is performed.

[0048] If no next options exist, the determination in step 110 is negative, and the option setting unit 16 instructs the guidance sentence generation unit 18 to generate a guidance sentence announcing that there are no further options. On receiving the instruction, the guidance sentence generation unit 18 generates a predetermined guidance sentence and outputs it to the voice synthesis unit 20, and guidance voice announcing that there are no further options is output from the speaker 22 (step 112). The process then returns to step 103, the options announced in the previous pass are presented to the user again, and the subsequent processing is repeated.

[0049] If, in the determination in step 108 of whether a next candidate set exists, no next candidate set exists, the determination is negative, and the selection item determination unit 24 notifies the operation instruction output unit 30 of the content of the item finally selected by the user. On receiving the notification, the operation instruction output unit 30 outputs an operation instruction corresponding to the content of the selected item to the navigation device 2 or the like (step 113).
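The flow of steps 101 to 113 can be condensed into a single loop. The following is a minimal sketch under several assumptions: the hierarchy is a nested dictionary, user actions arrive as a pre-scripted list instead of button presses and speech, a page of five options is announced at a time, and the names `run_dialogue` and `AGAIN` are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

AGAIN = "<again>"  # hypothetical marker for a press of the re-request button 15


def run_dialogue(tree: Dict, events: Iterable[str], page_size: int = 5) -> Tuple[List[List[str]], str]:
    """Replay scripted user events against a nested-dict hierarchy.

    Returns every announcement made and the finally selected item.
    """
    prompts: List[List[str]] = []
    node, page = tree, 0                          # step 101: start at the top level
    events = iter(events)
    while True:
        options = list(node)[page * page_size:(page + 1) * page_size]
        prompts.append(options + ["Other"])       # steps 102-103: announce options
        event = next(events)
        while event == AGAIN:                     # step 104: announce them again
            prompts.append(options + ["Other"])
            event = next(events)
        if event == "Other":                      # step 107: "Other" selected
            if (page + 1) * page_size < len(node):    # step 110: more options left?
                page += 1                         # step 111: move to the next options
            else:
                prompts.append(["no further options"])  # step 112
            continue                              # re-announce the current options
        if event not in options:                  # step 106 accepts announced options only
            raise ValueError(f"{event!r} was not announced")
        if node[event]:                           # step 108: lower-level set exists
            node, page = node[event], 0           # step 109
        else:
            return prompts, event                 # step 113: final item selected
```

For example, the event list `["gas station search", AGAIN, "B Oil", "2 km ahead on the right"]` produces four announcements, the second and third of which are identical because of the re-request.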

[0050] Next, the dialogue carried out between the voice search device 1 and the user according to the processing shown in FIG. 4 is described concretely. In the following dialogue examples, the user is denoted "U" and the voice search device 1 is denoted "S". Drawings showing the contents of the data read from the candidate set DB 26 are referred to as appropriate alongside the dialogue examples.

[0051] (Dialogue Example 1) Dialogue Example 1 is an example of a dialogue for searching for the nearest gas station. FIG. 5 shows the contents of the data used in Dialogue Example 1.
U: Presses the dialogue start button 14.
S: "Please select from meal place search, gas station search, facility search, parking lot search, audio operation, and Other." ... (1)
U: "Gas station search." ... (2)
S: "Gas station search, correct? Then please select the franchise name from A Oil, B Oil, C Oil, D Oil, E Oil, and Other." ... (3)
U: "B Oil." ... (4)
S: "B Oil, correct? Then please select from 2 km ahead on the right, 2.5 km ahead on the left, 3 km ahead on the left, 5 km ahead on the right, and Other." ... (5)
U: "2 km ahead on the right." ... (6)
S: "2 km ahead on the right, correct? Then I will set the destination to B Oil Iwaki Store." ... (7)
As shown in FIG. 5, when the user presses the dialogue start button 14, the candidate set of the highest level is read out first, and the functions selectable by the user are announced as in voice (1) above.

[0052] In response to this voice, when the user selects "gas station search" as shown in voice (2) above, the lower-level candidate set corresponding to this selection is read out, and the franchise names selectable by the user are announced as in voice (3) above.

[0053] When the user selects the franchise name "B Oil" as shown in voice (4) above, the lower-level candidate set corresponding to B Oil is read out, and the position of each facility relative to the vehicle position (relative distance) is announced as in voice (5) above.

[0054] Next, when the user selects the position "2 km ahead on the right" as shown in voice (6) above, the single gas station corresponding to the selected position, "B Oil Iwaki Store", is identified. As shown in voice (7) above, this gas station is set as the destination of the route search, and the series of processing ends.

[0055] (Dialogue Example 2) Dialogue Example 2 is, like Dialogue Example 1, a search for the nearest gas station, but shows the case in which the re-request button 15 is pressed. The data used in Dialogue Example 2 is the same as in FIG. 5.
U: Presses the dialogue start button 14.
S: "Please select from meal place search, gas station search, facility search, parking lot search, audio operation, and Other."
U: "Gas station search."
S: "Gas station search, correct? Then please select the franchise name from A Oil, B Oil, C Oil, D Oil, E Oil, and Other."
U: Presses the re-request button 15.
S: "Your input is gas station search, correct? Next, please select the franchise name from A Oil, B Oil, C Oil, D Oil, E Oil, and Other." ... (8)
U: "B Oil."
S: "B Oil, correct? Then please select from 2 km ahead on the right, 2.5 km ahead on the left, 3 km ahead on the left, 5 km ahead on the right, and Other."
U: "2 km ahead on the right."
S: "2 km ahead on the right, correct? Then I will set the destination to B Oil Iwaki Store."
As shown in voice (8) of the dialogue example above, when the user presses the re-request button 15, the contents of the candidate set announced immediately before are announced again.

[0056] In this case, it is desirable to change the wording of the guidance sentence between the first and second outputs. In the example above, the first guidance sentence is "Gas station search, correct? Then ..." and the second is "Your input is gas station search, correct? Next, ...", so the wording differs. In addition, because the guidance voice may have been hard to hear, the speech rate of the second output may be made slower than that of the first when a re-request is made.
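One way to picture the two adjustments, changed wording and a slower speech rate on a re-request, is the small sketch below. The function name `guidance`, its parameters, and the rate values are assumptions; the patent states only that the second output may be worded differently and spoken more slowly.

```python
from typing import List, Tuple


def guidance(
    selected: str,
    category: str,
    options: List[str],
    attempt: int,
    base_rate: float = 1.0,
    slowdown: float = 0.8,
) -> Tuple[str, float]:
    """Return (guidance text, relative speech rate) for the given attempt number.

    attempt 0 is the first announcement; any later attempt is a re-request.
    """
    listing = ", ".join(options + ["Other"])
    if attempt == 0:
        text = f"{selected}, correct? Then please select the {category} from {listing}."
        rate = base_rate
    else:
        text = f"Your input is {selected}, correct? Next, please select the {category} from {listing}."
        rate = base_rate * slowdown  # hypothetical slowdown factor for the repeat
    return text, rate
```

The returned rate would be passed to whatever speech synthesizer is in use; how the voice synthesis unit 20 accepts such a parameter is not described in the source.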

[0057] (Dialogue Example 3) Dialogue Example 3 is, like Dialogue Example 1, a search for the nearest gas station, but shows the case in which "Other" is selected from the options. FIG. 6 shows the contents of the data used in Dialogue Example 3.
U: Presses the dialogue start button 14.
S: "Please select from meal place search, gas station search, facility search, parking lot search, audio operation, and Other."
U: "Gas station search."
S: "Gas station search, correct? Then please select the franchise name from A Oil, B Oil, C Oil, D Oil, E Oil, and Other."
U: "Other." ... (9)
S: "Then please select from F Oil, G Oil, H Oil, I Oil, J Oil, and Other." ... (10)
U: "G Oil."
S: "G Oil, correct? Then please select from 2 km ahead on the right, 2.5 km ahead on the left, 3 km ahead on the left, 5 km ahead on the right, and Other."
U: "2 km ahead on the right."
S: "2 km ahead on the right, correct? Then I will set the destination to G Oil Iwaki Store."
As shown in voice (9) of the dialogue example above, when the user selects "Other" from the options, a candidate set containing the next options at the same level is read out in response, and additional franchise names selectable by the user are announced, as shown in voice (10) above.

[0058] (Dialogue Example 4) Dialogue Example 4 shows, as in Dialogue Example 3, the case in which "Other" is selected from the options, but here no next candidate set exists at the same level. FIG. 7 shows the contents of the data used in Dialogue Example 4.
U: Presses the dialogue start button 14.
S: "Please select from meal place search, gas station search, facility search, parking lot search, audio operation, and Other."
U: "Gas station search."
S: "Gas station search, correct? Then please select the franchise name from A Oil, B Oil, C Oil, D Oil, E Oil, and Other."
U: "Other."
S: "Then please select from F Oil, G Oil, H Oil, I Oil, J Oil, and Other."
U: "Other." ... (11)
S: "I am sorry. There are no other candidates." ... (12)
As shown in voice (11) of the dialogue example above, when the user selects "Other" from the options and no next candidate set exists at the same level, the user is informed that there are no more selectable options, as shown in voice (12) above.

[0059] (Dialogue Example 5) Dialogue Example 5 is an example of a dialogue for searching for a desired facility. FIG. 8 shows the contents of the data used in Dialogue Example 5.
U: Presses the dialogue start button 14.
S: "Please select from meal place search, gas station search, facility search, parking lot search, audio operation, and Other."
U: "Facility search." ... (13)
S: "Facility search, correct? Then please select the region of the facility from Hokkaido, Tohoku region, Kanto region, Chubu region, Kinki region, and Other." ... (14)
U: "Tohoku region." ... (15)
S: "Tohoku region, correct? Then please select the prefecture of the facility from Fukushima Prefecture, Akita Prefecture, Iwate Prefecture, Miyagi Prefecture, Aomori Prefecture, and Other." ... (16)
U: "Fukushima Prefecture." ... (17)
S: "Fukushima Prefecture, correct? Then please select the first character of the facility name from the a-row, ka-row, sa-row, ta-row, na-row, and Other." ... (18)
U: "A-row." ... (19)
S: "The a-row, correct? Then please select from R Pie, R Pie Giken, R Pie Information Systems, R Pi Office, R Pi Logistics, and Other." ... (20)
U: "R Pie." ... (21)
S: "R Pie, correct? Then I will set the destination to R Pie." ... (22)
In this way, the selectable functions are announced by voice, and the user selects the desired function in response. When the user selects "facility search" as shown in voice (13) above, the lower-level candidate set corresponding to facility search is read out, and the regions in which facilities are located are announced as options, as shown in voice (14) above.

[0060] When the user selects "Tohoku region" as the region in which the facility is located, as shown in voice (15) above, the lower-level candidate set corresponding to "Tohoku region" is read out, and the names of the prefectures in which facilities are located are announced as in voice (16) above.

[0061] When the user selects "Fukushima Prefecture" as shown in voice (17) above, the corresponding lower-level candidate set is read out, and the first characters of facility names (a-row, ka-row, and so on) are announced as options, as shown in voice (18) above. When the user selects "a-row" as shown in voice (19) above, the corresponding lower-level candidate set is read out, and the names of facilities located in "Fukushima Prefecture" whose first character belongs to the "a-row" are announced as options, as shown in voice (20) above.

[0062] When the user selects the facility name "R Pie" as shown in voice (21) above, the single facility "R Pie" is identified, so this facility is set as the destination of the route search, as shown in voice (22) above, and the series of processing ends.

[0063] As described above, in the first embodiment, data containing candidate sets with a predetermined hierarchical structure is stored in the candidate set DB 26. Based on these candidate sets, the character strings corresponding to the options to be output by voice next are extracted, and the readings of these character strings are output by voice. Because the character strings to be recognized are presented to the user in advance, and the character string corresponding to the voice the user inputs in response is selected from among the character strings that were output by voice in order to identify the option the user selected, the accuracy of voice recognition can be improved. In particular, the character strings to be output next can be extracted simply by following the hierarchical candidate sets in order from the upper level, which simplifies the processing.

[Second Embodiment]

In the first embodiment described above, candidate sets with a hierarchical structure are prepared in advance and stored in the candidate set DB 26. However, processing similar to that of the first embodiment can also be performed using a database with a general table-format structure.

[0064] FIG. 9 shows the configuration of an in-vehicle system that includes the voice search device 1A of the second embodiment. The voice search device 1A shown in FIG. 9 differs from the voice search device 1 of the first embodiment in that the candidate set DB 26 is replaced by a candidate set DB 26a with different data contents and, as a consequence of this change, in the operation procedure for narrowing down the content of the operation instruction in response to the voice uttered by the user. The following description focuses mainly on the differences from the first embodiment.

[0065] The candidate set DB 26a stores the data that the option setting unit 16 needs in order to set a plurality of options. FIG. 10 shows the structure of the data stored in the candidate set DB 26a of the second embodiment. As shown in FIG. 10, the data stored in the candidate set DB 26a is in table format, unlike in the first embodiment.

[0066] The data (table information) stored in the candidate set DB 26a consists of three elements: "priority", "candidate set title", and "option". "Priority" has the same meaning as the level in the first embodiment. That is, when an operation instruction is determined, one option is selected from a plurality of options in order, starting from "function" at priority 1. A concrete example of presenting and selecting options is described later. The options associated with priorities 1 to 6 in FIG. 10 correspond to "search keys", and the options at priority 7, which are the options finally identified, correspond to "search target items".

[0067] In the candidate set DB 26a, one horizontal row forms one data group (hereinafter called a "record"). For example, the record in the first row of FIG. 10 is the data group for the facility name "B Oil Iwaki Store". It indicates that the facility is associated with the function "gas station search", that its franchise name is "B Oil", that it is located in the "Tohoku region" and in "Fukushima Prefecture", and that the first character of the facility name belongs to the "ha-row". The position is updated by the DB update unit 28 described above. The candidate set DB 26a contains a plurality of such records. The candidate set DB 26a corresponds to the table storage means.
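A table of this kind can be pictured as a list of records keyed by column title. The sketch below is illustrative only: FIG. 10 is not reproduced here, so the exact order of the seven priority columns, the English column names, and the second sample record are assumptions; only the first record follows the description above.

```python
from typing import Dict, List

# Assumed column order, highest priority first; the real order is defined by FIG. 10.
PRIORITY_COLUMNS = [
    "function", "franchise", "region", "prefecture", "initial", "position", "facility",
]

RECORDS: List[Dict[str, str]] = [
    {"function": "gas station search", "franchise": "B Oil", "region": "Tohoku region",
     "prefecture": "Fukushima Prefecture", "initial": "ha-row",
     "position": "2 km ahead on the right", "facility": "B Oil Iwaki Store"},
    # Hypothetical second record, added only to make the example non-trivial.
    {"function": "facility search", "franchise": "", "region": "Tohoku region",
     "prefecture": "Fukushima Prefecture", "initial": "a-row",
     "position": "", "facility": "R Pie"},
]


def column_options(records: List[Dict[str, str]], column: str) -> List[str]:
    """Return the distinct non-empty values of one column, in first-seen order."""
    seen: List[str] = []
    for record in records:
        value = record[column]
        if value and value not in seen:
            seen.append(value)
    return seen
```

Calling `column_options(RECORDS, "function")` yields the options of the highest-priority column, which is how the first candidate set of the second embodiment would be formed under these assumptions.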

[0068] The voice search device 1A of this embodiment is configured as described above; its operation is described next. FIG. 11 is a flowchart showing part of the operation procedure of the voice search device 1A of the second embodiment. The basic operation procedure of the voice search device 1A is the same as that of the voice search device 1 of the first embodiment shown in FIG. 4; the differences are in the processing of step 101 and in the processing from step 107 onward. FIG. 11 mainly shows the parts where the processing differs.

[0069] The option setting unit 16 determines whether the user has pressed the dialogue start button 14 (step 100). If the user has not pressed the dialogue start button 14, the determination is negative and step 100 is repeated. If the dialogue start button 14 has been pressed, the determination in step 100 is affirmative, and the option setting unit 16 extracts the data belonging to the "priority 1" column from the candidate set DB 26a and uses the extracted data to set the first candidate set (step 101A). Specifically, as shown in FIG. 10, in this embodiment the "priority 1" column contains the various functions, and a candidate set containing these functions as options is set. Thereafter, the processing of steps 102 to 107 shown in FIG. 4 is performed in the same way as in the first embodiment.

[0070] The selection item determination unit 24 determines, based on the voice recognition result output from the voice recognition processing unit 12, whether "Other" was selected from the options (step 107). If "Other" was not selected, the determination in step 107 is negative, and the selection item determination unit 24 notifies the option setting unit 16 accordingly. On receiving the notification, the option setting unit 16 narrows down the options that are candidates for presentation as the next candidate set, according to the option selected by the user (step 120). For example, if the user selected "gas station search", the selection item determination unit 24 narrows down the records to those corresponding to "gas station search".

[0071] Next, the option setting unit 16 checks the candidate sets in descending order of priority to determine whether there is a candidate set containing two or more options (step 121). If such a candidate set exists, an affirmative determination is made in step 121, and the option setting unit 16 then extracts a predetermined number of options from the candidate set identified in step 121 (the candidate set containing two or more options) and sets the next candidate set (step 122).
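
Steps 120 and 121 can be sketched as a filter followed by a scan over the columns in priority order. This is only a sketch under assumptions: the table rows, the helper names `narrow` and `next_column`, and the intermediate column "hours" are invented for illustration and do not come from FIG. 10.

```python
def narrow(records, column, selected):
    """Step 120: keep only the records whose value in `column` is the selected option."""
    return [r for r in records if r[column] == selected]


def next_column(records, priority):
    """Step 121: return the highest-priority column with two or more distinct options."""
    for column in priority:
        kinds = {r[column] for r in records if r[column] is not None}
        if len(kinds) >= 2:
            return column
    return None  # no candidate set with two or more options remains


# Hypothetical table; "hours" is an invented intermediate column.
PRIORITY = ["function", "franchise", "hours", "position"]
RECORDS = [
    {"function": "gas station search", "franchise": "A Oil", "hours": "24h", "position": "2 km ahead on the left"},
    {"function": "gas station search", "franchise": "B Oil", "hours": "24h", "position": "2 km ahead on the right"},
    {"function": "gas station search", "franchise": "B Oil", "hours": "24h", "position": "5 km on the left"},
    {"function": "parking lot search", "franchise": None, "hours": "24h", "position": "1 km ahead on the right"},
]

after_function = narrow(RECORDS, "function", "gas station search")
after_franchise = narrow(after_function, "franchise", "B Oil")
print(next_column(after_function, PRIORITY))
print(next_column(after_franchise, PRIORITY))
```

A column that has already been used for narrowing holds a single value in the remaining records, so the scan skips it without any extra bookkeeping. The same rule skips columns such as the invented "hours", which offer no real choice.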

[0072] FIG. 12 is a flowchart showing the detailed procedure of the processing in step 122. First, based on the data stored in the candidate set DB 26a, the option setting unit 16 selects a candidate set that has a high priority and contains two or more options of different kinds (step 130).

[0073] Next, the option setting unit 16 determines whether the number of kinds of options in the selected candidate set is equal to or less than a predetermined number (five in this embodiment) (step 131). If the number exceeds the predetermined number, a negative determination is made in step 131, and the option setting unit 16 extracts the predetermined number of options (step 132).

[0074] If the number of kinds of options is equal to or less than the predetermined number, an affirmative determination is made in step 131, and the option setting unit 16 extracts all of the existing options (step 133). The option setting unit 16 then sets the options extracted in step 132 or step 133 as the next candidate set (step 134), which completes the processing of step 122 shown in FIG. 11. The procedure then returns to step 102, and the subsequent processing is repeated.
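
The branch in steps 131 to 134 amounts to capping the number of options at the predetermined number. The following sketch assumes a list-of-dicts table and the hypothetical names `build_candidate_set` and `MAX_OPTIONS`; the two return statements mirror the two branches of the flowchart even though a single slice would give the same result.

```python
MAX_OPTIONS = 5  # the "predetermined number" (five in this embodiment)


def build_candidate_set(records, column, max_options=MAX_OPTIONS):
    """Steps 131 to 134: collect the kinds of options and cap them at max_options."""
    kinds = []
    for record in records:
        value = record[column]
        if value is not None and value not in kinds:
            kinds.append(value)
    if len(kinds) > max_options:      # step 131: negative determination
        return kinds[:max_options]    # step 132: extract the predetermined number
    return kinds                      # step 133: extract every existing option


# Hypothetical records: seven franchises, "A Oil" to "G Oil".
stations = [{"franchise": f"{letter} Oil"} for letter in "ABCDEFG"]
print(build_candidate_set(stations, "franchise"))
```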

[0075] If "Other" is selected from the options in step 107 described above, an affirmative determination is made, and the selection item determination unit 24 notifies the option setting unit 16 to set the next options. On receiving the notification, the option setting unit 16 determines, based on the data stored in the candidate set DB 26a, whether there are any options other than those already presented in the previous round (step 123). If there are, it makes an affirmative determination and sets the next options (step 124). The procedure then returns to step 102 described above, the next options are notified to the voice recognition processing unit 12 and the guidance sentence generation unit 18, and the subsequent processing is performed.

[0076] If no further options exist, a negative determination is made in step 123. In this case, the option setting unit 16 instructs the guidance sentence generation unit 18 to generate a guidance sentence announcing that there are no further options. The guidance sentence generation unit 18 generates the predetermined guidance sentence and outputs it to the voice synthesis unit 20, and a guidance voice announcing that there are no further options is output from the speaker 22 (step 125). The procedure then returns to step 103 described above, the options announced in the previous round are presented to the user again, and the subsequent processing is repeated.
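
The handling of "Other" in steps 123 to 125 can be sketched as paging through the options that have not yet been presented. The function `handle_other`, its parameters, and the page size of five are assumptions for illustration.

```python
NO_MORE_MESSAGE = "We are sorry. There are no other candidates."


def handle_other(all_options, already_presented, current_page, page_size=5):
    """Return (options to present next, guidance message or None)."""
    remaining = [o for o in all_options if o not in already_presented]
    if remaining:                             # step 123: affirmative determination
        return remaining[:page_size], None    # step 124: set the next options
    # Step 123 negative: announce it (step 125) and present the previous options again.
    return current_page, NO_MORE_MESSAGE


# Hypothetical franchise list, "A Oil" to "G Oil".
franchises = [f"{letter} Oil" for letter in "ABCDEFG"]
first_page = franchises[:5]
second_page, message = handle_other(franchises, first_page, first_page)
print(second_page, message)
```

Calling the function again once every option has been presented returns the current page unchanged together with the apology message, which corresponds to the return to step 103.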

[0077] If, in step 121 described above, no candidate set containing two or more options remains, a negative determination is made, and the selection item determination unit 24 notifies the operation instruction output unit 30 of the content of the option finally selected by the user. On receiving the notification, the operation instruction output unit 30 outputs an operation instruction corresponding to the content of the option selected by the user to the navigation device 2 or the like (step 126).

[0078] Next, a dialogue between the voice search device 1A and the user that follows the processing shown in FIG. 11 is described concretely. Together with this dialogue example, the way the necessary records are extracted from the data stored in the candidate set DB 26a is described with reference to the drawings as appropriate.

[0079] (Dialogue Example 6) Dialogue Example 6 shows a dialogue for searching for the nearest gas station. FIG. 13 shows the records extracted from the candidate set DB 26a in Dialogue Example 6.

U: Presses the dialogue start button 14.
S: "Please choose from meal place search, gas station search, facility search, parking lot search, audio operation, or Other." ... (23)
U: "Gas station search" ... (24)
S: "Gas station search. Now please choose a franchise name from A Oil, B Oil, C Oil, D Oil, E Oil, or Other." ... (25)
U: "B Oil" ... (26)
S: "B Oil. Now please choose from 2 km ahead on the right, 5 km on the left, or Other." ... (27)
U: "2 km ahead on the right" ... (28)
S: "2 km ahead on the right. The destination will be set to B Oil Iwaki store." ... (29)

As shown in FIG. 13, when the user presses the dialogue start button 14, a plurality of options are first extracted for "function", the candidate set title with priority 1, and the functions the user can select are announced as in voice (23).

[0080] When, in response to voice (23), the user selects "gas station search" as shown in voice (24), the records are narrowed down to only those corresponding to "gas station search". A plurality of options are then extracted for "franchise name", the candidate set title with the next highest priority (priority 2), and the franchise names the user can select are announced as in voice (25).

[0081] When the user selects the franchise name "B Oil" as shown in voice (26), the records are narrowed down to only those corresponding to "B Oil". A further plurality of options are then extracted for "position" (priority 6), the candidate set title that has the next highest priority and contains two or more kinds of options, and the position of each facility relative to the vehicle position (relative distance) is announced as in voice (27).

[0082] When the user selects the position "2 km ahead on the right" as shown in voice (28), the single gas station corresponding to the selected position, "B Oil Iwaki store", is identified. As shown in voice (29), this gas station is set as the destination for route search, and the series of processing ends.
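
The narrowing in Dialogue Example 6 can be replayed with a short script. The records below are hypothetical stand-ins for FIG. 13, the column set is reduced to three titles, and `run_dialogue` with its scripted answers is an illustrative assumption rather than the device's actual control flow.

```python
PRIORITY = ["function", "franchise", "position"]
# Hypothetical records; only "B Oil Iwaki store" is named in the dialogue.
RECORDS = [
    {"function": "gas station search", "franchise": "A Oil",
     "position": "2 km ahead on the left", "name": "A Oil Iwaki store"},
    {"function": "gas station search", "franchise": "B Oil",
     "position": "2 km ahead on the right", "name": "B Oil Iwaki store"},
    {"function": "gas station search", "franchise": "B Oil",
     "position": "5 km on the left", "name": "B Oil (hypothetical second store)"},
    {"function": "parking lot search", "franchise": None,
     "position": "1 km ahead on the right", "name": "hypothetical parking lot"},
]


def distinct(records, column):
    """Distinct non-empty values of a column, in table order."""
    values = []
    for record in records:
        if record[column] is not None and record[column] not in values:
            values.append(record[column])
    return values


def run_dialogue(records, priority, answers):
    """Replay scripted answers; return the prompts given and the final records."""
    answers = iter(answers)
    prompts = []
    for column in priority:
        options = distinct(records, column)
        if len(options) < 2:
            continue  # fewer than two kinds: nothing to ask at this level
        prompts.append(options)
        choice = next(answers)
        records = [r for r in records if r[column] == choice]
    return prompts, records


prompts, final = run_dialogue(
    RECORDS, PRIORITY, ["gas station search", "B Oil", "2 km ahead on the right"])
print(prompts[-1], [r["name"] for r in final])
```

With these records the last prompt offers the two positions of the B Oil stations, and the final answer leaves a single record, matching the flow of voices (27) to (29).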

[0083] When the re-request button 15 is pressed, the same dialogue as in Dialogue Example 2 of the first embodiment takes place, and the data used in that case is the same as that shown in FIG. 13.

(Dialogue Example 7) Dialogue Example 7 shows a dialogue in which, as in Dialogue Example 6, the nearest gas station is searched for and "Other" is selected from the options. FIG. 14 shows the records extracted from the candidate set DB 26a in Dialogue Example 7.

U: Presses the dialogue start button 14.
S: "Please choose from meal place search, gas station search, facility search, parking lot search, audio operation, or Other."
U: "Gas station search"
S: "Gas station search. Now please choose a franchise name from A Oil, B Oil, C Oil, D Oil, E Oil, or Other."
U: "Other" ... (30)
S: "Then please choose from F Oil, G Oil, or Other." ... (31)
U: "G Oil" ... (32)
S: "G Oil. The destination will be set to G Oil Iwaki store." ... (33)

As shown in voice (30) of Dialogue Example 7, when the user selects "Other" from the options, the records are narrowed down to those excluding the already presented A Oil, B Oil, C Oil, D Oil, and E Oil, and additional franchise names the user can select are announced as shown in voice (31).

[0084] When the user selects the franchise name "G Oil" as shown in voice (32), the records are narrowed down to only those corresponding to "G Oil". In this case, narrowing by franchise name already leaves a single record, so the one gas station to be announced, "G Oil Iwaki store", is identified. As shown in voice (33), this gas station is set as the destination for route search, and the series of processing ends.

[0085] (Dialogue Example 8) Dialogue Example 8 shows a dialogue in which, as in Dialogue Example 7, "Other" is selected from the options, but no further options are available to present. FIG. 15 shows the records extracted from the candidate set DB 26a in Dialogue Example 8.

U: Presses the dialogue start button 14.
S: "Please choose from meal place search, gas station search, facility search, parking lot search, audio operation, or Other."
U: "Gas station search"
S: "Gas station search. Now please choose a franchise name from A Oil, B Oil, C Oil, D Oil, E Oil, or Other."
U: "Other"
S: "Then please choose from F Oil, G Oil, or Other."
U: "Other" ... (34)
S: "We are sorry. There are no other candidates." ... (35)

As shown in voice (34), when the user selects "Other" from the options and no further options are available to present, the user is informed that there are no more selectable options, as shown in voice (35).

[0086] As described above, in the second embodiment, table information in a predetermined table format, in which each of a plurality of options is associated with a different priority, is stored in the candidate set DB 26a, and based on this table information the readings of the corresponding character strings are output as voice in descending order of option priority. The character strings to be recognized are presented to the user in advance, the character string corresponding to the voice that the user inputs in response is selected from among the character strings that were output as voice, and the option selected by the user is thereby identified, so the accuracy of voice recognition can be improved. In particular, because the data is stored in table format, records can easily be added or changed.
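
The benefit of restricting recognition to the presented character strings can be illustrated with a toy matcher. This is not the recognition method of the device: text similarity from Python's standard `difflib` module stands in for acoustic matching, and the function `recognize` is a hypothetical name.

```python
import difflib


def recognize(heard, presented_options):
    """Return the presented option closest to the heard text, or None if none were presented."""
    matches = difflib.get_close_matches(heard, presented_options, n=1, cutoff=0.0)
    return matches[0] if matches else None


# A slightly garbled input still resolves to one of the presented options.
print(recognize("B Oyl", ["A Oil", "B Oil", "C Oil"]))
```

Because the result is always drawn from the small presented set, an imperfect input can only be confused with the few options the user has just heard, not with the whole vocabulary.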

[0087] [Modifications] The present invention is not limited to the embodiments described above, and various modifications are possible within the scope of the gist of the invention. For example, in the embodiments described above, the user successively selects one of the presented options until a final option is selected, and an operation instruction corresponding to its content is issued to the navigation device 2 or the like. However, options may instead be selected automatically when the user so wishes.

[0088] FIG. 16 shows the configuration of a voice search device according to a modification in which options are selected automatically. The voice search device 1B shown in FIG. 16 differs from the voice search device 1 of the first embodiment in that a selection frequency learning unit 32 and a learning result storage unit 34 are added. The configuration and operation are described below, focusing mainly on the differences from the first embodiment.

[0089] The selection frequency learning unit 32 learns how frequently the user selects each of the options presented to the user. The learning result storage unit 34 stores the learning results of the selection frequency learning unit 32. In this embodiment, when making a voice input to select an option, the user can say "leave it to you" to have an option selected automatically. When "leave it to you" is input, the selection item determination unit 24 uses the learning results stored in the learning result storage unit 34 to automatically select an option with a high past selection frequency and determines the content of the operation instruction. In this case, the selection item determination unit 24 corresponds to the character string selection means.

[0090] The voice search device 1B has the configuration described above. Next, its operation when automatically selecting options according to the past selection frequency is described. FIG. 17 is a flowchart showing part of the operation procedure of the voice search device 1B in that case. The basic operation procedure of the voice search device 1B is the same as that of the voice search device 1 of the first embodiment shown in FIG. 4, differing in the processing from step 107 onward. FIG. 17 mainly shows the portions where the processing differs.

[0091] The selection item determination unit 24 determines, based on the voice recognition result output from the voice recognition processing unit 12, whether "Other" has been selected from the options (step 107). The processing when "Other" is selected is the same as in the voice search device 1 of the first embodiment shown in FIG. 4, and its description is omitted.

[0092] If "Other" has not been selected, a negative determination is made in step 107, and the selection item determination unit 24 then determines, based on the voice recognition result output from the voice recognition processing unit 12, whether "leave it to you" has been selected from the options (step 140).

[0093] If "leave it to you" has been selected from the options, an affirmative determination is made in step 140, and the selection item determination unit 24 reads the learning results stored in the learning result storage unit 34 and automatically selects an option based on the past selection frequency (step 141). For example, in this embodiment, all options down to the final option are selected automatically.

[0094] When the final option has been selected automatically according to the past selection frequency, or when a negative determination is made in step 108, the selection item determination unit 24 notifies the operation instruction output unit 30 of the content of the final option. On receiving the notification, the operation instruction output unit 30 outputs an operation instruction corresponding to the content of the item selected by the user to the navigation device 2 or the like (step 113).
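
The automatic selection of step 141 can be sketched with a frequency count. The format of the learning result (a list of past choices per column), the sample records, and the names `auto_select` and `leave_it` are assumptions for illustration; the source does not specify how the learning result is stored.

```python
from collections import Counter


def auto_select(options, history):
    """Pick the option chosen most often in the past; ties go to the earlier option."""
    counts = Counter(history)
    return max(options, key=lambda option: counts[option])


def leave_it(records, columns, history):
    """Select automatically at every remaining level, down to the final option."""
    for column in columns:
        options = []
        for record in records:
            if record[column] not in options:
                options.append(record[column])
        if len(options) < 2:
            continue  # nothing to choose at this level
        choice = auto_select(options, history.get(column, []))
        records = [r for r in records if r[column] == choice]
    return records


# Hypothetical gas station records and learning result.
STATIONS = [
    {"franchise": "A Oil", "position": "2 km ahead on the left", "name": "A Oil Iwaki store"},
    {"franchise": "B Oil", "position": "2 km ahead on the right", "name": "B Oil Iwaki store"},
    {"franchise": "B Oil", "position": "5 km on the left", "name": "B Oil (hypothetical second store)"},
]
HISTORY = {"franchise": ["B Oil", "A Oil", "B Oil"], "position": ["2 km ahead on the right"]}

result = leave_it(STATIONS, ["franchise", "position"], HISTORY)
print([r["name"] for r in result])
```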

[0095] Next, a dialogue between the voice search device 1B and the user that follows the processing shown in FIG. 17 is described concretely.

(Dialogue Example 9) Dialogue Example 9 shows a dialogue in which, as in Dialogue Example 1, the nearest gas station is searched for and "leave it to you" is selected as the option. The data used in Dialogue Example 9 is the same as that of FIG. 5 described above.

U: Presses the dialogue start button 14.
S: "Please choose from meal place search, gas station search, facility search, parking lot search, audio operation, or Other."
U: "Gas station search"
S: "Gas station search. Now please choose a franchise name from A Oil, B Oil, C Oil, D Oil, E Oil, or Other. Alternatively, say 'leave it to you'." ... (36)
U: "Leave it to you" ... (37)
S: "You are leaving it to me. The destination will be set to B Oil Iwaki store."

In this dialogue example, as shown in voice (36), the option "leave it to you" is newly offered to the user. When the user selects "leave it to you" as shown in voice (37), the option with the highest past selection frequency is selected automatically. In this example, the options from the franchise name onward are selected automatically according to the past selection frequency. Specifically, "B Oil" is selected automatically as the franchise name and "2 km ahead on the right" as the position, so that the option "B Oil Iwaki store" is finally selected.

[0096] In the example above, everything down to the final option is selected automatically, but only the option at the current point may be selected automatically instead. For example, if "leave it to you" is selected when choosing a franchise name, only the franchise name is selected automatically, after which the procedure moves to "position", the candidate set in the lower level, and presents the options in that candidate set. Also, although the example above adds the automatic selection processing to the first embodiment, the same function can be added to the second embodiment in the same way.

[0097] In the modification described above, when "leave it to you" is selected, the option is chosen according to the past selection frequency, but the option may instead be chosen at random regardless of the past selection frequency. In this case, the processing for learning the past selection frequency is unnecessary, so the configuration can be simplified.

[0098] When the past selection frequency of the options is learned, the learning results may be reflected in the data stored in the candidate set DB 26 (or 26a). For example, in the embodiments described above, when options for the franchise name are announced, "G Oil" is presented in the second guidance, which takes place when "Other" is selected from the options presented the first time. However, if the learning results show that "G Oil" is selected frequently, "G Oil" may be presented in the first guidance. Alternatively, the order of the options within a single guidance may be changed according to the past selection frequency. For example, if the first guidance initially announces the options in the order A Oil, B Oil, C Oil, ..., and the learning results show that the selection frequency is highest in the order G Oil, B Oil, A Oil, ..., the first guidance may be reordered to G Oil, B Oil, A Oil, .... In this case, the learning result storage unit 34 corresponds to the selection history storage means.
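
Reordering the guidance by past selection frequency can be sketched with a stable sort. The history list and the function `order_by_frequency` are assumptions for illustration; how the learning result is actually reflected in the candidate set DB is not specified in the source.

```python
from collections import Counter


def order_by_frequency(options, history):
    """Order options by descending past selection count."""
    counts = Counter(history)
    # sorted() is stable, so options with equal counts keep their initial order.
    return sorted(options, key=lambda option: -counts[option])


# Hypothetical initial order and selection history.
initial = [f"{letter} Oil" for letter in "ABCDEFG"]
history = ["G Oil"] * 3 + ["B Oil"] * 2 + ["A Oil"]

ordered = order_by_frequency(initial, history)
print(ordered[:5])  # the options that would make up the first guidance
```

With this history, "G Oil" moves from the second guidance into the first, and the options that were never selected keep their original relative order.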

[0099] In the embodiments described above, the user presses the re-request button 15 to hear the guidance voice again, but this re-request may instead be made by voice input. In this case, for example, the user inputs a voice such as "once more", and the immediately preceding guidance is output again in response.

[0100] The embodiments described above do not describe the operation for cancelling an option once selected and selecting a new one, but such processing is also possible. FIG. 18 is a flowchart showing part of the operation procedure of the voice search device when an option once selected is cancelled and a new option is selected. The description here assumes, as an example, that this processing is performed in the voice search device 1 described in the first embodiment. The basic operation procedure in this case is the same as the flowchart shown in FIG. 4, with new processing added after step 107. FIG. 18 mainly shows the newly added processing. In this modification, the selection item determination unit 24 corresponds to the reselection instruction means.

[0101] The selection item determination unit 24 determines, based on the voice recognition result output from the voice recognition processing unit 12, whether "Other" has been selected from the options (step 107). The processing when the user selects "Other" is the same as in FIG. 4 described above, and its description is omitted here.

[0102] If "Other" has not been selected, a negative determination is made in step 107, and the selection item determination unit 24 then determines, based on the voice recognition result output from the voice recognition processing unit 12, whether the voice "correction" has been input (step 150). Specifically, by expressing a negative response through the voice input "correction", the user causes the option once selected to be cancelled. Instead of "correction", the negative response may be expressed by a voice input such as "back" or "wrong".

[0103] If the voice "correction" has been input, an affirmative determination is made in step 150, and the selection item determination unit 24 instructs the option setting unit 16 to set again the candidate set one level above the level currently in focus. In response to this instruction, the option setting unit 16 sets the upper-level candidate set again (step 151). The procedure then returns to step 102 shown in FIG. 4, the character strings corresponding to the options in the upper-level candidate set are notified as recognition targets, these options are output as voice, and the subsequent processing is repeated.

[0104] If the voice "correction" has not been input, a negative determination is made in step 150. In this case the procedure advances to step 108, and the subsequent processing is performed. Next, a dialogue between the voice search device 1 and the user that follows the processing shown in FIG. 18 is described concretely.
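
The return to the upper-level candidate set in step 151 can be sketched with a stack of candidate sets. The class `DialogueState` and its methods are hypothetical, and the behaviour at the top level (staying on the first candidate set) is an assumption, since the source does not say what happens when "correction" is input there.

```python
class DialogueState:
    """Keeps the candidate sets presented so far, top level first."""

    def __init__(self, top_level):
        self.levels = [top_level]

    def select(self, next_level):
        """The user chose an option; descend to the next candidate set."""
        self.levels.append(next_level)

    def correct(self):
        """Step 151: cancel the last selection and return the upper-level candidate set."""
        if len(self.levels) > 1:  # assumption: stay at the top level if already there
            self.levels.pop()
        return self.levels[-1]


FUNCTIONS = ["meal place search", "gas station search", "facility search"]
FRANCHISES = ["A Oil", "B Oil", "C Oil", "D Oil", "E Oil"]
POSITIONS = ["2 km ahead on the right", "5 km on the left"]

state = DialogueState(FUNCTIONS)
state.select(FRANCHISES)   # the user chose a function
state.select(POSITIONS)    # a franchise was (mis)recognized
print(state.correct())     # "correction" brings the franchise names back
```

Saying "correction" repeatedly walks back one level at a time, so a user who finds the dialogue in the wrong branch can return to the function list.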

[0105] (Dialogue Example 10) Dialogue Example 10 shows a dialogue in which, as in Dialogue Example 1 and others, the nearest gas station is searched for and the voice "correction" is input.

U: Presses the dialogue start button 14.
S: "Please choose from meal place search, gas station search, facility search, parking lot search, audio operation, or Other."
U: "Gas station search"
S: "Gas station search. Now please choose a franchise name from A Oil, B Oil, C Oil, D Oil, E Oil, or Other."
U: "A Oil" ... (38)
S: "E Oil. Now please choose from 2 km ahead on the right, 2.5 km ahead on the left, 3 km ahead on the left, 5 km on the left, or Other." ... (39)
U: "Correction" ... (40)
S: "Then please choose a franchise name from A Oil, B Oil, C Oil, D Oil, E Oil, or Other." ... (41)
U: "A Oil"
S: "A Oil. Now please choose from 2 km ahead on the left, 5 km on the left, or Other."
U: "2 km ahead on the left"
S: "2 km ahead on the left. The destination will be set to A Oil Iwaki store."

In Dialogue Example 10, although the user selected "A Oil" from the options as shown in voice (38), a misrecognition occurred and "E Oil" was treated as selected, as shown in voice (39). In this case, when the user says "correction" as shown in voice (40), the franchise names the user can select are announced again, based on the franchise name candidate set in the upper level, as shown in voice (41).

【０１０６】（対話例１１）対話例１１は、「修正」と
いう音声が入力された場合の他の対話例を示している。Ｕ：対話開始ボタン１４を押下する。Ｓ：「食事場所検索、給油所検索、施設検索、駐車場検
索、オーディオ操作、その他、の中から選択してくださ
い。」Ｕ：「給油所検索」Ｓ：「給油所検索ですね、ではフランチャイズ名を、Ａ
石油、Ｂ石油、Ｃ石油、Ｄ石油、Ｅ石油、その他、の中
から選択してください」Ｕ：「施設検索」…（４２）Ｓ：「Ｃ石油ですね、では２ｋｍ先右側、２．５ｋｍ先
左側、３ｋｍ先右側、５ｋｍ左側、その他、の中から選
択してください」…（４３）Ｕ：「修正」…（４４）Ｓ：「ではフランチャイズ名を、Ａ石油、Ｂ石油、Ｃ石
油、Ｄ石油、Ｅ石油、その他、の中から選択してくださ
い」…（４５）Ｕ：「修正」…（４６）Ｓ：「食事場所検索、給油所検索、施設検索、駐車場検
索、オーディオ操作、その他、の中から選択してくださ
い。」…（４７）Ｕ：「施設検索」以下、施設検索を選択してからの対話は、上述した対話
例５と同様に行われるので、ここでは説明を省略する。(Dialogue Example 11) Dialogue Example 11 shows another dialogue in which the utterance "correction" is input.
U: Presses the dialogue start button 14.
S: "Please select from meal place search, gas station search, facility search, parking lot search, audio operation, and other."
U: "Gas station search"
S: "OK, gas station search. Then please select a franchise name from A Petroleum, B Petroleum, C Petroleum, D Petroleum, E Petroleum, and other."
U: "Facility search" ... (42)
S: "OK, C Petroleum. Then please select from 2 km ahead on the right, 2.5 km ahead on the left, 3 km ahead on the right, 5 km on the left, and other." ... (43)
U: "Correction" ... (44)
S: "Then please select a franchise name from A Petroleum, B Petroleum, C Petroleum, D Petroleum, E Petroleum, and other." ... (45)
U: "Correction" ... (46)
S: "Please select from meal place search, gas station search, facility search, parking lot search, audio operation, and other." ... (47)
U: "Facility search"
The dialogue after facility search is selected proceeds in the same way as in Dialogue Example 5 described above, so its description is omitted here.

【０１０７】上述した対話例１１では、利用者は一旦
「給油所検索」を選択したものの、「施設検索」を選択
したくなったため、音声（４２）に示すように、音声検
索装置１から提示されている選択肢とは異なる選択肢で
ある「施設検索」を音声入力している。この場合であっ
ても音声検索装置１は、音声（４３）に示すように、そ
の時点における選択肢の中から、入力された音声に最も
近いものを選択して処理を続行する。In Dialogue Example 11 above, the user first selected "gas station search" but then wanted to select "facility search" instead, and so, as shown in voice (42), said "facility search", which is not among the options presented by the voice search device 1. Even in this case, as shown in voice (43), the voice search device 1 selects the option closest to the input voice from among the options available at that time and continues processing.

【０１０８】音声（４４）に示すように、利用者が「修
正」と音声入力を行うことにより、音声（４５）に示す
ように、上位階層の候補セットであるフランチャイズ名
に基づいて、利用者が選択可能なフランチャイズ名が再
度案内される。音声（４６）に示すように、ここで利用
者が、さらに「修正」と音声入力を行うことにより、音
声（４７）に示すように、さらに上位階層の候補セット
である「機能」に基づいて、利用者が選択可能な選択肢
が再度案内される。When the user says "correction" as shown in voice (44), the selectable franchise names are presented again, as shown in voice (45), based on the franchise name, which is the candidate set of the upper layer. When the user says "correction" once more at this point, as shown in voice (46), the selectable options are presented again, as shown in voice (47), based on "function", which is the candidate set one layer higher still.
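The step-back behaviour described for the "correction" utterance can be sketched as a stack of selections, one per layer. This is only an illustrative sketch: the class name `DialogueState`, the fixed list of layer names, and the assumption that the recognizer returns the literal string "correction" are all hypothetical and are not taken from the specification.

```python
from typing import List, Optional

CORRECTION = "correction"  # assumed form of the recognized correction command


class DialogueState:
    """Tracks the current candidate-set layer and steps back on "correction"."""

    def __init__(self, layers: List[str]) -> None:
        self.layers = layers            # hypothetical fixed order of candidate sets
        self.selected: List[str] = []   # one selected option per completed layer

    @property
    def current_layer(self) -> Optional[str]:
        depth = len(self.selected)
        return self.layers[depth] if depth < len(self.layers) else None

    def handle(self, recognized: str) -> Optional[str]:
        """Apply one recognition result and return the layer to present next."""
        if recognized == CORRECTION:
            if self.selected:           # discard the most recent selection
                self.selected.pop()
        else:
            self.selected.append(recognized)
        return self.current_layer
```

Replaying Dialogue Example 11 with this sketch, two consecutive "correction" inputs return first to the franchise-name layer and then to the function layer. A real device would choose the next candidate set from the candidate set DB rather than from a fixed list.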

【０１０９】ところで、上述した対話例１１では、提示
される複数の選択肢の内容に沿わない音声入力が行われ
た場合であっても、その時点における複数の選択肢の中
からいずれかが選択されていたが、選択肢の内容に沿わ
ない音声入力が行われ、選択肢を特定することが難しい
場合には、有効な認識結果が得られなかった旨を通知す
るようにしてもよい。この場合には、上述した図４に示
すステップ１０６の処理において認識結果の有効性を判
断し、文字列の一致率が非常に低い（例えば、１０％以
下など）場合には、選択肢を特定できなかった旨を通知
すればよい。以下に、有効な認識結果が得られなかった
場合の対話を具体的に説明する。In Dialogue Example 11 above, one of the options available at that time was selected even though the voice input did not match any of the presented options. Alternatively, when the voice input does not match the options and it is difficult to identify an option, the user may be notified that no valid recognition result was obtained. In this case, the validity of the recognition result is judged in the processing of step 106 shown in FIG. 4 described above, and when the match rate of the character strings is very low (for example, 10% or less), the user is notified that the option could not be identified. A dialogue in which no valid recognition result is obtained is described concretely below.

【０１１０】（対話例１２）Ｕ：対話開始ボタン１４を押下する。Ｓ：「食事場所検索、給油所検索、施設検索、駐車場検
索、オーディオ操作、その他、の中から選択してくださ
い。」Ｕ：「給油所検索」Ｓ：「給油所検索ですね、ではフランチャイズ名を、Ａ
石油、Ｂ石油、Ｃ石油、Ｄ石油、Ｅ石油、その他、の中
から選択してください」Ｕ：「Ｈ石油」…（４８）Ｓ：「申し訳ございません。入力された単語を認識でき
ませんでした。フランチャイズ名を、Ａ石油、Ｂ石油、
Ｃ石油、Ｄ石油、Ｅ石油、その他、の中から選択してく
ださい」…（４９）上述した対話例１２では、音声（４８）に示すように、
選択肢として提示されていない「Ｈ石油」が利用者によ
って選択されたため、音声検索装置１は有効な認識結果
を得ることができない。したがって音声検索装置１は、
音声（４９）に示すように、選択肢を特定することがで
きなかった旨を利用者に対して通知するとともに、再度
の選択肢の入力を促す案内を行っている。(Dialogue Example 12)
U: Presses the dialogue start button 14.
S: "Please select from meal place search, gas station search, facility search, parking lot search, audio operation, and other."
U: "Gas station search"
S: "OK, gas station search. Then please select a franchise name from A Petroleum, B Petroleum, C Petroleum, D Petroleum, E Petroleum, and other."
U: "H Petroleum" ... (48)
S: "We are sorry. The word you entered could not be recognized. Please select a franchise name from A Petroleum, B Petroleum, C Petroleum, D Petroleum, E Petroleum, and other." ... (49)
In Dialogue Example 12 above, as shown in voice (48), the user selected "H Petroleum", which was not presented as an option, so the voice search device 1 cannot obtain a valid recognition result. Therefore, as shown in voice (49), the voice search device 1 notifies the user that the option could not be identified and prompts the user to input an option again.
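The validity check on the match rate can be illustrated with a small sketch. The specification does not say how the match rate is computed, so the character-level similarity from Python's `difflib.SequenceMatcher` is used here purely as a stand-in; the function names and the `min_rate` parameter are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


def best_match(recognized: str, options: List[str]) -> Tuple[str, float]:
    """Return the option most similar to the recognized string and its rate."""
    scored = [(opt, SequenceMatcher(None, recognized, opt).ratio()) for opt in options]
    return max(scored, key=lambda pair: pair[1])


def select_option(recognized: str, options: List[str],
                  min_rate: float = 0.10) -> Optional[str]:
    """Return the best option, or None when the rate is min_rate or less."""
    option, rate = best_match(recognized, options)
    return option if rate > min_rate else None
```

When `select_option` returns `None`, the device would output the apology and repeat the options, as in voice (49). Note that a similarity measured on spelling rates "H Petroleum" as very close to "A Petroleum", so a practical system would presumably compare readings or acoustic scores and tune the threshold accordingly.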

【０１１１】また、音声認識処理を行う際に、提示した
複数の選択肢に対応する文字列と利用者によって入力さ
れた音声に対する文字列との部分的な一致を考慮して、
認識精度を高めるようにしてもよい。例えば、選択肢と
して提示された「給油所検索」を選択する際に、利用者
によっては「給油所」という部分しか発声しないことも
考えられる。このような場合に、「給油所検索」という
文字列の全体だけを音声認識の対象とすると、利用者が
発声した「給油所」とは部分的にしか一致していないた
め一致率が低く、もちろん、他の選択肢（「食事場所検
索」等）とも一致率が低いため、利用者により選択され
た選択肢を正確に特定することが難しい場合がある。し
たがって、例えば給油所検索については、認識対象文字
列を「給油所検索」および「給油所」とし、食事場所検
索については「食事場所検索」、「食事場所」、「食
事」などにすることにより、複数の選択肢に対応する文
字列と利用者によって入力された音声に対する文字列と
の全体的な一致と部分的な一致の両者を判定することが
できるため、認識精度を高めることができる。In addition, recognition accuracy may be improved by taking into account, during voice recognition processing, partial matches between the character strings corresponding to the presented options and the character string corresponding to the voice input by the user. For example, when selecting "gas station search" presented as an option, some users may say only "gas station". In such a case, if only the whole character string "gas station search" is a recognition target, the match rate is low because the uttered "gas station" matches it only partially, and the match rate with the other options (such as "meal place search") is of course also low, so it can be difficult to identify the option the user selected. Therefore, for gas station search, for example, the recognition target character strings are set to "gas station search" and "gas station", and for meal place search they are set to "meal place search", "meal place", "meal", and so on. Both whole and partial matches between the option strings and the string for the user's voice input can then be judged, which improves recognition accuracy.

【０１１２】なお、このように部分的な一致を考慮する
場合においても、認識結果を返答する際には、文字列の
全体を出力することが好ましい。例えば、利用者により
「給油所」と入力された場合であっても、対応する認識
結果の返答としては、「給油所検索ですね」というよう
に、文字列の全体を返答することが好ましい。Even when partial matches are taken into account in this way, it is preferable to output the whole character string when replying with the recognition result. For example, even when the user inputs "gas station", the reply confirming the recognition result should preferably use the whole character string, as in "OK, gas station search".
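One way to picture the combination of partial recognition targets and a full-string reply is a table that maps each option to its recognition target strings. The sketch below assumes the recognizer has already returned one of the target strings verbatim; the table name `RECOGNITION_TARGETS`, the function names, and the wording of the confirmation are hypothetical.

```python
from typing import Dict, List, Optional

# Each option (the full string) with its recognition target strings.
RECOGNITION_TARGETS: Dict[str, List[str]] = {
    "gas station search": ["gas station search", "gas station"],
    "meal place search": ["meal place search", "meal place", "meal"],
}


def resolve(recognized: str) -> Optional[str]:
    """Map a recognized full or partial string back to the full option."""
    for option, targets in RECOGNITION_TARGETS.items():
        if recognized in targets:
            return option
    return None


def confirmation(recognized: str) -> Optional[str]:
    """Build the reply using the whole option string, not the partial input."""
    option = resolve(recognized)
    return None if option is None else f"OK, {option}."
```

With this table, the input "gas station" resolves to "gas station search", and the confirmation uses the full string.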

【０１１３】また上述した各実施形態では、複数の選択
肢を提示し、いずれか一を利用者に音声入力させていた
が、各選択肢に対して所定の符号を付加して提示し、所
望の選択肢に付加された符号を音声入力するようにして
もよい。具体的には、所定の符号としては「１、２、
３、…」等の数字や「Ａ、Ｂ、Ｃ、…」等の文字などが
考えられる。例えば、選択肢として複数の機能を提示す
る場合であれば、「１：食事場所検索、２：給油所検
索、３：施設検索、４：駐車場検索、５：オーディオ操
作、６：その他、の中から該当する数字を選択してくだ
さい」というような内容の案内音声を出力し、１〜６の
いずれかの数字を利用者に音声入力させればよい。この
ように、所定の符号を用いる場合には、利用者は所望の
選択肢に対応付けられた符号を発声するだけでよく、音
声入力をより簡単にすることができる。また、音声認識
処理の対象とする文字列を数字等の簡単な文字列にする
ことができるため、認識精度を向上させることができ
る。In each of the above-described embodiments, a plurality of options are presented and the user inputs one of them by voice. Alternatively, a predetermined code may be attached to each option when it is presented, and the user may input by voice the code attached to the desired option. Specifically, the predetermined code may be a number such as "1, 2, 3, ..." or a letter such as "A, B, C, ...". For example, when a plurality of functions are presented as options, guidance such as "Please select the corresponding number from 1: meal place search, 2: gas station search, 3: facility search, 4: parking lot search, 5: audio operation, 6: other" is output, and the user inputs one of the numbers 1 to 6 by voice. When predetermined codes are used in this way, the user only has to say the code associated with the desired option, which makes voice input simpler. In addition, because the character strings subject to voice recognition processing can be simple strings such as numbers, recognition accuracy can be improved.
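The coded presentation can be sketched as follows, assuming numeric codes and assuming the recognizer returns the spoken number as a digit string. The function names are hypothetical, and the guidance wording simply follows the example sentence in the text.

```python
from typing import Dict, List, Optional


def number_options(options: List[str]) -> Dict[str, str]:
    """Attach the codes "1", "2", ... to the options in presentation order."""
    return {str(i): opt for i, opt in enumerate(options, start=1)}


def guidance_sentence(numbered: Dict[str, str]) -> str:
    """Build the guidance text that reads out each code with its option."""
    listing = ", ".join(f"{code}: {opt}" for code, opt in numbered.items())
    return f"Please select the corresponding number from {listing}."


def select_by_code(recognized_code: str, numbered: Dict[str, str]) -> Optional[str]:
    """Return the option for a recognized code, or None for an unknown code."""
    return numbered.get(recognized_code)
```

Because only the short codes need to be recognized, the recognition vocabulary at each step shrinks to a handful of digits or letters.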

【０１１４】また、上述した各実施形態では、複数の選
択肢が全て提示された後に、利用者が一の選択肢を選択
して音声入力を行っていたが、全ての選択肢が提示され
るよりも先に利用者による音声入力が行われた場合に
は、その時点で音声認識処理を開始するようにしてもよ
い。利用者によっては、出力される音声案内を聞いてい
るとき、所望の選択肢が出力された直後に、音声入力を
開始する場合がある。このような場合には、選択肢が全
て提示された後でなくても、速やかに音声認識処理を開
始することにより、操作性をより向上させることができ
る。In each of the above-described embodiments, the user selects one option and inputs it by voice after all of the options have been presented. However, when the user makes a voice input before all of the options have been presented, voice recognition processing may be started at that point. Some users, while listening to the voice guidance, start their voice input immediately after the desired option has been output. In such a case, operability can be further improved by starting voice recognition processing promptly, without waiting until all of the options have been presented.

【０１１５】また、上述した各実施形態では、単体で用
いられる音声検索装置について説明していたが、ネット
ワークを介して接続されたサーバと端末装置とに機能を
分散配置して音声検索装置を構成してもよい。図１９
は、ネットワークを介して接続されたサーバと端末装置
とに機能を分散配置した場合の音声検索装置の構成例を
示す図である。図１９に示す音声検索装置は、所定のネ
ットワーク６を介して接続された音声検索端末装置４と
サーバ５から構成されている。In each of the above-described embodiments, a voice search device used as a stand-alone unit has been described, but the voice search device may also be configured by distributing its functions between a server and a terminal device connected via a network. FIG. 19 is a diagram showing a configuration example of a voice search device whose functions are distributed between a server and a terminal device connected via a network. The voice search device shown in FIG. 19 consists of a voice search terminal device 4 and a server 5 connected via a predetermined network 6.

【０１１６】音声検索端末装置４は、基本的には上述し
た第２の実施形態における音声検索装置１Ａと同様の構
成を有しており、通信処理部３６が追加された点が異な
っている。なお、音声検索端末装置４の構成は、上述し
た第１の実施形態における音声検索装置１と同様にして
もよい。The voice search terminal device 4 has basically the same configuration as the voice search device 1A in the second embodiment described above, except that a communication processing unit 36 is added. The configuration of the voice search terminal device 4 may be the same as the voice search device 1 in the first embodiment described above.

【０１１７】音声検索端末装置４に備わった通信処理部
３６は、候補セットＤＢ２６ａに格納されるデータを更
新するために必要な情報をネットワーク６を介してサー
バ５から取得するための通信処理を行う。ＤＢ更新部２
８は、通信処理部３６によって受信された情報に基づい
て、候補セットＤＢ２６ａに格納されているデータを更
新する。この更新処理は、音声検索端末装置４による所
定の処理に先だって行われる。The communication processing unit 36 provided in the voice search terminal device 4 performs communication processing for acquiring, from the server 5 via the network 6, the information needed to update the data stored in the candidate set DB 26a. The DB update unit 28 updates the data stored in the candidate set DB 26a based on the information received by the communication processing unit 36. This update processing is performed before the predetermined processing by the voice search terminal device 4.

【０１１８】また、サーバ５は、サーバ制御部５０、候
補セットＤＢ５２、通信処理部５４を含んで構成されて
いる。サーバ制御部５０は、サーバ５の全体動作を制御
する。候補セットＤＢ５２は、上述した音声検索端末装
置４に備わっている候補セットＤＢ２６ａと基本的に同
じ内容データを格納している。この候補セットＤＢ５２
に格納されるデータは、随時、新しい内容に更新されて
いる。The server 5 includes a server control unit 50, a candidate set DB 52, and a communication processing unit 54. The server control unit 50 controls the overall operation of the server 5. The candidate set DB 52 stores basically the same content data as the candidate set DB 26a provided in the voice search terminal device 4 described above. The data stored in the candidate set DB 52 is updated with new content as needed.

【０１１９】音声検索端末装置４から所定の要求がなさ
れた場合に、サーバ制御部５０は、以前に音声検索端末
装置４に送信済みの内容に対する変更内容を含んだ所定
の差分情報を候補セットＤＢ５２から抽出し、この差分
情報を音声検索端末装置４に対して送信する。通信処理
部５４は、サーバ５が音声検索端末装置４との間でデー
タの送受を行うために必要な通信処理を行う。When a predetermined request is made by the voice search terminal device 4, the server control unit 50 extracts from the candidate set DB 52 predetermined difference information containing the changes relative to the content already transmitted to the voice search terminal device 4, and transmits this difference information to the voice search terminal device 4. The communication processing unit 54 performs the communication processing needed for the server 5 to exchange data with the voice search terminal device 4.

【０１２０】このように、サーバ５から送信される所定
の差分情報に基づいて、音声検索端末装置４に備わった
候補セットＤＢ２６ａの内容を更新することができるの
で、音声検索端末装置４は、内容の更新された新しい情
報を各種処理に反映させることができる。特に、サーバ
５から音声検索端末装置４に送られてくる情報は、前回
までに送られてきた情報に対する変更内容を含む差分情
報であるため、送受するデータ量を低減し、通信コスト
を削減することができる。In this way, the content of the candidate set DB 26a provided in the voice search terminal device 4 can be updated based on the predetermined difference information transmitted from the server 5, so the voice search terminal device 4 can reflect the updated information in its various processes. In particular, because the information sent from the server 5 to the voice search terminal device 4 is difference information containing the changes relative to the information sent up to the previous time, the amount of data exchanged is reduced and communication costs can be cut.
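The difference-based update of the candidate set DB 26a can be sketched as below. The specification does not define the format of the difference information, so this sketch assumes a simple version counter and a change log on the server side; the classes `Server` and `Terminal`, the record layout, and the use of `None` to mark a deleted record are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, str]


class Server:
    """Holds the master candidate set and a log of every change."""

    def __init__(self) -> None:
        self.version = 0
        self.db: Dict[int, Record] = {}
        self.log: List[Tuple[int, int, Optional[Record]]] = []

    def put(self, record_id: int, record: Record) -> None:
        self.version += 1
        self.db[record_id] = record
        self.log.append((self.version, record_id, record))

    def delete(self, record_id: int) -> None:
        self.version += 1
        self.db.pop(record_id, None)
        self.log.append((self.version, record_id, None))  # None marks a deletion

    def diff_since(self, client_version: int) -> Tuple[int, Dict[int, Optional[Record]]]:
        """Return only the changes made after the version the terminal holds."""
        changes: Dict[int, Optional[Record]] = {}
        for version, record_id, record in self.log:
            if version > client_version:
                changes[record_id] = record
        return self.version, changes


class Terminal:
    """Holds a local copy of the candidate set and applies received differences."""

    def __init__(self) -> None:
        self.version = 0
        self.db: Dict[int, Record] = {}

    def update_from(self, server: Server) -> int:
        new_version, changes = server.diff_since(self.version)
        for record_id, record in changes.items():
            if record is None:
                self.db.pop(record_id, None)
            else:
                self.db[record_id] = dict(record)  # store a copy
        self.version = new_version
        return len(changes)  # number of records transferred
```

Only records changed since the terminal's last update cross the network, which is the data-volume saving the paragraph above describes.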

【０１２１】なお、上述したサーバ５が、検索対象項目
とそれぞれに対応する検索キーに関する情報を格納する
機能を有している。音声検索端末装置４が、認識対象文
字列出力手段、マイクロホン、音声認識処理手段、項目
抽出手段に対応する機能を有しており、これらを用いた
各種の処理に先だって、上述したサーバ５から必要な情
報を取得している。The server 5 described above has the function of storing the search target items and the information on the search keys corresponding to each of them. The voice search terminal device 4 has the functions corresponding to the recognition target character string output means, the microphone, the voice recognition processing means, and the item extraction means, and acquires the necessary information from the server 5 before performing the various processes that use them.

【０１２２】図２０は、ネットワークを介して接続され
たサーバと端末装置とに機能を分散配置した場合の音声
検索装置の他の構成例を示す図である。図２０に示す音
声検索装置は、所定のネットワーク６を介して接続され
た音声検索端末装置４Ａとサーバ５Ａから構成されてい
る。FIG. 20 is a diagram showing another example of the configuration of the voice search device in the case where functions are distributed to a server and a terminal device connected via a network. The voice search device shown in FIG. 20 includes a voice search terminal device 4A and a server 5A connected via a predetermined network 6.

【０１２３】図２０に示す音声検索装置では、上述した
第１の実施形態の音声検索装置１に備わっていた候補セ
ットＤＢ２６（あるいは第２の実施形態の音声検索装置
１Ａに備わっていた候補セットＤＢ２６ａ）、選択肢設
定部１６、選択項目判定部２４のそれぞれによって実現
される機能に対応する構成がサーバ５Ａに配置されてい
る。具体的には、サーバ５Ａは、サーバ制御部５０、候
補セットＤＢ５２、通信処理部５４、選択肢設定部５
６、選択項目判定部５８を備えている。In the voice search device shown in FIG. 20, the components corresponding to the functions realized by the candidate set DB 26 of the voice search device 1 of the first embodiment (or the candidate set DB 26a of the voice search device 1A of the second embodiment), the option setting unit 16, and the selection item determination unit 24 are placed in the server 5A. Specifically, the server 5A includes a server control unit 50, a candidate set DB 52, a communication processing unit 54, an option setting unit 56, and a selection item determination unit 58.

【０１２４】また、音声検索端末装置４Ａは、上述した
音声検索端末装置４から、選択肢設定部１６、選択項目
判定部２４、候補セットＤＢ２６ａ、ＤＢ更新部２８が
省略されており、制御部３８が追加されている。利用者
の発声する音声に対応してナビゲーション装置２等に対
して動作指示を出力する際に、音声検索端末装置４Ａ内
の制御部３８は、選択肢を提示するために必要な最小限
のデータを通信処理部３６を介してサーバ５Ａから取得
する。案内文生成部１８は、制御部３８からの指示にし
たがって、所定の案内文を生成し、出力する。サーバ５
Ａは、利用者の音声に対する音声認識結果を音声検索端
末装置４Ａから取得し、次の候補セットを設定し、選択
肢の提示に必要なデータを音声検索端末装置４Ａに送信
する処理や、最終的に選択された一の選択肢を抽出する
処理などを行っている。The voice search terminal device 4A differs from the voice search terminal device 4 described above in that the option setting unit 16, the selection item determination unit 24, the candidate set DB 26a, and the DB update unit 28 are omitted and a control unit 38 is added. When outputting an operation instruction to the navigation device 2 or the like in response to the user's voice, the control unit 38 in the voice search terminal device 4A acquires from the server 5A, via the communication processing unit 36, the minimum data needed to present the options. The guidance sentence generation unit 18 generates and outputs a predetermined guidance sentence according to instructions from the control unit 38. The server 5A acquires the voice recognition result for the user's voice from the voice search terminal device 4A, sets the next candidate set, and transmits the data needed to present the options to the voice search terminal device 4A; it also extracts the one option that is finally selected.
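The division of work between the voice search terminal device 4A and the server 5A can be sketched as a small exchange in which the terminal sends each recognition result and the server answers with either the next options or the finally selected item. Everything in the sketch is an assumption made for illustration: the nested dictionary standing in for the candidate set DB 52, the class name `ServerSession`, the reply format, and the store names other than the A Petroleum Iwaki store mentioned in Dialogue Example 10. The network transport is omitted.

```python
from typing import Any, Dict, List

# Hypothetical candidate-set tree; leaves are the finally selected items.
CANDIDATE_TREE: Dict[str, Any] = {
    "gas station search": {
        "A Petroleum": {
            "2 km ahead on the left": "A Petroleum Iwaki store",
            "5 km on the left": "A Petroleum (hypothetical second store)",
        },
        "B Petroleum": {
            "3 km ahead on the right": "B Petroleum (hypothetical store)",
        },
    },
}


class ServerSession:
    """Server-side state for one dialogue: option setting and item selection."""

    def __init__(self, tree: Dict[str, Any]) -> None:
        self.node = tree

    def options(self) -> List[str]:
        """Minimum data the terminal needs in order to present the options."""
        return list(self.node)

    def submit(self, recognized: str) -> Dict[str, Any]:
        """Accept a recognition result from the terminal and reply.

        Assumes the result is one of the current options.
        """
        child = self.node[recognized]
        if isinstance(child, dict):
            self.node = child                 # set the next candidate set
            return {"options": self.options()}
        return {"selected": child}            # final item has been identified
```

On the terminal side, the control unit would pass each `options` list to the guidance sentence generation unit and stop when a `selected` reply arrives.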

【０１２５】なお、上述したサーバ５Ａが、検索対象項
目とそれぞれに対応する検索キーに関する情報を格納す
るとともに、認識対象文字列出力手段による音声出力の
対象となる文字列の抽出処理と、項目抽出手段による検
索対象項目の抽出処理を行う機能を有している。また、
音声検索端末装置４Ａが、認識対象文字列出力手段、マ
イクロホン、音声認識処理手段に対応する機能を有して
おり、これらの処理に必要な情報を上述したサーバ５Ａ
から取得している。The server 5A described above stores the search target items and the information on the search keys corresponding to each of them, and has the functions of extracting the character strings to be output by voice by the recognition target character string output means and of performing the extraction of the search target item by the item extraction means. The voice search terminal device 4A has the functions corresponding to the recognition target character string output means, the microphone, and the voice recognition processing means, and acquires the information needed for these processes from the server 5A.

【０１２６】図２１は、ネットワークを介して接続され
たサーバと端末装置とに機能を分散配置した場合の音声
検索装置の他の構成例を示す図である。図２１に示す音
声検索装置は、所定のネットワーク６を介して接続され
た音声検索端末装置４Ｂとサーバ５Ｂから構成されてい
る。FIG. 21 is a diagram showing another example of the configuration of the voice search device in the case where functions are distributed to a server and a terminal device connected via a network. The voice search device shown in FIG. 21 includes a voice search terminal device 4B and a server 5B connected via a predetermined network 6.

【０１２７】図２１に示す音声検索装置では、上述した
図２０に示した音声検索装置において、さらに案内文生
成部１８の機能をサーバ側に配置した点が異なってい
る。具体的には、サーバ５Ｂは、サーバ制御部５０、候
補セットＤＢ５２、通信処理部５４、選択肢設定部５
６、選択項目判定部５８、案内文生成部６０を備えてい
る。また音声検索端末装置４Ｂは、音声検索端末装置４
Ａから案内文生成部１８が削除された点が異なってい
る。図２１に示す音声検索装置では、案内文の生成がサ
ーバ５Ｂで行われるため、音声検索端末装置４Ｂ内の制
御部３８は、サーバ５Ｂによって生成された案内文を受
け取り、これを音声合成部２０に出力する。それ以外の
動作内容は、図２０に示す音声検索装置と同様である。The voice search device shown in FIG. 21 differs from the voice search device shown in FIG. 20 in that the function of the guidance sentence generation unit 18 is also placed on the server side. Specifically, the server 5B includes a server control unit 50, a candidate set DB 52, a communication processing unit 54, an option setting unit 56, a selection item determination unit 58, and a guidance sentence generation unit 60. The voice search terminal device 4B differs from the voice search terminal device 4A in that the guidance sentence generation unit 18 is removed. In the voice search device shown in FIG. 21, guidance sentences are generated by the server 5B, so the control unit 38 in the voice search terminal device 4B receives the guidance sentence generated by the server 5B and outputs it to the voice synthesis unit 20. The rest of the operation is the same as in the voice search device shown in FIG. 20.

【０１２８】図２２は、ネットワークを介して接続され
たサーバと端末装置とに機能を分散配置した場合の音声
検索装置の他の構成例を示す図である。図２２に示す音
声検索装置は、所定のネットワーク６を介して接続され
た音声検索端末装置４Ｃとサーバ５Ｃから構成されてい
る。図２２に示す音声検索装置では、上述した図２１に
示した音声検索装置において、さらに音声認識処理部１
２と音声合成部２０の機能をサーバ側に配置した点が異
なっている。具体的には、サーバ５Ｃは、サーバ制御部
５０、候補セットＤＢ５２、通信処理部５４、選択肢設
定部５６、選択項目判定部５８、案内文生成部６０、音
声認識処理部６２、音声合成部６４を備えている。また
音声検索端末装置４Ｃは、音声検索端末装置４Ｂから音
声認識処理部１２と音声合成部２０が削除された点が異
なっている。FIG. 22 is a diagram showing another configuration example of the voice search device in which functions are distributed between a server and a terminal device connected via a network. The voice search device shown in FIG. 22 consists of a voice search terminal device 4C and a server 5C connected via a predetermined network 6. It differs from the voice search device shown in FIG. 21 in that the functions of the voice recognition processing unit 12 and the voice synthesis unit 20 are also placed on the server side. Specifically, the server 5C includes a server control unit 50, a candidate set DB 52, a communication processing unit 54, an option setting unit 56, a selection item determination unit 58, a guidance sentence generation unit 60, a voice recognition processing unit 62, and a voice synthesis unit 64. The voice search terminal device 4C differs from the voice search terminal device 4B in that the voice recognition processing unit 12 and the voice synthesis unit 20 are removed.

【０１２９】図２２に示す音声検索装置では、マイクロ
ホン１０によって集音された利用者の音声が制御部３８
によってデジタルの音声データに変換されてサーバ５Ｃ
に送信される。そして、送信された音声データに基づい
て、サーバ５Ｃ内の音声認識処理部６２により所定の音
声認識処理が行われる。また、案内文生成部６０によっ
て生成された案内文に対応して、音声合成部６４により
所定の音声合成処理が行われ、案内文に対応した音声デ
ータが生成される。生成された音声データは、音声検索
端末装置４Ｃに送信され、音声検索端末装置４Ｃ内の制
御部３８によってアナログ信号に変換されてスピーカ２
２に出力される。In the voice search device shown in FIG. 22, the user's voice collected by the microphone 10 is converted into digital voice data by the control unit 38 and transmitted to the server 5C. The voice recognition processing unit 62 in the server 5C then performs predetermined voice recognition processing based on the transmitted voice data. In addition, the voice synthesis unit 64 performs predetermined voice synthesis processing on the guidance sentence generated by the guidance sentence generation unit 60 to generate voice data corresponding to the guidance sentence. The generated voice data is transmitted to the voice search terminal device 4C, converted into an analog signal by the control unit 38 in the voice search terminal device 4C, and output to the speaker 22.

【０１３０】図２０〜図２２に示す変形例の音声検索装
置では、多くの機能をサーバ側に配置しているので、音
声検索端末装置側の処理負担が軽減し、構成の簡略化が
可能となるため、音声検索端末装置のコストダウンを図
ることができる利点がある。また、上述した各実施形態
や変形例では、本発明の音声検索装置を車載用システム
に適用した場合について種々の形態を説明してきたが、
本発明の適用範囲は車載用システムに限定されるもので
はなく、他の種々のシステムに適用することができる。In the voice search device of the modified example shown in FIGS. 20 to 22, since many functions are arranged on the server side, the processing load on the voice search terminal device side is reduced, and the configuration can be simplified. Therefore, there is an advantage that the cost of the voice search terminal device can be reduced. Further, in each of the above-described embodiments and modifications, various embodiments have been described in the case where the voice search device of the present invention is applied to an in-vehicle system.
The application range of the present invention is not limited to a vehicle-mounted system, and can be applied to various other systems.

【０１３１】[0131]

【発明の効果】上述したように、本発明によれば、検索
キーとなりうる複数の文字列の読みを音声出力すること
により、認識対象となる文字列をあらかじめ利用者に提
示しており、これらの文字列のみを音声認識の対象とし
ているため、音声認識の精度を向上させることができ
る。As described above, according to the present invention, the character strings to be recognized are presented to the user in advance by outputting by voice the readings of a plurality of character strings that can serve as search keys, and only these character strings are subject to voice recognition, so the accuracy of voice recognition can be improved.

[Brief description of the drawings]

【図１】第１の実施形態の音声検索装置を含んで構成さ
れる車載用システムの構成を示す図である。FIG. 1 is a diagram illustrating a configuration of a vehicle-mounted system including a voice search device according to a first embodiment;

【図２】候補セットＤＢに格納されるデータの構造を示
す図である。FIG. 2 is a diagram showing a structure of data stored in a candidate set DB.

【図３】図２に示したデータ構造における上位階層の候
補セットと下位階層の候補セットとの対応関係を示す図
である。FIG. 3 is a diagram showing a correspondence relationship between an upper-layer candidate set and a lower-layer candidate set in the data structure shown in FIG. 2;

【図４】第１の実施形態の音声検索装置の動作手順を示
す流れ図である。FIG. 4 is a flowchart showing an operation procedure of the voice search device of the first embodiment.

【図５】対話例１において用いられるデータの内容を示
す図である。FIG. 5 is a diagram showing the contents of data used in Dialogue Example 1;

【図６】対話例３において用いられるデータの内容を示
す図である。FIG. 6 is a diagram showing the contents of data used in Dialogue Example 3;

【図７】対話例４において用いられるデータの内容を示
す図である。FIG. 7 is a diagram showing the contents of data used in Dialogue Example 4;

【図８】対話例５において用いられるデータの内容を示
す図である。FIG. 8 is a diagram showing the contents of data used in Dialogue Example 5;

【図９】第２の実施形態の音声検索装置を含んで構成さ
れる車載用システムの構成を示す図である。FIG. 9 is a diagram illustrating a configuration of a vehicle-mounted system including a voice search device according to a second embodiment.

【図１０】第２の実施形態の候補セットＤＢに格納され
るデータの構造を示す図である。FIG. 10 is a diagram illustrating a structure of data stored in a candidate set DB according to the second embodiment.

【図１１】第２の実施形態の音声検索装置の部分的な動
作手順を示す流れ図である。FIG. 11 is a flowchart showing a partial operation procedure of the voice search device of the second embodiment.

【図１２】ステップ１２２に示す処理の詳細な手順を示
す流れ図である。FIG. 12 is a flowchart showing a detailed procedure of a process shown in step 122;

【図１３】対話例６において候補セットＤＢから抽出さ
れるレコードの内容を示す図である。FIG. 13 is a diagram showing the contents of a record extracted from a candidate set DB in Dialog Example 6;

【図１４】対話例７において候補セットＤＢから抽出さ
れるレコードの内容を示す図である。FIG. 14 is a diagram illustrating the contents of a record extracted from a candidate set DB in Dialog Example 7;

【図１５】対話例８において候補セットＤＢから抽出さ
れるレコードの内容を示す図である。FIG. 15 is a diagram showing the contents of a record extracted from a candidate set DB in Dialogue Example 8;

【図１６】選択肢が自動的に選択される変形例における
音声検索装置の構成を示す図である。FIG. 16 is a diagram showing a configuration of a voice search device in a modification in which an option is automatically selected.

【図１７】選択頻度に応じて選択肢を自動的に選択する
場合の音声検索装置の部分的な動作手順を示す流れ図で
ある。FIG. 17 is a flowchart showing a partial operation procedure of the voice search device when an option is automatically selected according to a selection frequency.

【図１８】一旦選択された選択肢を取り消して、新たに
選択肢を選択する場合の音声検索装置の動作手順を部分
的に示す流れ図である。FIG. 18 is a flowchart partially showing an operation procedure of the voice search device in a case where a once selected option is canceled and a new option is selected.

【図１９】ネットワークを介して接続されたサーバと端
末装置とに機能を分散配置した場合の音声検索装置の構
成例を示す図である。FIG. 19 is a diagram illustrating a configuration example of a voice search device when functions are distributed to a server and a terminal device connected via a network.

【図２０】ネットワークを介して接続されたサーバと端
末装置とに機能を分散配置した場合の音声検索装置の他
の構成例を示す図である。FIG. 20 is a diagram illustrating another configuration example of the voice search device when functions are distributed to a server and a terminal device connected via a network.

【図２１】ネットワークを介して接続されたサーバと端
末装置とに機能を分散配置した場合の音声検索装置の他
の構成例を示す図である。FIG. 21 is a diagram illustrating another configuration example of the voice search device when functions are distributed to a server and a terminal device connected via a network.

【図２２】ネットワークを介して接続されたサーバと端
末装置とに機能を分散配置した場合の音声検索装置の他
の構成例を示す図である。FIG. 22 is a diagram illustrating another configuration example of the voice search device when functions are distributed to a server and a terminal device connected via a network.

[Explanation of symbols]

１、１Ａ、１Ｂ音声検索装置２ナビゲーション装置３オーディオ装置４、４Ａ、４Ｂ、４Ｃ音声検索端末装置５、５Ａ、５Ｂ、５Ｃサーバ６ネットワーク１０マイクロホン１２、６２音声認識処理部１４対話開始ボタン１５再要求ボタン１６、５６選択肢設定部１８、６０案内文生成部２０、６４音声出力部２２スピーカ２４、５８選択項目判定部２６、２６ａ、５２候補セットＤＢ（データベース）２８ＤＢ更新部３０動作指示出力部３２選択頻度学習部３４学習結果格納部
1, 1A, 1B Voice search device
2 Navigation device
3 Audio device
4, 4A, 4B, 4C Voice search terminal device
5, 5A, 5B, 5C Server
6 Network
10 Microphone
12, 62 Voice recognition processing unit
14 Dialogue start button
15 Re-request button
16, 56 Option setting unit
18, 60 Guidance sentence generation unit
20, 64 Voice output unit
22 Speaker
24, 58 Selection item determination unit
26, 26a, 52 Candidate set DB (database)
28 DB update unit
30 Operation instruction output unit
32 Selection frequency learning unit
34 Learning result storage unit

フロントページの続き (51)Int.Cl.⁷ 識別記号ＦＩテーマコート゛(参考）Ｇ０６Ｆ 17/30 Ｇ０６Ｆ 17/30 ３１０Ｂ３４０３４０ＢＧ１０Ｌ 15/06 Ｇ１０Ｌ 3/00 ５２１Ｗ 15/18 ５３７Ｊ 15/00 ５５１Ｑ 15/28 (72)発明者斉藤望東京都品川区西五反田１丁目１番８号アルパイン株式会社内Ｆターム(参考） 5B075 NK02 NK43 PP07 PP13 PQ04 PQ46 PQ75 PR03 5D015 AA04 BB01 DD02 HH00 KK01
Continued from the front page (51) Int.Cl.⁷ Identification symbol FI Theme code (reference) G06F 17/30 G06F 17/30 310B 340 340B G10L 15/06 G10L 3/00 521W 15/18 537J 15/00 551Q 15/28 (72) Inventor Nozomi Saito, c/o Alpine Electronics Inc, 1-1-8 Nishi-Gotanda, Shinagawa-ku, Tokyo F-term (reference) 5B075 NK02 NK43 PP07 PP13 PQ04 PQ46 PQ75 PR03 5D015 AA04 BB01 DD02 HH00 KK01

Claims

[Claims]

1. A voice search device in which a search key is associated with each of a plurality of search target items and the corresponding search target item is selected by comparing the content of voice input by a user with the search key, the voice search device comprising: recognition target character string output means in which a maximum number of character strings that can serve as the search key is set, and which outputs by voice the readings of a plurality of the character strings within a range not exceeding that number; a microphone that collects the user's voice; voice recognition processing means that performs voice recognition processing on the voice collected by the microphone and selects the character string corresponding to this voice from among the character strings output by voice by the recognition target character string output means; and item extraction means that extracts the search target item corresponding to the search key specified by the character string selected by the voice recognition processing means.

2. The voice search device according to claim 1, further comprising a switch that is operated before the user speaks, wherein the voice recognition processing by the voice recognition processing means is started when the switch is operated.

3. The voice search device according to claim 1, further comprising selected character string confirmation means that outputs by voice the character string selected by the voice recognition processing means.

4. The voice search device according to claim 3, further comprising reselection instructing means that, when the user indicates a negative view of the result of the character string selection by the voice recognition processing means, instructs the recognition target character string output means to output again by voice the readings of the plurality of character strings used to obtain that selection result.

5. The voice search device according to claim 1, wherein, when the total number of the character strings to be recognized exceeds the maximum number, the recognition target character string output means outputs by voice, in a plurality of separate outputs, the readings of character strings whose number in each output does not exceed the maximum number, and the voice recognition processing means performs the selection determination of the character string for each voice output.

6. The voice search device according to claim 5, further comprising a voice output instructing unit that, when the user instructs voice output of other selection candidates, instructs the recognition target character string output unit to perform the second or a subsequent voice output.

7. The voice search device according to any one of claims 1 to 6, further comprising a re-output instructing unit that, when the user instructs that the voice be output again, instructs the recognition target character string output unit to read aloud again the plurality of character strings that were output immediately before.

8. The voice search device according to claim 1, further comprising a character string selecting unit that, when the user instructs a character string selecting operation, selects the character string without using the result of the voice recognition processing by the voice recognition processing unit, wherein, when the character string has been selected by the character string selecting unit, the item extraction unit extracts the search target item using the character string selected by the character string selecting unit instead of the character string selected by the voice recognition processing unit.

9. The voice search device according to claim 1, wherein a plurality of the search keys are associated with each of the search target items, and, when the item extraction unit cannot narrow the search target items down to one using one of the search keys, the processing by the recognition target character string output unit, the voice recognition processing unit, and the item extraction unit is repeated using another of the search keys until the search target items are narrowed down to one.

10. The voice search device, wherein each of the plurality of search keys is associated with a different priority, the device further comprising a table storage unit that stores table information associating each of the plurality of search target items with the plurality of character strings corresponding to the plurality of search keys, wherein the recognition target character string output unit outputs as voice, based on the table information, the readings of the corresponding character strings in order from the search key with the highest priority.

11. The voice search device according to claim 9, further comprising a tree structure storage unit that stores tree structure information of a plurality of hierarchies indicating, when the character string corresponding to one of the search keys is selected, the search key to be selected next and the character strings corresponding to that search key, wherein the recognition target character string output unit extracts, based on the tree structure information, the plurality of character strings corresponding to the search key to be output next, and outputs the readings of these character strings as voice.

12. The voice search device according to claim 1, further comprising a selection history storage unit that stores history information on past selections by the voice recognition processing unit, wherein the recognition target character string output unit determines, based on the selection history information, the character strings with a high selection frequency and preferentially outputs the readings of those character strings as voice.

13. The voice search device according to claim 1, wherein each of the plurality of character strings consists of one sound of the Japanese syllabary of fifty sounds, and the search key that matches the one sound selected by the voice recognition processing unit is extracted.

14. The voice search device according to claim 1, wherein the voice recognition processing unit selects the character string by comparing all the characters constituting the character string with the entire voice recognition processing result.

15. The voice search device according to claim 1, wherein the voice recognition processing unit selects the character string by comparing the characters constituting a part of the character string with the entire voice recognition processing result.

16. The voice search device according to claim 14, wherein the voice recognition processing unit starts the character string selecting operation when the voice of the user is collected by the microphone before the voice output by the recognition target character string output unit ends.

17. The voice search device according to claim 1, wherein the maximum number is set in a range of 7 ± 2.

18. The voice search device according to claim 1, wherein functions are distributed between a server and a terminal device connected via a network, the server has a function of storing information on the search target items and the corresponding search keys, and the terminal device has functions corresponding to the recognition target character string output unit, the microphone, the voice recognition processing unit, and the item extraction unit, and acquires from the server the information needed for the various processes using these units before performing those processes.

19. The voice search device according to claim 18, wherein the information sent from the server to the terminal device is difference information containing the changes relative to the information sent up to the previous time.

20. The voice search device according to claim 1, wherein functions are distributed between a server and a terminal device connected via a network, the server has a function of storing information on the search target items and the corresponding search keys and of performing the process of extracting the character strings to be output as voice by the recognition target character string output unit and the process of extracting the search target item by the item extraction unit, and the terminal device has functions corresponding to the recognition target character string output unit, the microphone, and the voice recognition processing unit, and acquires the information needed for these processes from the server.
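
For illustration only, the flow recited in claims 1, 5, 9, 14 and 15 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: voice output and voice recognition are replaced by plain text strings, and the names `MAX_CHOICES`, `batches`, `select`, and `narrow` are hypothetical and do not appear in the publication.

```python
from typing import Dict, Iterator, List, Optional, Sequence

# Hypothetical maximum number of strings read out at once
# (an assumed value inside the 7 +/- 2 range of claim 17).
MAX_CHOICES = 7


def batches(strings: Sequence[str], max_count: int = MAX_CHOICES) -> Iterator[List[str]]:
    """Split candidate strings into groups no larger than max_count (claim 5)."""
    for start in range(0, len(strings), max_count):
        yield list(strings[start:start + max_count])


def select(recognized: str, offered: Sequence[str], partial: bool = False) -> Optional[str]:
    """Return the offered string that matches the recognition result, if any.

    partial=False compares against the whole offered string (claim 14);
    partial=True also accepts a match against part of it (claim 15).
    Recognition is simulated: `recognized` is assumed to be plain text.
    """
    heard = recognized.strip().lower()
    for candidate in offered:
        text = candidate.lower()
        if heard == text or (partial and heard and heard in text):
            return candidate
    return None


def narrow(
    items: List[Dict[str, str]],
    key_order: Sequence[str],
    answers: Dict[str, str],
) -> List[Dict[str, str]]:
    """Filter items one search key at a time until at most one remains (claim 9).

    `answers` stands in for what the user says for each key (an assumption
    made so the sketch can run without audio input).
    """
    remaining = list(items)
    for key in key_order:
        if len(remaining) <= 1:
            break
        offered = sorted({item[key] for item in remaining})
        chosen = None
        # Present the candidates group by group; stop at the first match.
        for group in batches(offered):
            chosen = select(answers.get(key, ""), group)
            if chosen is not None:
                break
        if chosen is None:
            break  # nothing recognized for this key; stop narrowing
        remaining = [item for item in remaining if item[key] == chosen]
    return remaining
```

With three hypothetical items keyed by `category` and `area`, calling `narrow(items, ["category", "area"], {"category": "restaurant", "area": "south"})` first filters on the category and then on the area, using the second key only because the first leaves more than one item.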