JP4466171B2 - Information retrieval device

Info

Publication number: JP4466171B2
Application number: JP2004114385A
Authority: JP
Inventors: 和幸高木
Original assignee: Nissan Motor Co Ltd
Current assignee: Nissan Motor Co Ltd
Priority date: 2004-04-08
Filing date: 2004-04-08
Publication date: 2010-05-26
Anticipated expiration: 2024-04-08
Also published as: JP2005301511A

Description

The present invention relates to an information retrieval device and method for retrieving information by voice.

Patent Document 1 discloses a radiotelephone that stores telephone numbers in memory together with voice tags associated with them and presents the stored voice tags to the user by voice, one after another, in registration order. When the voice tag corresponding to the desired destination is presented, the user selects it, and the radiotelephone looks up the telephone number associated with the selected voice tag.

Japanese Patent Laid-Open No. 11-103338

However, a conventional device that presents the voice tags stored in memory to the user by voice in storage order has a problem: when a large amount of data is stored in the memory, or when the voice tag corresponding to the destination is stored toward the end of the memory, it takes a long time until the voice tag of the destination is presented.

The present invention provides an information retrieval device and method that store a plurality of pieces of information, store voice data of an abbreviated name assigned to each piece of information (hereinafter referred to as a voice tag) in association with that information, compare a spoken abbreviated name with the voice tag of each piece of information, and retrieve from the stored information the item corresponding to the voice tag for which a match is obtained. The device and method are characterized in that a search history is stored for each piece of information and a list of the abbreviated names is presented by voice in an order that depends on the search history of each piece of information.

According to the present invention, the search time needed to reach the required information is more likely to be shortened, which improves convenience.

FIG. 1 is a block diagram showing an embodiment of the information retrieval device according to the present invention, applied to a hands-free telephone system for a vehicle. The hands-free telephone system 100 includes a microphone 101 that picks up the speech of the driver (user), a speaker 102 that outputs sound, a start switch 103 with which the driver starts the system, a clock 104 that keeps the date and time, a mobile phone 105, and an information search controller 106. In this vehicle hands-free telephone system, when talking with the other party, the driver speaks into the microphone 101 and listens to the other party's voice output from the speaker 102.

The information search controller 106 includes a phone book data storage memory 106a, a phone book data update unit 106b, a voice tag output control unit 106c, a voice recognition dictionary memory 106d, a voice recognition unit 106e, a voice tag-phone number conversion unit 106f, and a call control unit 106g. The phone book data update unit 106b, the voice tag output control unit 106c, the voice recognition unit 106e, the voice tag-phone number conversion unit 106f, and the call control unit 106g are implemented as software running on a microcomputer. The phone book data storage memory 106a and the voice recognition dictionary memory 106d use nonvolatile memory such as EEPROM, backup RAM, or the like.

The phone book data storage memory 106a stores data in the format shown in FIG. 2. Each record is given a unique identification number 2a, assigned in the order in which records are registered, and stores a telephone number 2b as the phone book data. The voice tag 2c is voice data of an abbreviated name attached to each telephone number so that the number can be read out of the phone book easily, and the user can set it freely when registering the telephone number. In the example shown in FIG. 2, the voice tag 2c “home” is set for the telephone number whose identification number 2a is 1. In FIG. 2 the contents of the voice tag 2c (“home”, “office”) are written as text for convenience of explanation, but they are actually stored as voice data represented by a waveform.

The last access date and time 2d is updated whenever the driver refers to the record to call up its telephone number, so it always holds the most recent date and time at which the record was referenced. The access count counter 2e is incremented by 1 each time the driver refers to the record and holds the number of times the driver has referenced it. That is, when the driver calls up, as the call destination, the telephone number 2b whose voice tag 2c is “home”, the date and time obtained from the clock 104 at that moment is stored in the last access date and time 2d, and 1 is added to the access count counter 2e.

The phone book data update unit 106b controls the registration and updating of data in the phone book data storage memory 106a described above. When the driver instructs registration of phone book data, the unit registers the data in the format shown in FIG. 2 based on the driver's input. At registration, the registration date and time is stored in the last access date and time 2d, and 0 is stored in the access count counter 2e as the initial value. When the driver refers to data in the phone book data storage memory 106a, the unit updates the last access date and time 2d and the access count counter 2e as described above. The phone book data update unit 106b also registers each voice tag 2c registered in the phone book data storage memory 106a in the voice recognition dictionary memory 106d, described later, so that the voice data can be matched when the driver speaks it.
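The record layout of FIG. 2 and the initial values set at registration can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name, field names, and the `register` helper are hypothetical, and sequential identification numbers starting at 1 are an assumption consistent with the FIG. 2 example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class PhoneBookRecord:
    """One record in the layout of FIG. 2 (field names are illustrative)."""
    record_id: int          # identification number 2a, assigned in registration order
    phone_number: str       # telephone number 2b
    voice_tag: bytes        # voice tag 2c, stored as audio data rather than text
    last_access: datetime   # last access date and time 2d
    access_count: int = 0   # access count counter 2e


def register(book: list, phone_number: str, voice_tag: bytes,
             now: datetime) -> PhoneBookRecord:
    """Append a new record with the initial values described for registration."""
    record = PhoneBookRecord(
        record_id=len(book) + 1,   # assumption: IDs are sequential starting at 1
        phone_number=phone_number,
        voice_tag=voice_tag,
        last_access=now,           # registration time is the initial last access time
        access_count=0,            # counter starts at 0
    )
    book.append(record)
    return record
```

Registration into the voice recognition dictionary memory 106d is left out of the sketch because the source does not describe the dictionary's data format.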

The voice tag output control unit 106c presents a list of the voice tags 2c stored in the phone book data storage memory 106a to the driver through the speaker 102 in response to an instruction from the driver. To call up a destination, the driver speaks the voice tag 2c associated with the telephone number 2b, but the driver may have forgotten which abbreviated name was registered as the voice tag 2c for that destination. In that case, the driver instructs the system to output the list of voice tags 2c in order to check the voice tags 2c stored in the phone book data storage memory 106a.

At this time the driver can specify the output order of the list. In the present embodiment the driver specifies one of “registration order”, “access frequency order”, and “access date and time order”. When the driver specifies the output order, the voice tag output control unit 106c uses the identification number 2a, the access count counter 2e, or the last access date and time 2d as the sort key and outputs the voice tags 2c by voice in the specified order.

When “registration order” is specified, the voice tag output control unit 106c outputs the list of voice tags 2c by voice in the order in which they were registered in the phone book data storage memory 106a, that is, starting from the record with the smallest identification number 2a. Voice tags 2c that were registered long ago and have probably been forgotten by now are therefore output first.

When “access frequency order” is specified, the list of voice tags 2c is output by voice in ascending order of access frequency, that is, starting from the record with the smallest access count counter 2e stored in the phone book data storage memory 106a. Voice tags 2c that the driver has rarely used in the past and has probably forgotten are therefore output first.

When “access date and time order” is specified, the list of voice tags 2c is output by voice starting from the oldest access, that is, starting from the record with the oldest last access date and time 2d. Voice tags 2c that the driver has not used recently and has probably forgotten are therefore output first.
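The three output orders amount to an ascending sort on one of three fields. The sketch below illustrates this under stated assumptions: the `Record` type, the dictionary keys naming each order, and the use of text in place of audio for the voice tag are all hypothetical choices made for readability.

```python
from datetime import datetime
from typing import NamedTuple


class Record(NamedTuple):
    record_id: int         # identification number 2a
    tag: str               # voice tag 2c, shown as text for readability
    last_access: datetime  # last access date and time 2d
    access_count: int      # access count counter 2e


# Sort key for each output order the driver can name (dictionary keys are illustrative).
SORT_KEYS = {
    "registration": lambda r: r.record_id,         # smallest ID first
    "access_frequency": lambda r: r.access_count,  # least used first
    "access_date": lambda r: r.last_access,        # oldest access first
}


def tags_in_order(records: list, order: str) -> list:
    """Return the voice tags in the order in which they would be read out."""
    return [r.tag for r in sorted(records, key=SORT_KEYS[order])]
```

The source does not say how ties are ordered. Python's `sorted` is stable, so in this sketch records with equal keys keep their input order.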

The voice recognition dictionary memory 106d stores the voice data that the information search controller 106 listens for in the driver's speech. As described above, the information search controller 106 in the present embodiment must listen for and recognize the voice tags 2c spoken by the driver, so every voice tag 2c stored in the phone book data storage memory 106a is also stored in the voice recognition dictionary memory 106d. The voice recognition unit 106e recognizes what the driver said by matching the commands and voice tags 2c spoken by the driver and input through the microphone 101 against the commands and voice tags 2c stored in the voice recognition dictionary memory 106d.

The voice tag-phone number conversion unit 106f refers to the phone book data storage memory 106a, obtains the telephone number 2b corresponding to the voice tag 2c recognized by the voice recognition unit 106e, and thereby converts the voice tag 2c into the telephone number 2b. In other words, it searches for the telephone number 2b corresponding to the recognized voice tag 2c. As a result, the driver can look up the telephone number 2b of a destination simply by speaking its voice tag 2c.

The call control unit 106g controls calls made with the mobile phone 105. For an outgoing call, it controls the mobile phone 105 so that it dials the destination telephone number produced by the voice tag-phone number conversion unit 106f. For an incoming call, it controls the mobile phone 105 so that the call with the other party begins when the driver gives the instruction to start the call.

FIG. 3 is a flowchart showing the process of registering new phone book data in the phone book data storage memory 106a. With the start switch 103 on in step S10 and the information search controller 106 waiting for voice input in step S20, if it is determined in step S30 that the driver has spoken the phone book registration command “phone book registration” and the voice data has been input through the microphone 101, the process proceeds to step S40.

In step S40, the voice recognition unit 106e matches the voice data of “phone book registration” spoken by the driver against the commands stored in the voice recognition dictionary memory 106d and takes the command with the highest degree of matching as the recognition result. When the voice recognition unit 106e has successfully recognized the phone book registration command “phone book registration” spoken by the driver, the information search controller 106 starts the phone book registration process.

In step S50, guidance speech prompting the driver to register an entry in the phone book is output through the speaker 102. In the present embodiment, the driver is first prompted to speak the voice tag 2c with guidance speech such as “Please input the voice tag”. Then, in step S60, the information search controller 106 enters the voice input standby state.

In step S70, when the driver speaks a voice tag 2c, for example “home”, the process proceeds to step S80. In step S80, the phone book data update unit 106b assigns an identification number 2a and adds a new record to the phone book data storage memory 106a. In step S90, the phone book data update unit 106b stores the voice tag 2c spoken by the driver in step S70 in the voice tag 2c field of the new record.

In step S100, the phone book data update unit 106b stores the voice tag 2c of the new record in the voice recognition dictionary memory 106d. In step S110, the phone book data update unit 106b stores the current date and time obtained from the clock 104 in the last access date and time 2d of the new record. In step S120, the phone book data update unit 106b stores 0 as the initial value in the access count counter 2e of the new record.

In step S130, guidance speech prompting the driver to input a telephone number, for example “Please input the telephone number”, is output through the speaker 102. Then, in step S140, the information search controller 106 enters the voice input standby state. In step S150, when the driver speaks the telephone number, the process proceeds to step S160.

In step S160, the voice recognition unit 106e matches the digits of the telephone number spoken by the driver against the digits stored in the voice recognition dictionary memory 106d and takes the digits with the highest degree of matching as the recognition result. When the voice recognition unit 106e has successfully recognized the telephone number spoken by the driver, the process proceeds to step S170.

In step S170, the recognized telephone number is stored in the telephone number 2b field of the new record, and the process ends. Through the above process, a new record is registered in the phone book data storage memory 106a in the data format shown in FIG. 2.

Next, the process of calling up a destination by referring to the phone book data stored in the phone book data storage memory 106a will be described. FIG. 4 is a flowchart showing the process in which the driver speaks the voice tag 2c of a destination and the telephone number of that destination is looked up in the phone book data storage memory 106a.

With the start switch 103 on in step S210 and the information search controller 106 waiting for voice input in step S220, if it is determined in step S230 that the driver has spoken the phone book reference command “phone book reference” and the voice data has been input through the microphone 101, the process proceeds to step S240.

In step S240, the voice recognition unit 106e matches the voice data of “phone book reference” spoken by the driver against the commands stored in the voice recognition dictionary memory 106d and takes the command with the highest degree of matching as the recognition result. When the voice recognition unit 106e has successfully recognized the phone book reference command “phone book reference” spoken by the driver, the information search controller 106 starts the phone book reference process.

In step S250, guidance speech such as “Please input the voice tag” is output to prompt the driver to speak the voice tag 2c to be looked up, and in step S260 the information search controller 106 enters the voice input standby state. In step S270, when the driver speaks a voice tag 2c, here for example “home”, the process proceeds to step S280. There the voice recognition unit 106e matches the voice tag spoken by the driver against the voice tags 2c stored in the voice recognition dictionary memory 106d and takes the voice tag 2c with the highest degree of matching as the recognition result.

In step S290, the voice tag-phone number conversion unit 106f refers to the phone book data storage memory 106a, reads the telephone number 2b corresponding to the voice tag 2c spoken by the driver, and thereby converts the voice tag 2c into the telephone number 2b. That is, it refers to the record whose voice tag 2c is “home” and obtains its telephone number 2b.

In step S300, the phone book data update unit 106b updates the last access date and time 2d of the record whose voice tag 2c is “home” in the phone book data storage memory 106a with the current date and time obtained from the clock 104. In step S310, the phone book data update unit 106b adds 1 to the count value of the access count counter 2e of the record whose voice tag 2c is “home”.
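Steps S290 to S310 can be sketched as a lookup that also records the access. This is an illustrative sketch only: records are plain dictionaries with hypothetical keys, the recognized voice tag is represented as text rather than audio, and returning `None` when no record matches is an assumption, since the source does not describe that case. Counter overflow is not handled here.

```python
from datetime import datetime
from typing import Optional


def look_up_number(book: list, recognized_tag: str, now: datetime) -> Optional[str]:
    """Return the number for the recognized tag and update its search history."""
    for record in book:
        if record["tag"] == recognized_tag:
            number = record["phone_number"]  # step S290: voice tag -> telephone number 2b
            record["last_access"] = now      # step S300: refresh last access date and time 2d
            record["access_count"] += 1      # step S310: increment access count counter 2e
            return number
    return None  # assumption: no matching record; not covered by the source
```

Keeping the history update inside the lookup mirrors the flowchart, where every successful reference to a record refreshes both fields before the confirmation prompt.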

In step S320, the phone book data update unit 106b determines whether adding 1 to the count value of the access count counter 2e has caused the count value of that record to overflow. If it determines that an overflow has occurred, the phone book data update unit 106b restores the count value of the access count counter 2e of the record whose voice tag 2c is “home” to its value before the addition and halves the count values of the access count counters 2e of all records. It then adds 1 again to the count value of the access count counter 2e of the record whose voice tag 2c is “home”. This avoids the overflow while preserving the relative order of the count values among the records. If a count value is not divisible by 2, the fractional part is discarded.
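The overflow handling can be sketched as follows. The 8-bit counter limit and the dictionary keyed by voice tag are assumptions made for illustration; the source does not state the counter width or the storage layout.

```python
COUNTER_MAX = 255  # assumption: an 8-bit counter; the source does not give a width


def increment_with_rescale(counts: dict, tag: str, limit: int = COUNTER_MAX) -> None:
    """Add 1 to counts[tag]; if that would overflow, halve every counter first."""
    if counts[tag] + 1 > limit:
        # Overflow case: the accessed counter stays at its pre-increment value,
        # every counter is halved (fractions discarded), and 1 is added below.
        for key in counts:
            counts[key] //= 2
    counts[tag] += 1
```

For example, with counters 255, 100, and 7 and an access to the first record, the counters become 128, 50, and 3. Note that truncating division can turn two unequal counts into equal ones (6 and 7 both become 3), so the order among records is preserved only in the sense that no pair is reversed.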

If it is determined that the count value has not overflowed, step S330 is skipped. In step S340, to ask the driver whether to place a call to the telephone number 2b of the spoken voice tag 2c, here “home”, voice guidance such as “Call home?” is output through the speaker 102. Then, in step S350, the information search controller 106 enters the voice input standby state.

In step S360, the system checks for a response to the voice guidance asking for confirmation of the call. If the driver responds, the process proceeds to step S370, where the voice recognition unit 106e matches the driver's response against the words “yes” and “no” stored in the voice recognition dictionary memory 106d and takes the standby word with the highest degree of matching as the recognition result.

In step S380, the driver's response is checked. If it is “yes”, agreeing to the call, the process proceeds to step S390, where the call control unit 106g controls the mobile phone 105 to place the call to the destination. If the driver's response is “no”, declining the call, the process ends without placing the call.

Next, the process of outputting a list of the voice tags 2c of all records from the phone book data stored in the phone book data storage memory 106a will be described. FIG. 5 is a flowchart showing this process.

With the start switch 103 on in step S410 and the information search controller 106 waiting for voice input in step S420, if it is determined in step S430 that the driver has spoken the command “list output”, which outputs the list of voice tags 2c, and the voice data has been input through the microphone 101, the process proceeds to step S440.

In step S440, the voice recognition unit 106e matches the voice data of “list output” spoken by the driver against the commands stored in the voice recognition dictionary memory 106d and takes the command with the highest degree of matching as the recognition result. When the voice recognition unit 106e has successfully recognized the command “list output” spoken by the driver, the information search controller 106 starts the list output process.

In step S450, the voice guidance “Please input the list output order” is output to prompt the driver to speak the output order for the list of voice tags 2c. Then, in step S460, the information search controller 106 enters the voice input standby state.

In step S470, when the driver speaks one of “registration order”, “access frequency order”, and “access date and time order” as the list output order, the process proceeds to step S480. In step S480, the voice recognition unit 106e matches the voice data of the list output order spoken by the driver against the voice data of the list output orders stored in the voice recognition dictionary memory 106d and takes the list output order with the highest degree of matching as the recognition result.

In step S490, the process branches according to the list output order. If the list output order is “registration order”, the process proceeds to step S500, and the voice tag output control unit 106c outputs the list of voice tags 2c by voice starting from the record with the smallest identification number 2a stored in the phone book data storage memory 106a. Voice tags 2c that were registered long ago and have probably been forgotten by now are therefore output first.

If the list output order is “access frequency order”, the process proceeds to step S510, and the voice tag output control unit 106c outputs the list of voice tags 2c by voice in ascending order of access frequency, that is, starting from the record with the smallest access count counter 2e stored in the phone book data storage memory 106a. Voice tags 2c that the driver has rarely used in the past and has probably forgotten are therefore output first.

If the list output order is “access date and time order”, the process proceeds to step S520, and the voice tag output control unit 106c outputs the list of voice tags 2c by voice starting from the oldest access, that is, starting from the record with the oldest last access date and time 2d. Voice tags 2c that the driver has not used recently and has probably forgotten are therefore output first.

As described above, the present embodiment provides the following effects.
(1) The driver specifies “registration order”, “access frequency order”, or “access date and time order” as the output order of the voice tag list, and the list is output by voice in the specified order. Because the output order can be specified, the voice tag the driver is looking for is more likely to be presented early.
(2) When “access frequency order” is specified as the output order, the list of voice tags is output by voice starting from the least frequently accessed record. Voice tags that have rarely been used in the past and have probably been forgotten are therefore presented first.
(3) When “access date and time order” is specified as the output order, the list of voice tags is output by voice starting from the record with the oldest last access date and time. Voice tags that the driver has not used recently and has probably forgotten are therefore presented first.
(4) When adding 1 to the count value of the access count counter 2e would cause the count value of that record to overflow, the count values of the access count counters 2e of all records stored in the phone book data storage memory 106a are halved before 1 is added again. This avoids the overflow while preserving the relative order of the count values among the records, so processing can continue.

When a count value is not divisible by 2, the fractional part is discarded; however, it may instead be rounded up or rounded to the nearest integer. Also, although an example was shown in which the count values of the access count counters 2e of all records are halved to avoid overflow, the count values may instead be reduced to, for example, 1/3 or 1/4 of their original values.
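The overflow handling described in effect (4) can be sketched in Python as follows. This is only an illustrative sketch, not the embodiment's implementation: the function name `increment_access_count`, the parameter names, and the counter limit of 255 (an assumed 8-bit counter) are hypothetical, since the text does not state the width of the access count counter 2e.

```python
COUNTER_MAX = 255  # assumption: 8-bit counter; the actual width is not given in the text


def increment_access_count(counts, index, counter_max=COUNTER_MAX, divisor=2):
    """Add 1 to counts[index], scaling every counter down first if it would overflow.

    counts  -- list of per-record access counts (modified in place)
    index   -- position of the record that was just accessed
    divisor -- 2 halves the counts; 3 or 4 give the 1/3 and 1/4 variants
    """
    if counts[index] + 1 > counter_max:
        # Scale all records, not only the overflowing one, so that the
        # relative order of the counts is kept. Floor division discards
        # the fractional part, as in the embodiment.
        for i in range(len(counts)):
            counts[i] //= divisor
    counts[index] += 1
    return counts


counts = [255, 7, 100]
increment_access_count(counts, 0)
print(counts)  # [128, 3, 50]
```

Because the fractional part is discarded, two records whose counts differ by 1 can end up with equal counts after scaling, but a record never overtakes one that previously had a higher count.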

In the embodiment described above, the driver designates one of three output orders, "registration order", "access frequency order", or "access date order", as the list output order of the voice tags 2c. However, the device may instead allow designation from among at least any two of these output orders.
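The selectable output orders can be illustrated with a minimal Python sketch. The `Record` class, its field names, the order keys, and the sample data are hypothetical names introduced for illustration, and voice tags are represented by plain strings rather than voice data.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Record:
    registration_no: int   # hypothetical field: position in registration order
    phone_number: str      # telephone number 2b
    voice_tag: str         # voice tag 2c (a string stands in for voice data)
    last_access: datetime  # last access date and time 2d
    access_count: int      # access count counter 2e


# Each order sorts ascending: earliest registered, least used, or least recently used first.
ORDER_KEYS = {
    "registration": lambda r: r.registration_no,
    "access_frequency": lambda r: r.access_count,
    "access_date": lambda r: r.last_access,
}


def list_voice_tags(records, order):
    """Return the voice tags in the order in which they would be presented."""
    if order not in ORDER_KEYS:
        raise ValueError(f"unknown list output order: {order!r}")
    return [r.voice_tag for r in sorted(records, key=ORDER_KEYS[order])]


records = [
    Record(1, "0451230001", "home", datetime(2004, 3, 1), 12),
    Record(2, "0451230002", "office", datetime(2004, 1, 15), 3),
    Record(3, "0451230003", "dentist", datetime(2003, 11, 20), 7),
]
print(list_voice_tags(records, "access_frequency"))  # ['office', 'dentist', 'home']
print(list_voice_tags(records, "access_date"))       # ['dentist', 'office', 'home']
```

Supporting only two of the three orders, as the variation above allows, would amount to removing one entry from `ORDER_KEYS` in this sketch.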

Further, in the embodiment described above, the telephone number 2b and the voice tag 2c are input by voice when a new record is registered in the telephone directory data storage memory 106a. However, they may instead be input using a manual input device (not shown) such as an operation switch. A switch for selecting between voice input and manual input may also be provided so that the driver can switch to either method as needed. When a voice tag is input by a method other than voice input, it is converted into voice data when it is stored in the telephone directory data storage memory 106a, and is stored as voice data.

Furthermore, in the embodiment described above, the list of voice tags 2c is output by voice through the speaker 102. However, the invention is not limited to this and may be modified as follows. For example, a monitor that displays characters and images may be provided, and the list of voice tags may be output as text on the monitor screen. In this case, the voice tags 2c may be stored as character data in the telephone directory data storage memory 106a, or the voice tags 2c may be converted into character data for output.

Furthermore, the embodiment described above shows an example in which the voice recognition device according to the present invention is applied to a hands-free telephone system for a vehicle. However, the present invention may also be applied to systems other than a hands-free telephone system, such as an audio system or a navigation system, to other information devices, or to such devices for non-vehicle use. In these cases, the information associated with the voice tag is, instead of a telephone number, information that the user frequently calls up in the respective system or device, for example, destination information in the case of a navigation system.

The correspondence between the constituent elements of the claims and those of the embodiment is as follows. The telephone directory data storage memory 106a and the voice recognition dictionary memory 106d constitute the information storage means, the microphone 101 constitutes the voice input means, the voice tag-telephone number conversion unit 106f constitutes the information search means, the voice tag output control unit 106c constitutes the abbreviation presentation means, and the voice recognition unit 106e constitutes the list order selection means. The constituent elements are not limited to the above configuration as long as the characteristic functions of the present invention are not impaired.

A block diagram of the voice recognition device according to the present invention applied to a hands-free telephone system.
A diagram showing a specific example of the data structure stored in the telephone directory data storage memory 106a.
A flowchart showing the telephone directory registration process performed when the driver registers telephone directory data in the telephone directory data storage memory 106a.
A flowchart showing the telephone directory reference process performed when the driver utters a voice tag 2c to look up the corresponding destination telephone number in the telephone directory data storage memory 106a.
A flowchart showing the list output process performed when a list of the voice data of the voice tags 2c is output from the telephone directory data stored in the telephone directory data storage memory 106a.

Explanation of symbols

100 Hands-free telephone system
101 Microphone
102 Speaker
103 Start switch
104 Clock
105 Mobile phone
106 Information retrieval controller
106a Telephone directory data storage memory
106b Telephone directory data update unit
106c Voice tag output control unit
106d Voice recognition dictionary memory
106e Voice recognition unit
106f Voice tag-telephone number conversion unit
106g Call control unit

Claims

An information search device comprising:
information storage means for storing a plurality of pieces of information and for storing, in association with each piece of information, voice data of an abbreviation assigned to that piece of information (hereinafter referred to as a voice tag);
voice input means for inputting voice; and
information search means for collating an abbreviation input by the voice input means with the voice tag of each piece of information, and for searching the information stored in the information storage means for the information corresponding to the voice tag for which a match is obtained,
wherein the information storage means further stores a search history of each piece of information, and
the information search device further comprises abbreviation presentation means for presenting, by voice, a list of the abbreviations of the plurality of pieces of information in an order corresponding to the search history of each piece of information.

The information search device according to claim 1,
wherein the search history includes the number of searches of each piece of information, and the abbreviation presentation means presents the list of abbreviations starting from the information with the fewest searches.

The information search device according to claim 1,
wherein the search history includes the latest search date and time of each piece of information, and the abbreviation presentation means presents the list of abbreviations starting from the information with the oldest latest search date and time.

The information search device according to claim 1,
wherein the search history includes the number of searches and the latest search date and time of each piece of information,
the information search device further comprises list order selection means for selecting one of: an order starting from the information with the oldest storage date and time, an order starting from the information with the fewest searches, and an order starting from the information with the oldest latest search date and time, and
the abbreviation presentation means presents the list of abbreviations in the order selected by the list order selection means.