JP4964574B2 - Information processing apparatus and method for registering speech reading vocabulary

Info

Publication number: JP4964574B2
Application number: JP2006332829A
Authority: JP
Inventors: 一紀佐久間; 修一松本
Original assignee: Alpine Electronics Inc
Current assignee: Alpine Electronics Inc
Priority date: 2006-12-11
Filing date: 2006-12-11
Publication date: 2012-07-04
Anticipated expiration: 2026-12-11
Also published as: JP2008145722A

Description

The present invention relates to an information processing apparatus and a method for registering text-to-speech vocabulary. More specifically, it relates to a technique suited to registering vocabulary (text) for speech reading in an information processing apparatus equipped with a function (a TTS (Text To Speech) engine) that reads aloud text containing a mixture of kana and kanji, such as e-mail messages and news articles.

Conventionally, in-vehicle devices such as in-vehicle navigation devices have been equipped with functions that provide various kinds of voice guidance through voice output. Examples include "parroting," in which the result of speech recognition is spoken back, and "talkback," which announces by voice that a task is being executed in response to an operation, for example "Setting the scale to 100 m." Text containing a mixture of kana and kanji, such as e-mail messages and news articles, can also be read aloud (voice guidance) by using speech synthesis based on a TTS engine. These kinds of voice guidance ("parroting," "talkback," and "reading of e-mail messages, news articles, and the like") are implemented with "recorded voice," "a TTS engine," and "recorded voice and a TTS engine," respectively.

A TTS engine is a system (conversion software) that reads Japanese text aloud when the text is input. It is widely used in telephone answering services for mobile phones, and more recently in portable services for individuals: voice services that read out information such as e-mail, news, and market trends when the user places a call, even when no mobile terminal or computer is at hand. A TTS engine converts input text into speech and reads it aloud, but reading text with mixed kana and kanji with 100% accuracy is practically impossible. To mitigate this, current TTS engines include a user dictionary. When the user registers the "reading" of a kanji word in the user dictionary, kanji that previously could not be read correctly are read correctly.

FIG. 1 shows an example of such a system (TTS engine). As illustrated, it comprises a TTS reading unit 1, consisting of a language analysis unit 2 and a waveform generation unit 3, together with a language analysis dictionary 4, a user dictionary 5, and a speech synthesis dictionary 6 that are referenced during TTS processing. When this TTS engine reads aloud "text" containing mixed kana and kanji, the language analysis unit 2 analyzes the text based on the language analysis dictionary 4 and the user dictionary 5 and generates an intermediate language (a character string annotated with readings and accents). Text that cannot be read correctly with the language analysis dictionary 4 alone can be read correctly by registering the text (vocabulary) and its "reading" in the user dictionary 5. The intermediate language generated by the language analysis unit 2 is input to the waveform generation unit 3, which refers to the speech synthesis dictionary 6 and converts it into speech (waveform) data corresponding to the text. Unlike the language analysis dictionary 4, the user dictionary 5 can be customized, so the necessary vocabulary can be registered in advance based on the judgment of the designers and developers who build applications with the TTS engine, on customer requests, and so on.
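The two-stage flow of FIG. 1 can be sketched roughly as follows. This is a minimal illustration, not the engine's actual implementation: the function names, the whole-text lookup, the rule that user dictionary entries take precedence, and the integer "samples" are all assumptions made for the example, and the intermediate language is reduced to a plain kana string without accent marks.

```python
from typing import Dict, List


def analyze(text: str, language_dict: Dict[str, str], user_dict: Dict[str, str]) -> str:
    """Language analysis stage (hypothetical): return a kana reading that
    stands in for the intermediate language."""
    if text in user_dict:                    # assumption: user entries take precedence
        return user_dict[text]
    return language_dict.get(text, text)     # unknown text passes through unchanged


def generate_waveform(reading: str, synthesis_dict: Dict[str, List[int]]) -> List[int]:
    """Waveform generation stage (hypothetical): concatenate the sample list
    stored in the speech synthesis dictionary for each kana character."""
    samples: List[int] = []
    for kana in reading:
        samples.extend(synthesis_dict.get(kana, []))
    return samples


def read_aloud(text: str,
               language_dict: Dict[str, str],
               user_dict: Dict[str, str],
               synthesis_dict: Dict[str, List[int]]) -> List[int]:
    """Run both stages in order: text -> intermediate language -> waveform data."""
    return generate_waveform(analyze(text, language_dict, user_dict), synthesis_dict)
```

Registering 三加茂町 with the reading ミカモチョウ in `user_dict` makes `analyze` return that reading regardless of what the language analysis dictionary contains, which is the role the user dictionary 5 plays in the description.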

As a technique related to the above prior art, Patent Document 1, for example, describes an information processing apparatus with a function of reading arbitrary text aloud as speech output. For vocabulary with the same notation but different readings, it gives priority to the reading that the user uses as the kana reading, so that text can be read more accurately.
Patent Document 1: Japanese Patent Laid-Open No. Hei 9-245023

As described above, in the conventional technique (FIG. 1), text with mixed kana and kanji that cannot be read correctly with the language analysis dictionary 4 is registered in the user dictionary 5 together with its "reading" so that it can be read correctly. The state in which text (kanji) that cannot be read correctly with the language analysis dictionary alone has been registered in the user dictionary together with its "reading," so that it is read correctly, is hereinafter called the "default state." FIG. 2 shows an example of user dictionary registration in the default state using conventional TTS processing. In the illustrated example, each text is checked to see whether it can be read correctly with the language analysis dictionary alone, and the texts that could not be read correctly (三加茂町, 三股町, and 大分自動車道) are registered in the user dictionary with their notation (kanji) and their "readings" (ミカモチョウ, ミマタチョウ, and オオイタジドウシャドウ). As a result, as shown on the right side of the figure, all the texts (reading-ready texts such as municipality names and road names) can finally be read correctly using the language analysis dictionary and the user dictionary.

However, when the user newly adds "text with mixed kana and kanji" to the user dictionary 5, the following problem arises. The state in which the user has added text to the user dictionary in the default state is hereinafter called the "user-customized state." FIG. 3 shows an example of user dictionary registration in the user-customized state using conventional TTS processing. In the illustrated example, the user has registered the vocabulary 川内 together with its reading カワウチ (the portions enclosed by A and B in the figure). However, the reading-ready texts in the default state include the text 薩摩川内市, which contains the same vocabulary 川内. Consequently, in the reading-ready texts in the user-customized state, the vocabulary added by the user, 川内 (カワウチ), is read correctly (the portion enclosed by C in the figure), but the text 薩摩川内市 that existed in the default state, whose proper reading is サツマセンダイシ, is now read incorrectly as サツマカワウチシ (the portion enclosed by D in the figure). In other words, reading-ready text that was read correctly in the default state may no longer be read correctly in the user-customized state.
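The regression described for FIG. 3 can be reproduced with a toy reader. The sketch below is illustrative only: it assumes a greedy longest-match lookup in which user dictionary entries override the language analysis dictionary, and the way the language analysis dictionary splits 薩摩川内市 into 薩摩, 川内, and 市 is made up for the example. A real engine performs far richer analysis.

```python
from typing import Dict


def read(text: str, language_dict: Dict[str, str], user_dict: Dict[str, str]) -> str:
    """Toy reader (hypothetical): greedy longest match over both dictionaries,
    with user dictionary entries overriding language dictionary entries."""
    merged = {**language_dict, **user_dict}
    out, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):        # try the longest candidate first
            if text[i:j] in merged:
                out.append(merged[text[i:j]])
                i = j
                break
        else:                                    # no entry starts here: keep the character
            out.append(text[i])
            i += 1
    return "".join(out)


# Hypothetical language analysis dictionary entries.
LANGUAGE_DICT = {"薩摩": "サツマ", "川内": "センダイ", "市": "シ"}

# Default state: no user entry for 川内.
default_reading = read("薩摩川内市", LANGUAGE_DICT, {})

# User-customized state: the user has registered 川内 as カワウチ.
customized_reading = read("薩摩川内市", LANGUAGE_DICT, {"川内": "カワウチ"})

print(default_reading)      # サツマセンダイシ
print(customized_reading)   # サツマカワウチシ
```

The user's entry is correct for the standalone word 川内, but because it is also matched inside the longer place name, the reading of 薩摩川内市 changes from the default state.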

This problem could be eliminated by, for example, registering the "readings" of all municipality names, road names, and so on in the user dictionary. However, registering all vocabulary in the user dictionary makes the size (memory capacity) of the user dictionary enormous and increases the resources (memory, CPU load, and so on) required for speech synthesis.

This problem of the prior art is not specific to in-vehicle devices such as in-vehicle navigation devices. It can equally arise in any information processing apparatus equipped with a function (TTS engine) that reads aloud text with mixed kana and kanji as described above, for example a personal computer (PC) or a portable information terminal such as a PDA.

The present invention was made in view of this problem of the prior art. Its object is to provide an information processing apparatus and a method for registering text-to-speech vocabulary that keep the reading of a vocabulary item within the reading-ready text correct at all times, even when that vocabulary has been registered in the dictionary based on a user instruction.

To solve the above problem of the prior art, one aspect of the present invention provides an information processing apparatus comprising: speech reading means having a function of converting text with mixed kana and kanji into speech data and outputting it; storage means storing a language analysis dictionary referenced when analyzing text to be read aloud, a user dictionary divided into a default part, for registering in advance text that cannot be read correctly even with the language analysis dictionary together with its correct reading, and a customization part, for registering vocabulary based on user instructions, reading-ready text that includes the text to be read aloud and is created using the language analysis dictionary and the user dictionary, and a speech synthesis dictionary referenced when executing speech reading; input means for inputting user instructions; and control means operatively connected to the speech reading means, the storage means, and the input means. When the control means acquires vocabulary to be registered based on a user instruction, it refers to the reading-ready text and, if a text containing the vocabulary as a part is included, extracts that text and acquires the reading of the extracted text in the default state. It then registers the vocabulary and its reading in the customization user dictionary to enter the customized state, compares the reading of the text containing the vocabulary as a part, acquired in the customized state, with the reading of that text in the default state, and, if the two readings do not match, registers the default-state reading of the text in the default user dictionary.

According to the information processing apparatus of this aspect, when a text containing the vocabulary that the user has designated for registration is included in the reading-ready text, the reading of that text in the default state is acquired first. The vocabulary and its reading are then registered in the customization user dictionary to enter the customized state, and the reading, in the customized state, of the text containing the vocabulary as a part is acquired. The acquired reading is compared with the reading of the text in the default state, and if the two do not match, the default-state reading of the text is registered in the default user dictionary. In other words, even when vocabulary and its reading based on a user instruction (for example, 川内 (カワウチ)) are registered in the dictionary, if the reading of a text containing that vocabulary as a part (for example, 薩摩川内市) no longer matches the correct reading in the default state (サツマセンダイシ), that is, if the reading of the text would change, the correct default-state reading is registered in the dictionary as well.
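The registration procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Dictionaries` class, the `register_vocabulary` function, the greedy longest-match reader, the precedence among dictionaries, the exclusion of the word itself from the affected texts, and the sample language dictionary entries are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dictionaries:
    language: Dict[str, str]                                     # language analysis dictionary
    user_default: Dict[str, str] = field(default_factory=dict)   # default part of the user dictionary
    user_custom: Dict[str, str] = field(default_factory=dict)    # customization part of the user dictionary
    readable_texts: List[str] = field(default_factory=list)      # reading-ready text list

    def read(self, text: str) -> str:
        """Toy reader: greedy longest match; user entries override the language dictionary."""
        merged = {**self.language, **self.user_default, **self.user_custom}
        out, i = [], 0
        while i < len(text):
            for j in range(len(text), i, -1):    # try the longest candidate first
                if text[i:j] in merged:
                    out.append(merged[text[i:j]])
                    i = j
                    break
            else:                                # no entry starts here: keep the character
                out.append(text[i])
                i += 1
        return "".join(out)


def register_vocabulary(d: Dictionaries, word: str, reading: str) -> List[str]:
    """Register a user word, then restore any reading-ready text whose reading changed.
    Returns the texts that were added to the default user dictionary."""
    # Extract reading-ready texts that contain the word as a part
    # (assumption: the word itself is not treated as an affected text).
    affected = [t for t in d.readable_texts if word in t and t != word]
    # Acquire their readings in the default state, before anything is registered.
    before = {t: d.read(t) for t in affected}
    # Register the word in the customization dictionary (customized state).
    d.user_custom[word] = reading
    # Compare readings; on a mismatch, register the default reading of the whole text.
    restored = []
    for t in affected:
        if d.read(t) != before[t]:
            d.user_default[t] = before[t]
            restored.append(t)
    return restored
```

With a language dictionary of `{"薩摩": "サツマ", "川内": "センダイ", "市": "シ"}` and 薩摩川内市 in the reading-ready list, registering 川内 as カワウチ adds only the single entry 薩摩川内市 → サツマセンダイシ to the default part, so the user dictionary grows by the affected texts rather than by the whole list.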

As a result, the reading of the vocabulary (text) within the reading-ready text can always be kept correct. In addition, because it is not necessary to register all vocabulary in the dictionary as in the conventional approach, the size of the user dictionary can be kept small and the resources required for speech synthesis can be reduced.

According to another aspect of the present invention, there is provided a method for registering text-to-speech vocabulary in an information processing apparatus having a speech reading function that converts text with mixed kana and kanji into speech data and outputs it by using a language analysis dictionary referenced when analyzing text to be read aloud, a user dictionary divided into a default part, for registering in advance text that cannot be read correctly even with the language analysis dictionary together with its correct reading, and a customization part, for registering vocabulary based on user instructions, and a speech synthesis dictionary referenced when executing speech reading. In this method, reading-ready text that includes the text to be read aloud and is created using the language analysis dictionary and the user dictionary is stored in storage means. When vocabulary to be registered based on a user instruction is acquired, the reading-ready text is searched for a text containing the vocabulary as a part. If such a text is included, it is extracted and its reading in the default state is acquired. The vocabulary and its reading are then registered in the customization user dictionary to enter the customized state, the reading of the text containing the vocabulary as a part, acquired in the customized state, is compared with the reading of that text in the default state, and, if the two readings do not match, the default-state reading of the text is registered in the default user dictionary.

Other structural features of the information processing apparatus and related aspects of the present invention, and the details of their processing, are described with reference to the embodiments of the invention below.

Preferred embodiments of the present invention are described below with reference to the accompanying drawings.

FIG. 4 is a block diagram showing the configuration of an in-vehicle navigation device incorporating an information processing apparatus according to an embodiment of the present invention.

As illustrated, the in-vehicle navigation device 20 of this embodiment comprises a hard disk drive (HDD) 10 as a storage medium, an operation unit 11 as a user interface, a GPS receiver 12, a self-contained navigation sensor 13, a communication device 14, a display device 15, a speaker 16, a control unit 21, a guidance route storage unit 22, an image synthesis unit 23 that supplies image data to the display device 15, a TTS (Text To Speech) reading unit 24, and a voice output unit 25 that supplies audio data to the speaker 16. The control unit 21 is built into the navigation device body together with the guidance route storage unit 22, the image synthesis unit 23, the TTS reading unit 24, and the voice output unit 25. The information processing apparatus relevant to the present invention consists of the three dictionaries and the reading-ready text in the HDD 10 (described later), the TTS reading unit 24, and the control unit 21 (part of its functions), which controls their input and output. The TTS reading unit 24 has the same configuration (language analysis unit 2 and waveform generation unit 3) and functions as the TTS reading unit 1 shown in FIG. 1.

The disk (not shown) driven by the HDD 10 stores, each in its assigned storage area, the map data used for the navigation function (map database 10a), the three dictionaries used for the speech synthesis (TTS) function (language analysis dictionary 10b, user dictionary 10c, and speech synthesis dictionary 10d), and reading-ready text 10e prepared in advance. The map database 10a stores map data divided into longitude and latitude widths of appropriate size for each scale level (1/12500, 1/25000, 1/50000, and so on). The map data includes road data for display and road data for route search, as well as data on various facilities (convenience stores, gas stations, entertainment facilities, and so on), such as the facility name, reading, telephone number, and address. Each item is represented as a set of coordinates of points (nodes) expressed in longitude and latitude.

The language analysis dictionary 10b is referenced when parsing the text to be synthesized. For each vocabulary item it registers in advance, in association with one another, detailed information such as the notation, kana reading, accent information, and part-of-speech information. The user dictionary 10c is a dictionary (customizable by the user) for registering vocabulary (text) that cannot be read correctly even with the language analysis dictionary 10b. In addition, as a usage specific to the present invention, it is used to register vocabulary, or text containing that vocabulary, whose "reading" would change because of vocabulary registered by the user (or because of a map data update). The speech synthesis dictionary 10d is referenced when the text analyzed by the language analysis unit in the TTS reading unit 24 is converted into the corresponding speech (waveform) data. The reading-ready text 10e stores in advance, in list form, texts that can be read correctly in the default state (municipality names, road names, and so on, as illustrated in FIG. 2). It also stores texts that have been made correctly readable again after their "reading" changed in the subsequent user-customized state.

The operation unit 11 is used to operate the navigation device body and takes the form of, for example, a remote controller (remote control) transmitter. The remote control transmitter has various operation keys, buttons, a joystick, and the like for displaying menus on the screen of the display device 15 (described later), selecting menus and items on the display screen, and executing the selected menu. As an operation key relevant to the present invention, the operation unit 11 also has a "TTS reading" key, which, as described later, is used as a trigger for outputting by voice the name of the road on which the vehicle is currently traveling. Besides a remote control transmitter, the operation unit 11 may take the form of a touch panel provided as appropriate on the screen of the display device 15.

The GPS receiver 12 receives radio waves (GPS signals) transmitted from GPS satellites and detects the longitude and latitude of the vehicle's current position. The self-contained navigation sensor 13 detects the vehicle's heading, traveling speed, and the like, and consists of an angle sensor such as a 3D gyro, a distance sensor that generates a pulse for each fixed travel distance based on the number of wheel rotations, and so on. The communication device 14 consists of a mobile phone (with a dedicated modem attached), an in-vehicle phone, or the like for communicating with an information center outside the vehicle, a mobile phone carried by a friend, and so on.

The display device 15 consists of an LCD monitor or the like and is installed at approximately the middle of the center console so that the display screen can be seen at least from the driver's seat. Under control of the control unit 21, navigation guidance information (a map of the area around the vehicle position, the vehicle position mark, the guidance route from the vehicle position to the destination, and so on) is displayed on the screen of the display device 15 via the image synthesis unit 23. Although only one speaker 16 is shown, in practice the required number of speakers are installed at predetermined locations in the cabin. For example, with a single row of rear seats, at least four speakers in total are installed: two near the left and right of the rear seats and two near the left and right of the front seats. Under control of the control unit 21, the speaker 16 basically outputs the above navigation guidance information as voice via the voice output unit 25. As guidance information relevant to the present invention, it also outputs as voice the speech data (text such as e-mail messages) converted by the TTS reading unit 24.

The control unit 21 consists of a microcomputer or the like and stores, in a built-in memory (not shown), various control programs that define its processing (programs for drawing the map and the vehicle position mark, route search, map matching and the vehicle position correction based on it, the registration of vocabulary (text) for TTS reading that characterizes the present invention, and so on). The control unit 21 operates according to these control programs. Basically, it executes the various navigation processes: detecting the vehicle's current position (vehicle position) based on the signal output from the GPS receiver 12, detecting the vehicle's heading and traveling speed based on the signal output from the self-contained navigation sensor 13, and searching for a guidance route from the departure point (vehicle position) to the destination under the set search conditions using map data read from the map database 10a. In addition, as processing relevant to the present invention, the control unit 21 cooperates with the TTS reading unit 24, as described later, to control speech synthesis based on language analysis and waveform generation, and controls the processing for registering vocabulary (text) for TTS reading.

The guidance route storage unit 22 stores data on all nodes (coordinates of points expressed in longitude and latitude) from the departure point (for example, the vehicle position) to the destination of the guidance route found by the control unit 21. The image synthesis unit 23 draws map images using the map data that the control unit 21 reads from the map database 10a, generates menu screens (operation screens) and marks such as the vehicle position mark and the cursor according to the operating state, and, under control of the control unit 21, reads the guidance route data from the guidance route storage unit 22 and draws the guidance route in a relatively conspicuous display style (for example, in a different color or with a thicker line). In other words, the image synthesis unit 23 superimposes the guidance route, operation screens, marks, and so on onto the map image and displays the result on the screen of the display device 15. The voice output unit 25 converts the digital audio signal supplied via the control unit 21 (navigation guidance information and the text speech data converted by the TTS reading unit 24) into an analog waveform signal, amplifies its power as appropriate, and outputs it to the speaker 16.

In the in-vehicle navigation device 20 of this embodiment configured as above, the HDD 10 corresponds to the "storage means," the operation unit 11 to the "input means," the GPS receiver 12 to the "vehicle position detection means," the communication device 14 to the "communication means," the control unit 21 to the "control means," and the TTS reading unit 24 to the "speech reading means."

The processing for registering vocabulary (text) for TTS reading, performed in the in-vehicle navigation device 20 incorporating the information processing apparatus of this embodiment (language analysis dictionary 10b, user dictionary 10c, speech synthesis dictionary 10d, reading-ready text 10e, TTS reading unit 24, and control unit 21), is described below with reference to FIG. 5, which shows an example of the processing. FIG. 6, which shows a specific example, is also referred to as a supplement. For safety, the registration of vocabulary (text) in this embodiment is performed while the vehicle is stopped.

First, as a precondition, the reading-ready text 10e in the HDD 10 is assumed to store in advance a list of texts that can be read correctly in the default state (municipality names, road names, and so on, as illustrated in FIG. 2).

In this state, in the first step S1, the control unit 21 acquires the vocabulary item the user wants to register. That is, when the user designates a vocabulary item to be read aloud by a remote-control operation from the operation unit 11 (or by a touch operation on the screen of the display device 15), the control unit 21 acquires the designated item and its reading. In the example shown as process P1 in FIG. 6, the kana reading "カワウチ" (kawauchi) is specified for the vocabulary item "川内". The information acquired by the control unit 21 is temporarily stored in a built-in memory (not shown).

In the next step S2, the control unit 21 refers to the language analysis dictionary 10b, the user dictionary 10c, and the reading-ready text 10e in the HDD 10 and searches for texts that contain the vocabulary item (here, "川内") (process P2 in FIG. 6).

In the next step S3, the control unit 21 determines from the search result whether a text containing the vocabulary item exists (YES) or not (NO). If the result is YES, the process proceeds to step S5; if NO, it proceeds to step S4.
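The search and branch of steps S2 and S3 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names and the sample list are hypothetical, and the search is simplified to a substring match over the reading-ready text only, whereas the description also consults the dictionaries 10b and 10c.

```python
def find_texts_containing(word, readable_texts):
    """Step S2 (simplified): every reading-ready text that contains `word`."""
    return [text for text in readable_texts if word in text]


def next_step(word, readable_texts):
    """Step S3: go to S5 if at least one text contains the word, else to S4."""
    return "S5" if find_texts_containing(word, readable_texts) else "S4"


# Illustrative list; the actual contents of the reading-ready text 10e are not given.
readable_texts = ["薩摩川内市", "仙台市", "国道3号"]

print(find_texts_containing("川内", readable_texts))
print(next_step("川内", readable_texts))
print(next_step("青葉", readable_texts))
```

With this sample list, "川内" leads to step S5 because "薩摩川内市" contains it, while a word that appears in no text leads to the plain registration of step S4.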

In step S4 (no text containing the vocabulary item is found in the reading-ready text 10e), the vocabulary item "川内" is registered together with its reading "カワウチ" in the user dictionary 10c in the HDD 10 under control of the control unit 21 (user-customized state). This user dictionary registration is the same as in the conventional case illustrated in FIG. 3. The processing flow then ends.

In step S5, on the other hand (a text containing the vocabulary item is found in the reading-ready text 10e), the control unit 21 extracts the text containing the item from the reading-ready text 10e and acquires the reading of that text in the default state. In the example shown as process P3 in FIG. 6, the text "薩摩川内市" (Satsumasendai City), which contains the item "川内", is extracted, and its default-state reading "サツマセンダイシ" (satsumasendaishi) is acquired. The reading obtained here is the correct one. The acquired information is temporarily stored in a memory (not shown) in the control unit 21.

In the next step S6, the vocabulary item "川内" is registered in the user dictionary 10c together with its reading "カワウチ", in the same way as in step S4 (process P4 in FIG. 6).

In the next step S7, the control unit 21 acquires the reading, in the user-customized state, of the text extracted from the reading-ready text 10e. In the example shown as process P5 in FIG. 6, the reading "サツマカワウチシ" (satsumakawauchishi) is acquired for the extracted text "薩摩川内市" in the user-customized state. This reading is incorrect, because it is affected by the user-registered entry "川内 (カワウチ)". The acquired information is temporarily stored in a memory (not shown) in the control unit 21.

In the next step S8, the control unit 21 compares the two acquired readings, that is, the reading of the text in the default state and its reading in the user-customized state, and determines whether they match (YES) or not (NO). If the result is YES, the process proceeds to step S9; if NO, it proceeds to step S10. In the example shown as process P6 in FIG. 6, the default-state reading "サツマセンダイシ" is compared with the user-customized reading "サツマカワウチシ". The result in this case is therefore NO, and the process proceeds to step S10.
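The comparison in steps S5, S7, and S8 can be illustrated with a toy reader. The sketch below is not the patent's TTS engine: `read_text`, its greedy longest-match rule, and the contents of `base_dict` are all assumptions, chosen only so that a user entry for "川内" changes the reading of "薩摩川内市" in the way FIG. 6 describes.

```python
def read_text(text, base_dict, user_dict):
    """Toy reader (assumption): greedy longest match, user entries override base ones."""
    merged = {**base_dict, **user_dict}
    reading, i = [], 0
    while i < len(text):
        # Try the longest remaining substring first, then shorter ones.
        for j in range(len(text), i, -1):
            if text[i:j] in merged:
                reading.append(merged[text[i:j]])
                i = j
                break
        else:
            reading.append(text[i])  # unknown character: pass it through unchanged
            i += 1
    return "".join(reading)


# Hypothetical stand-in for the default dictionaries.
base_dict = {"薩摩": "サツマ", "川内": "センダイ", "市": "シ"}

default_reading = read_text("薩摩川内市", base_dict, {})                 # step S5
custom_reading = read_text("薩摩川内市", base_dict, {"川内": "カワウチ"})  # step S7

print(default_reading)
print(custom_reading)
print(default_reading == custom_reading)  # step S8: a mismatch leads to step S10
```

Under these assumptions the two readings differ, which corresponds to the NO branch of step S8.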

In step S9, in the same way as in step S3, the control unit 21 determines whether another text containing the vocabulary item (the item the user wants to register) exists (YES) or not (NO). If the result is YES, the process returns to step S5 and repeats the above processing; if NO, the processing flow ends.

In step S10, on the other hand (the acquired default-state reading and user-customized reading do not match), the correct reading of the extracted text is registered in the user dictionary 10c in the HDD 10 under control of the control unit 21. In the example shown as process P7 in FIG. 6, the correct default-state reading "サツマセンダイシ" of the extracted text "薩摩川内市", which contains the item "川内", is registered in the user dictionary 10c. This user dictionary registration is the processing that characterizes the present invention. The process then proceeds to step S9.

As described above, in the method for registering TTS reading vocabulary (FIGS. 5 and 6) performed in the in-vehicle navigation device 20 (FIG. 4) of this embodiment, when the reading-ready text 10e contains a text ("薩摩川内市") that includes the vocabulary item the user has designated for registration ("川内", read "カワウチ"), the default-state reading of that text ("サツマセンダイシ") is acquired, and the item "川内" is registered in the user dictionary 10c together with its reading "カワウチ" (user-customized state). The reading of the text in this user-customized state ("サツマカワウチシ") is then acquired and compared with the default-state reading ("サツマセンダイシ"). If the two readings do not match ("NO" in step S8), the correct default-state reading of the text ("サツマセンダイシ") is automatically registered in the user dictionary 10c. In other words, even when a vocabulary item is registered in the user dictionary on the user's instruction, if the reading of a text containing that item no longer matches its correct default-state reading, the correct default-state reading is registered in the user dictionary as well.
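The whole flow of FIG. 5 can be sketched in a few lines. This is an illustrative model under stated assumptions, not the patented implementation: `read_text` is a hypothetical greedy longest-match reader in which user entries override base entries, `base_dict` stands in for the default dictionaries, and the default readings of all matching texts are captured before the user's entry is registered rather than inside the S5 to S9 loop.

```python
def read_text(text, base_dict, user_dict):
    """Toy reader (assumption): greedy longest match, user entries override base ones."""
    merged = {**base_dict, **user_dict}
    reading, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in merged:
                reading.append(merged[text[i:j]])
                i = j
                break
        else:
            reading.append(text[i])  # unknown character: pass it through unchanged
            i += 1
    return "".join(reading)


def register_vocabulary(word, reading, base_dict, user_dict, readable_texts):
    """Register `word`; return the texts whose default reading had to be protected."""
    # S2/S3: texts that contain the word as a part (the word itself is excluded).
    hits = [t for t in readable_texts if word in t and t != word]
    # S5: readings in the default state, captured before customization.
    default_readings = {t: read_text(t, base_dict, user_dict) for t in hits}
    # S4/S6: register the user's word with the user's reading.
    user_dict[word] = reading
    protected = []
    for text in hits:
        # S7/S8: re-read in the customized state and compare with the default.
        if read_text(text, base_dict, user_dict) != default_readings[text]:
            # S10: register the whole text with its default reading.
            user_dict[text] = default_readings[text]
            protected.append(text)
    return protected


base_dict = {"薩摩": "サツマ", "川内": "センダイ", "市": "シ"}  # hypothetical defaults
user_dict = {}
readable_texts = ["薩摩川内市", "国道3号"]                       # illustrative list

protected = register_vocabulary("川内", "カワウチ", base_dict, user_dict, readable_texts)
print(protected)
print(read_text("川内", base_dict, user_dict))
print(read_text("薩摩川内市", base_dict, user_dict))
```

In this model only the one conflicting text is added to the user dictionary, so the user's word and the existing place name are both read as intended while the dictionary stays small.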

This makes it possible to keep the readings of the vocabulary (texts) in the reading-ready text 10e correct at all times. In addition, because not every vocabulary item has to be registered in a dictionary as in conventional approaches, the user dictionary can be kept small, which reduces the resources (memory, CPU load, and so on) required for speech synthesis (TTS processing).

FIG. 7 shows an example of user dictionary registration in the user-customized state based on the processing of this embodiment (FIG. 5). In the illustrated example, the user has registered the vocabulary item "川内" together with its reading "カワウチ" (the parts enclosed by A and B in the figure). Although the default-state reading-ready text contains the text "薩摩川内市", which includes the same item "川内", the correct default-state reading (サツマセンダイシ) has also been registered automatically (the part enclosed by E in the figure). As a result, in the user-customized reading-ready text, both the item added by the user, "川内 (カワウチ)", and the text present in the default state, "薩摩川内市", are read correctly (the parts enclosed by C and D in the figure).

Vocabulary (text) registered in the reading-ready text 10e through the processing of this embodiment (FIG. 5) can be used in the following ways.

One is to read aloud the name of the road on which the vehicle is currently traveling (voice guidance). Specifically, when the user gives an instruction via the operation unit 11 (by operating the "TTS reading" key), the control unit 21 refers to the vehicle position data detected by the GPS receiver 12, reads the name (data) of the road on which the vehicle is currently traveling from the map database 10a, and inputs the road name (text) to the TTS reading unit 24. The text is converted into speech data by speech synthesis based on the required language analysis and waveform generation, and the result is output as sound from the speaker 16 via the voice output unit 25.

Another use is to read aloud e-mail messages, news articles, and the like received from outside. For example, when an e-mail is received from outside via a mobile phone (communication device 14), the received e-mail message (text) is converted into speech data by the TTS reading unit 24 under control of the control unit 21 and output as sound from the speaker 16 via the voice output unit 25. Because such voice guidance can be performed automatically, without any special operation by the user, even while the vehicle is moving, it is useful from a safety standpoint.

The embodiment above was described for the case of registering text information that contains a vocabulary item whose reading changes because of a user-registered entry (FIGS. 5 and 6). However, readings can change not only through registration on the user's instruction but also because of the navigation map data itself. Navigation map data is updated periodically (for example, once a year for genuine products), and in the period before an update is applied, the names of municipalities may have changed through mergers and the like. In such cases too, registering vocabulary by processing analogous to that of FIGS. 5 and 6 allows the updated data (vocabulary whose reading has changed) to be read aloud correctly.

The embodiment above was also described for the case in which the information processing apparatus according to the present invention is incorporated as part of the in-vehicle navigation device 20, but, as is clear from the gist of the invention, it is of course not limited to in-vehicle use. It is sufficient for the information processing apparatus to have, as described above, a function for reading aloud text information containing a mixture of kana and kanji (TTS reading unit 24), the dictionaries referred to when that function is executed (language analysis dictionary 10b, user dictionary 10c, and speech synthesis dictionary 10d), and the reading-ready text 10e set up in advance so that it can be read correctly in the default state. For example, the present invention is equally applicable to reading aloud e-mail messages, news articles, and the like on a personal computer (PC).

In the embodiment above, the HDD 10 is used as the storage medium holding the navigation map database 10a, the TTS reading dictionaries 10b, 10c, and 10d, and the reading-ready text 10e, but another storage medium such as a DVD drive (DVD-ROM) or a CD drive (CD-ROM) may be used instead. Furthermore, the map database 10a, the dictionaries 10b to 10d, and the reading-ready text 10e do not have to be stored on the same storage medium and may be stored on separate storage media as appropriate.

FIG. 1 is an explanatory diagram of typical TTS processing.
FIG. 2 is a diagram showing an example of user dictionary registration in the default state using conventional TTS processing.
FIG. 3 is a diagram showing an example of user dictionary registration in the user-customized state using conventional TTS processing.
FIG. 4 is a block diagram showing the configuration of an in-vehicle navigation device incorporating an information processing apparatus according to an embodiment of the present invention.
FIG. 5 is a flowchart showing an example of the processing for registering vocabulary (text) for TTS reading performed in the in-vehicle navigation device of FIG. 4.
FIG. 6 is a diagram giving a supplementary explanation of the processing flow of FIG. 5.
FIG. 7 is a diagram showing an example of user dictionary registration based on the processing of FIG. 5.

Explanation of symbols

10 ... HDD (storage means),
10a ... map database,
10b ... language analysis dictionary,
10c ... user dictionary,
10d ... speech synthesis dictionary,
10e ... reading-ready text,
11 ... operation unit (input means),
12 ... GPS receiver (vehicle position detection means),
14 ... communication device (communication means),
15 ... display device,
16 ... speaker,
20 ... in-vehicle navigation device,
21 ... control unit (control means),
24 ... TTS reading unit (voice reading means).

Claims

An information processing apparatus comprising: voice reading means having a function of converting text containing a mixture of kana and kanji into speech data and outputting it;
storage means storing a language analysis dictionary referred to when analyzing text to be read aloud, a user dictionary divided into a default part, in which texts that cannot be read correctly even with the language analysis dictionary are registered in advance together with their correct readings, and a customization part, in which vocabulary is registered based on user instructions, reading-ready text containing texts to be read aloud that are created using the language analysis dictionary and the user dictionary, and a speech synthesis dictionary referred to when voice reading is executed;
input means for inputting user instructions; and
control means operatively connected to the voice reading means, the storage means, and the input means,
wherein, when the control means acquires a vocabulary item to be registered based on a user instruction, it refers to the reading-ready text and, if a text that includes the vocabulary item as a part thereof is present, extracts that text,
acquires the reading of the extracted text in the default state, and then registers the vocabulary item and its reading in the user dictionary for customization to establish a customized state, and
compares the reading, acquired in the customized state, of the text that includes the vocabulary item as a part thereof with the reading of that text in the default state and, if the two readings do not match, registers the reading of the text in the default state in the default user dictionary.

The information processing apparatus according to claim 1, wherein the control means refers to the reading-ready text and, when no text that includes the vocabulary item is present, registers the vocabulary item and its reading in the user dictionary for customization.

The information processing apparatus according to claim 1, further comprising communication means and a speaker that outputs, as sound, the speech data converted by the voice reading means,
wherein, when text information is received via the communication means, the control means converts the text information into speech data through the voice reading means and outputs the speech data from the speaker.

The information processing apparatus according to claim 3, which is mounted on a vehicle and further comprises storage means storing map data and vehicle position detection means for detecting the current position of the vehicle,
wherein, when a user instruction is given via the input means, the control means refers to the detected current position of the vehicle and the map data to acquire the name of the road on which the vehicle is currently traveling, converts the road name information into speech data through the voice reading means, and outputs the speech data as sound from the speaker.

A method for registering vocabulary for voice reading in an information processing apparatus having a voice reading function that converts text containing a mixture of kana and kanji into speech data and outputs it by using a language analysis dictionary referred to when analyzing text to be read aloud, a user dictionary divided into a default part, in which texts that cannot be read correctly even with the language analysis dictionary are registered in advance together with their correct readings, and a customization part, in which vocabulary is registered based on user instructions, and a speech synthesis dictionary referred to when voice reading is executed, the method comprising:
storing, in storage means, reading-ready text containing texts to be read aloud that are created using the language analysis dictionary and the user dictionary;
when a vocabulary item to be registered based on a user instruction is acquired, referring to the reading-ready text to search for a text that includes the vocabulary item as a part thereof;
when a text that includes the vocabulary item as a part thereof is present, extracting the text, acquiring the reading of the extracted text in the default state, and then registering the vocabulary item and its reading in the user dictionary for customization to establish a customized state;
comparing the reading, acquired in the customized state, of the text that includes the vocabulary item as a part thereof with the reading of that text in the default state; and
when the two readings do not match, registering the reading of the text in the default state in the default user dictionary.