JP2005122042A

JP2005122042A - Speech recognizing device, speech recognizing method, speech synthesizer, speech synthesis method, speech recognition system, speech synthesis system, speech recognition and synthesizing device, navigation system, and moving body

Info

Publication number: JP2005122042A
Application number: JP2003359434A
Authority: JP
Inventors: Kazuaki Minami; 見並　　一明
Original assignee: Toyota Motor Corp; Toyota InfoTechnology Center Co Ltd
Current assignee: Toyota Motor Corp; Toyota InfoTechnology Center Co Ltd
Priority date: 2003-10-20
Filing date: 2003-10-20
Publication date: 2005-05-12

Abstract

PROBLEM TO BE SOLVED: To provide a speech recognition and synthesis service that assumes the generation of noise beforehand, changes the output volume and tone quality of synthesized speech to match the situation and timing, and changes the acoustic model used for speech recognition, so that the service is smooth and effective for the user.

SOLUTION: A control change-over part 101, which can acquire positional information and control a speech recognizing part 7 and a speech synthesizing part 6, causes speech recognition and speech synthesis to be carried out. On the basis of the positional information and locational situation data acquired by the control change-over part 101, the speech recognizing part 7 selects, from among a plurality of acoustic models, an acoustic model suitable for the locational situation data and the positional information, and carries out speech recognition processing. Moreover, on the basis of the positional information and the locational situation data acquired by the control change-over part 101, the speech synthesizing part 6 and an utterance volume/utterance tone quality change-over part 107 adjust the utterance volume of the synthesized speech and/or change over the utterance tone quality.

COPYRIGHT: (C)2005,JPO&NCIPI

Description

The present invention relates to a speech recognition device, a speech recognition method, a speech synthesis device, a speech synthesis method, a speech recognition system, a speech synthesis system, a speech recognition and synthesis device, a navigation system, and a moving body. It is particularly suitable for application to voice-interactive car navigation systems and voice-interactive communication-type navigation systems.

Conventionally, the techniques described in Patent Document 1 and Patent Document 2 are known for in-vehicle speech synthesis and speech recognition.

First, Patent Document 1 aims to provide a volume control system for an audio device mounted on an automobile that automatically adjusts the volume in the vehicle interior, in accordance with the interior noise level while traveling, so that the sound is easy to hear. It describes a volume control system for an audio device that is installed in the vehicle interior and has sound field processing means for applying sound field control processing to an acoustic signal from an acoustic signal source. The system has a microphone that is installed in the vehicle interior and detects the interior sound; noise signal extraction means that subtracts the acoustic signal processed by the sound field processing means from the interior sound detected by the microphone to extract the noise signal in the vehicle interior; and volume control means that controls the volume of the acoustic signal in accordance with the noise signal extracted by the noise signal extraction means.

Patent Document 2 describes a speech recognition device that aims to improve the recognition rate by estimating the noise component with high accuracy and to be easy for the user to use. When the user turns on a talk switch for voice input, the device mutes the car audio device and provides a noise estimation section of fixed duration in which stationary noise is estimated.

Specifically, the speech recognition device of Patent Document 2 outputs a “beep” notification sound at the end of the noise estimation section to start the speech detection section, during which the user speaks a command or destination. The estimated noise component is removed from the voice input signal of the speech detection section to obtain the speech signal. A notification is also given at the end of the speech detection section, speech recognition processing is performed, talkback is then executed, and the mute of the car audio device is released.

Japanese Patent Laid-Open No. 10-303669; Japanese Patent Laid-Open No. 2000-322098; Japanese Patent Laid-Open No. 9-90963

However, the conventional speech synthesis and speech recognition described above have the following problems.

That is, the volume control system described in Patent Document 1 adjusts the volume in response to the vehicle interior noise level, so the volume can be adjusted only after the noise has been detected. The volume adjustment therefore lags behind the onset of the noise, and a significant time lag occurs.

The speech recognition device described in Patent Document 2 can estimate the noise component with high accuracy, but in practice it estimates stationary noise by providing a noise estimation section of fixed duration. The noise component is therefore inevitably estimated only after the noise has been detected.

According to the inventor's findings, these problems in the prior art arise from treating noise as an unknown element.

Accordingly, an object of the present invention is to provide a speech recognition device, a speech recognition method, a speech synthesis device, a speech synthesis method, a speech recognition and synthesis device, a navigation system, and a moving body that, by foreseeing the occurrence of noise, can assume the noise conditions in advance, can prepare the speech output volume for speech synthesis and the execution of recognition in speech recognition to suit those conditions, and can thereby provide a smooth and effective service to the user.

To achieve the above object, the first invention of the present invention is a speech recognition device having:
speech recognition means configured to be capable of speech recognition;
control switching means configured to be capable of acquiring position information and of switching control;
a switching acoustic model database comprising a plurality of acoustic models; and
acoustic model control switching means for selecting one acoustic model from the plurality of acoustic models,
wherein, on the basis of the acquired position information and the location situation data included in the position information, the acoustic model control switching means selects, from the plurality of acoustic models, an acoustic model suited to the location situation data and the position information, and speech recognition processing can be executed using the selected acoustic model.
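
As an illustration only, the selection step of the first invention can be sketched as a lookup keyed by the location situation data. The situation labels ("tunnel", "highway", "urban"), the model identifiers, and the names PositionInfo and select_acoustic_model are assumptions made for this sketch and do not appear in the specification. The sketch keys only on the situation label, whereas the specification also allows the position itself to influence the choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PositionInfo:
    """Position information that carries location situation data (hypothetical layout)."""
    latitude: float
    longitude: float
    location_situation: str  # assumed labels such as "tunnel" or "highway"


# Hypothetical stand-in for the switching acoustic model database:
# one acoustic model identifier per location situation.
ACOUSTIC_MODEL_DB = {
    "tunnel": "am_tunnel",
    "highway": "am_highway",
    "urban": "am_urban",
}
DEFAULT_MODEL = "am_default"  # assumed fallback when no model matches


def select_acoustic_model(position: PositionInfo, db=ACOUSTIC_MODEL_DB) -> str:
    """Pick the single acoustic model suited to the location situation data."""
    return db.get(position.location_situation, DEFAULT_MODEL)


if __name__ == "__main__":
    here = PositionInfo(35.08, 137.15, "tunnel")
    print(select_acoustic_model(here))
```

Because the choice depends only on where the vehicle is, it can be made before any noise is actually measured, which is the point the specification makes against the prior art.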

In the first invention, the device typically further has a noise suppression characteristic database that is configured to receive the output signals of a plurality of operating units and that stores, based on the output signal of each operating unit, noise suppression characteristic data for the noise information specific to that operating unit. On the basis of the output signals from the operating units that are currently operating, the information of the noise suppression characteristic database is added to the acoustic model selected from the plurality of acoustic models, and speech recognition processing is executed.
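
The addition of noise suppression characteristic data for the operating units that are currently running might look like the following minimal sketch. The specification does not say how an acoustic model or a noise suppression characteristic is represented, so the per-band number lists, the unit names ("wiper", "air_conditioner", "turn_signal"), and the function name apply_noise_suppression are all assumptions for illustration.

```python
# Hypothetical noise suppression characteristic database:
# one per-band correction vector for each operating unit.
NOISE_SUPPRESSION_DB = {
    "wiper": [0.0, 1.5, 0.5],
    "air_conditioner": [2.0, 1.0, 0.0],
    "turn_signal": [0.0, 0.0, 1.0],
}


def apply_noise_suppression(model_profile, output_signals, db=NOISE_SUPPRESSION_DB):
    """Add the stored characteristic of every active unit to the selected model.

    model_profile  -- assumed per-band noise profile of the selected acoustic model
    output_signals -- mapping of unit name to True (operating) or False (idle)
    """
    adjusted = list(model_profile)  # copy so the stored model is not modified
    for unit, is_operating in output_signals.items():
        if not is_operating or unit not in db:
            continue  # idle or unknown units contribute nothing
        for band, correction in enumerate(db[unit]):
            adjusted[band] += correction
    return adjusted


if __name__ == "__main__":
    signals = {"wiper": True, "air_conditioner": False}
    print(apply_noise_suppression([1.0, 1.0, 1.0], signals))
```

The design choice shown here is that the correction is driven by the units' on/off output signals rather than by a microphone measurement, so it is available the moment a unit is switched on.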

The second invention of the present invention is a speech recognition system in which
speech recognition means configured to be capable of recognizing speech,
control switching means configured to be capable of acquiring position information and of switching control,
a switching acoustic model database comprising a plurality of acoustic models, and
acoustic model control switching means for selecting one acoustic model from the plurality of acoustic models
are configured to be able to transmit and receive data to and from one another,
wherein, on the basis of the acquired position information and the location situation data included in the position information, the acoustic model control switching means selects, from the plurality of acoustic models, an acoustic model suited to the location situation data and the position information.

In the second invention, the system typically further has a noise suppression characteristic database that is configured to receive the output signals of a plurality of operating units and that stores, based on the output signal of each operating unit, noise suppression characteristic data for the noise information specific to that operating unit. On the basis of the output signals from the operating units that are currently operating, the information of the noise suppression characteristic database is added to the acoustic model selected from the plurality of acoustic models, and speech recognition processing is executed.

The third invention of the present invention is a speech recognition method that uses
speech recognition means configured to be capable of recognizing speech,
a switching acoustic model database comprising a plurality of acoustic models,
acoustic model control switching means for selecting one acoustic model from the switching acoustic model database, and
control switching means configured to be capable of acquiring position information that includes location situation data and of switching control,
wherein, on the basis of the location situation data, the acoustic model control switching means selects, from the plurality of acoustic models, an acoustic model suited to the location situation data.

In the third invention, the method typically further uses a noise suppression characteristic database that is configured to receive the output signals of a plurality of operating units and that stores, based on the output signal of each operating unit, noise suppression characteristic data for the noise information specific to that operating unit. On the basis of the output signals from the operating units that are currently operating, the information of the noise suppression characteristic database is added to the acoustic model selected from the plurality of acoustic models, and speech recognition processing is executed.

The fourth invention of the present invention is a speech synthesis device having:
speech synthesis means configured to be capable of synthesizing phonemes and outputting them as speech data;
utterance volume/tone quality switching means configured to be capable of adjusting the utterance volume and/or switching the utterance tone quality; and
control switching means configured to be capable of acquiring position information and of switching control,
wherein the device is configured to be capable of adjusting the utterance volume and/or switching the utterance tone quality on the basis of the current position information and the location situation data included in the position information.
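
The utterance volume and tone quality switching of the fourth invention can be pictured as another lookup driven by the location situation data. This is a sketch under stated assumptions: the gain values, tone labels, situation labels, and the names SpeechOutputSetting and choose_output_setting are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechOutputSetting:
    """Utterance volume offset and tone quality for synthesized speech (assumed form)."""
    volume_offset_db: float
    tone_quality: str


# Hypothetical table: which output setting suits which location situation.
OUTPUT_SETTINGS = {
    "tunnel": SpeechOutputSetting(6.0, "clear"),
    "highway": SpeechOutputSetting(3.0, "clear"),
}
DEFAULT_SETTING = SpeechOutputSetting(0.0, "normal")  # assumed quiet-road default


def choose_output_setting(location_situation: str) -> SpeechOutputSetting:
    """Return the volume and tone quality to use at the current location."""
    return OUTPUT_SETTINGS.get(location_situation, DEFAULT_SETTING)


if __name__ == "__main__":
    setting = choose_output_setting("tunnel")
    print(setting.volume_offset_db, setting.tone_quality)
```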

The fifth invention of the present invention is a speech synthesis system in which
speech synthesis means configured to be capable of synthesizing phonemes and outputting them as speech data, utterance volume/tone quality switching means configured to be capable of adjusting the utterance volume and/or switching the utterance tone quality, and control switching means configured to be capable of acquiring position information that includes location situation data and of switching control are configured to be able to transmit and receive data to and from one another,
wherein the system is configured to be capable of adjusting the utterance volume and/or switching the utterance tone quality on the basis of the acquired current position information and the location situation data.

The sixth invention of the present invention is a speech synthesis method that uses
speech synthesis means configured to be capable of synthesizing phonemes and outputting them as speech, utterance volume/tone quality switching means configured to be capable of adjusting the utterance volume and/or switching the utterance tone quality, and control switching means configured to be capable of acquiring position information that includes location situation data and of switching control,
wherein, when speech is output, the utterance volume is adjusted and/or the utterance tone quality is switched on the basis of the current position information and the location situation data.

The seventh invention of the present invention is a speech recognition and synthesis device having:
speech recognition means configured to be capable of recognizing speech;
a switching acoustic model database comprising a plurality of acoustic models;
acoustic model control switching means for selecting one acoustic model from the plurality of acoustic models;
speech synthesis means configured to be capable of synthesizing phonemes and outputting them as speech;
utterance volume/tone quality switching means configured to be capable of adjusting the utterance volume and/or switching the utterance tone quality; and
control switching means configured to be capable of acquiring position information and of switching control between the speech recognition means and the speech synthesis means,
wherein, when the speech recognition means is selected by the control switching means, the acoustic model control switching means selects, from the plurality of acoustic models, an acoustic model suited to the location situation data and the position information on the basis of the acquired position information and the location situation data included in the position information, so that speech recognition processing can be executed, and
when the speech synthesis means is selected by the control switching means, the utterance volume is adjusted and/or the utterance tone quality is switched on the basis of the acquired position information and the location situation data included in the position information, so that synthesized speech can be output.

The eighth invention of the present invention is a navigation system in which
route information providing means that provides position information on the route traveled by a moving body using a predetermined communication system, navigation means that is mounted on the moving body and is configured to be capable of outputting, on the basis of predetermined road map information, a route based on the information provided by the route information providing means, speech recognition means configured to be capable of recognizing speech, control switching means configured to be capable of acquiring position information and of switching control, a switching acoustic model database comprising a plurality of acoustic models, and acoustic model control switching means for selecting one acoustic model from the plurality of acoustic models are configured to be able to input and output data among one another,
wherein, on the basis of the acquired current position information and the location situation data included in the position information, the acoustic model control switching means selects, from the plurality of acoustic models, an acoustic model suited to the location situation data and the position information.

The ninth invention of the present invention is a navigation system that is further configured to be able to input and output data to and from a noise suppression characteristic database, the database being configured to receive the output signals of a plurality of operating units provided in the moving body and storing, based on the output signal of each operating unit, noise suppression characteristic data for the noise information specific to that operating unit,
wherein, on the basis of the output signals from the operating units, the information of the noise suppression characteristic database is added to the acoustic model selected from the plurality of acoustic models, and speech recognition processing is executed.

The tenth invention of the present invention is a navigation system in which
route information providing means that provides position information on the route traveled by a moving body using a predetermined communication system, navigation means that is mounted on the moving body and is configured to be capable of outputting, on the basis of predetermined road map information, a route based on the information provided by the route information providing means, speech synthesis means configured to be capable of synthesizing phonemes and outputting them as speech, utterance volume/tone quality switching means configured to be capable of adjusting the utterance volume and/or switching the utterance tone quality, and control switching means configured to be capable of acquiring position information and of switching control are configured to be able to input and output data among one another,
wherein the system is configured to be capable of adjusting the utterance volume and/or switching the utterance tone quality on the basis of the current position information and the location situation data included in the position information.

The eleventh invention of the present invention is a navigation system in which
route information providing means that provides position information on the route traveled by a moving body using a predetermined communication system,
navigation means that is mounted on the moving body and is configured to be capable of outputting, on the basis of predetermined road map information, a route based on the information provided by the route information providing means,
speech recognition means configured to be capable of recognizing speech,
a switching acoustic model database comprising a plurality of acoustic models,
acoustic model control switching means for selecting one acoustic model from the plurality of acoustic models,
speech synthesis means configured to be capable of synthesizing phonemes and outputting them as speech,
utterance volume/tone quality switching means configured to be capable of adjusting the utterance volume and/or switching the utterance tone quality, and
control switching means configured to be capable of acquiring position information and of switching control between the speech recognition means and the speech synthesis means
are configured to be able to input and output data among one another,
wherein, when the speech recognition means is selected by the control switching means, the acoustic model control switching means selects, from the plurality of acoustic models, an acoustic model suited to the location situation data and the position information on the basis of the acquired position information and the location situation data included in the position information, and speech recognition processing is executed, and
when the speech synthesis means is selected by the control switching means, the utterance volume is adjusted and/or the utterance tone quality is switched on the basis of the acquired position information and the location situation data included in the position information, and the synthesized speech is output.

The twelfth invention of the present invention is a moving body equipped with a navigation system in which
route information providing means that provides position information on the route to be traveled using a predetermined communication system, navigation means configured to be capable of outputting, on the basis of predetermined road map information, a route based on the information provided by the route information providing means, speech recognition means configured to be capable of recognizing speech, control switching means configured to be capable of acquiring position information and of switching control, a switching acoustic model database comprising a plurality of acoustic models, and acoustic model control switching means for selecting one acoustic model from the plurality of acoustic models are configured to be able to input and output data among one another, and in which, on the basis of the acquired position information and the location situation data included in the position information, the acoustic model control switching means selects, from the plurality of acoustic models, an acoustic model suited to the location situation data and the position information.

In the twelfth invention, the moving body typically has a plurality of operating units and is further configured to be able to input and output data to and from a noise suppression characteristic database, the database being configured to receive the output signals of the plurality of operating units and storing, based on the output signal of each operating unit, noise suppression characteristic data for the noise information specific to that operating unit. On the basis of the output signals from the operating units, the information of the noise suppression characteristic database is added to the acoustic model selected from the plurality of acoustic models, and speech recognition processing is executed.

The thirteenth invention of the present invention is a moving body equipped with a navigation system in which
route information providing means that provides position information on the route traveled by the moving body using a predetermined communication system, navigation means that is mounted on the moving body and is configured to be capable of outputting, on the basis of predetermined road map information, a route based on the information provided by the route information providing means, speech synthesis means configured to be capable of synthesizing phonemes and outputting them as speech, utterance volume/tone quality switching means configured to be capable of adjusting the utterance volume and/or switching the utterance tone quality, and control switching means configured to be capable of acquiring position information and of switching control are configured to be able to input and output data among one another,
and which is configured to be capable of adjusting the utterance volume and/or switching the utterance tone quality on the basis of the current position information and the location situation data included in the position information.

The fourteenth invention of the present invention is a moving body equipped with a navigation system having:
route information providing means that provides position information on a route using a predetermined communication system;
navigation means configured to be capable of outputting, on the basis of predetermined road map information, a route based on the information provided by the route information providing means;
speech recognition means configured to be capable of recognizing speech;
a switching acoustic model database comprising a plurality of acoustic models;
acoustic model control switching means for selecting one acoustic model from the plurality of acoustic models;
speech synthesis means configured to be capable of synthesizing phonemes and outputting them as speech;
utterance volume/tone quality switching means configured to be capable of adjusting the utterance volume and/or switching the utterance tone quality; and
control switching means configured to be capable of acquiring position information and of switching control between the speech recognition means and the speech synthesis means,
wherein, when the speech recognition means is selected by the control switching means, the acoustic model control switching means selects, from the plurality of acoustic models, an acoustic model suited to the location situation data and the position information on the basis of the acquired position information and the location situation data included in the position information, so that speech recognition processing can be executed, and
when the speech synthesis means is selected by the control switching means, the utterance volume is adjusted and/or the utterance tone quality is switched on the basis of the acquired position information and the location situation data included in the position information, so that the synthesized speech can be output.

In the present invention, speech synthesis processing and speech recognition processing can be executed independently of each other or in parallel at the same time. That is, the control switching means can cause the speech recognition processing by the speech recognition means and the speech synthesis processing by the speech synthesis means to be executed either selectively, one at a time, or simultaneously.
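
A control switching step that runs recognition and synthesis either one at a time or in parallel could be sketched as below. The mode names, the placeholder recognize and synthesize functions, and the use of a thread pool are assumptions made for illustration; the specification only states that both selective and simultaneous execution are possible.

```python
from concurrent.futures import ThreadPoolExecutor


def recognize(audio: str) -> str:
    """Placeholder for speech recognition processing (hypothetical)."""
    return f"recognized:{audio}"


def synthesize(text: str) -> str:
    """Placeholder for speech synthesis processing (hypothetical)."""
    return f"speech:{text}"


def control_switch(mode: str, audio: str = "", text: str = "") -> dict:
    """Run recognition, synthesis, or both, depending on the selected mode."""
    if mode == "recognize":
        return {"recognition": recognize(audio)}
    if mode == "synthesize":
        return {"synthesis": synthesize(text)}
    if mode == "both":
        # Submit both tasks so that they can proceed concurrently.
        with ThreadPoolExecutor(max_workers=2) as pool:
            rec = pool.submit(recognize, audio)
            syn = pool.submit(synthesize, text)
            return {"recognition": rec.result(), "synthesis": syn.result()}
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    print(control_switch("both", audio="go home", text="Turn left ahead"))
```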

According to the present invention configured as described above, an acoustic model suited to the location situation data is selected from the plurality of acoustic models on the basis of the acquired position information and location situation data, and speech recognition processing and speech synthesis processing are executed accordingly. This makes it possible to perform accurate speech recognition and to output synthesized speech that is easy for the user to hear, in accordance with the location and without a time lag.

As described above, according to the present invention, the acoustic model control switching means selects, from a plurality of acoustic models, an acoustic model suited to the location situation data and the position information, on the basis of the acquired position information and the location situation data included in it. Before noise actually occurs, the system can therefore anticipate it and prepare a change to the output volume of synthesized speech, or to the acoustic model used for speech recognition, as the situation requires. The user can thus be provided with smooth and effective speech synthesis and speech recognition services.

Hereinafter, an embodiment of the present invention will be described with reference to the drawings. In all the drawings of the embodiment, identical or corresponding parts are denoted by the same reference numerals. FIG. 1 shows the overall configuration of a system including a car navigation device installed in a vehicle according to this embodiment.

(Navigation system)
As shown in FIG. 1, the vehicle-mounted navigation device 1 according to this embodiment is connected to a microphone 2 through which speech can be input, a speaker 3 through which speech can be output, an input unit 4 serving as input means through which desired instructions can be given to the navigation device 1, and a display unit 5 serving as output means for outputting data.

The navigation device 1 contains a speech recognition unit 7, which serves both as speech recognition means and as an acoustic model control switching unit that selects and switches acoustic models, and a speech synthesis unit 6, which serves as speech synthesis means. It also has a navigation ECU 8, the control unit of the navigation means, which processes data input from the input unit 4, displays desired data on the display unit 5, and performs the various other navigation functions. The device is further connected to a GPS 9 equipped with a position sensor, so that a current position measurement unit 10 can uniquely determine the current position of the vehicle.

A server 20 and an information providing service provider (hereinafter, ASP) 30, which provides information useful for navigation, are connected to each other via a network 31 and can exchange various kinds of traffic information data.

The server 20 includes a voice communication processing unit 21 and a data communication processing unit 22. The voice communication processing unit 21 includes a speech synthesis processing unit 21a and a speech recognition processing unit 21b. In addition to these, various other processing units (not shown), such as a voice/data synchronization control unit, are provided to implement other functions.

The traffic information server 30, a group of information providing ASPs, includes, for example, a map information ASP that can provide route information based on a map information database 30a, a traffic information ASP that can provide road congestion and accident information based on a traffic information database 30b, and an event information ASP that has an event information database 30c and can provide information on various events.

In this embodiment, the navigation device navigates the travel route using, among these kinds of information, the position information data supplied from the map information database 30a and the traffic information database 30b, together with the location situation data included in that position information data.

The navigation device 1 connects to the server 20 through a communication unit 11 and a communication network 12, and can send and receive voice and data. The communication unit 11 may also be made detachable from the vehicle, so that it is placed near the navigation device 1 while the user is in the vehicle and is carried and used by the user at other times.

(Speech recognition/synthesis device)
Next, the speech recognition unit and the speech synthesis unit provided in the car navigation system of this embodiment, configured as described above, will be described. FIG. 2 shows how the car navigation system having the speech recognition unit and the speech synthesis unit is connected to the vehicle.

As shown in FIG. 2, the voice interactive car navigation system according to this embodiment includes a speech recognition/synthesis device 100. Of the components of the car navigation system other than the speech recognition/synthesis device, FIG. 2 shows only the navigation ECU 8, which performs information processing.

The vehicle (not shown) is provided with an air conditioner ECU 102 configured to control the air conditioner; a wiper drive unit 103 that drives the wipers and outputs a wiper drive signal when doing so; an engine ECU 104 configured to detect at least the engine speed; a power window drive unit 105 that opens and closes the power windows and outputs a control signal when doing so; and a winker drive unit 106 that outputs a control signal when the winkers (turn signals) blink.

The speech recognition/synthesis device 100 is provided with a voice control switching unit 101, which serves as the control switching means for voice-related processing, together with the speech recognition unit 7 and the speech synthesis unit 6.

The voice control switching unit 101 is connected to the navigation ECU 8, the air conditioner ECU 102, the wiper drive unit 103, the engine ECU 104, the power window drive unit 105, and the winker drive unit 106. It may also be connected to other operating or drive units as needed.

Within the speech recognition/synthesis device 100, the voice control switching unit 101 is also connected to the speech recognition unit 7 and the speech synthesis unit 6.

When speech uttered by the user is input, for example to give an instruction to the car navigation system, the voice control switching unit 101 switches to the speech recognition unit 7 and speech recognition processing is executed. When speech is to be output to the user, for example for route guidance, the voice control switching unit 101 switches to the speech synthesis unit 6 and speech synthesis processing is executed.

The speech recognition unit 7 can exchange data with an acoustic model database 121 stored on recording means such as a hard disk. The acoustic model database 121 comprises an acoustic model switching database (acoustic model switching DB) 122 and an acoustic model addition database (acoustic model addition DB) 123.

The acoustic model switching DB 122 stores data for a plurality of acoustic models corresponding to the various situations in which speech recognition is performed. In this embodiment, the acoustic models are a highway acoustic model 122a for driving on a highway, a general road acoustic model 122b for driving on ordinary roads, an uphill acoustic model 122c for driving uphill, and a mountain road acoustic model 122d for driving on mountain roads.

The situations covered by acoustic models are not limited to these. It is desirable to build acoustic models for as many situations as possible and store them in the acoustic model switching DB 122.

Specific examples include a road construction acoustic model for driving on a road under construction, a rainfall acoustic model for driving in rain, a snowfall acoustic model for driving in snow, a tire chain acoustic model for driving with tire chains fitted, an unpaved road acoustic model for driving on unpaved roads, a sandy beach acoustic model for driving on a beach, and a shopping street acoustic model for driving through a shopping street.

In this embodiment, the speech recognition unit 7 selects, from the acoustic models stored in the acoustic model switching DB 122, the one model best suited to the situation in which speech recognition is being performed, and uses it as the acoustic model for speech recognition processing.
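The selection of a single acoustic model from the switching DB can be sketched roughly as a keyed lookup. This is only an illustrative sketch: the situation labels, the model identifiers (which merely reuse the reference numerals as names), and the fallback to the general road model are assumptions, not details given in the specification.

```python
from typing import Mapping

# Hypothetical stand-in for the acoustic model switching DB 122.
# Keys are assumed situation labels; values are assumed model identifiers.
ACOUSTIC_MODEL_SWITCHING_DB: Mapping[str, str] = {
    "highway": "122a_highway",
    "general_road": "122b_general_road",
    "uphill": "122c_uphill",
    "mountain_road": "122d_mountain_road",
}

# Assumed fallback when no dedicated model exists (not stated in the source).
DEFAULT_SITUATION = "general_road"


def select_acoustic_model(
    situation: str,
    db: Mapping[str, str] = ACOUSTIC_MODEL_SWITCHING_DB,
) -> str:
    """Return exactly one model identifier for the given road situation."""
    return db.get(situation, db[DEFAULT_SITUATION])
```

Under these assumptions, `select_acoustic_model("highway")` yields the highway model, while an unlisted situation such as `"sandy_beach"` falls back to the general road model until a dedicated model is stored.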

The acoustic model addition DB 123 stores data that is added, as noise suppression characteristics, to the selected acoustic model when the speech recognition unit 7 selects and uses one of the acoustic models stored in the acoustic model switching DB 122.

The acoustic model addition database 123 is built from multiple sets of noise suppression characteristic data designed to reduce the noise generated when various mechanisms mounted on the vehicle are operating. The operation of the engine or the air conditioner, for example, produces characteristic noise whose properties can be known in advance.

For example, the frequency of the noise produced by the engine varies with the engine speed, but this variation simply follows the engine speed, so the noise can be measured and characterized in advance.

Likewise, the noise produced when the air conditioner is running varies with factors such as the airflow, but it too is so-called known noise that can be measured and characterized in advance.

Similarly, the noise produced when the wipers are driven, when the power windows are operated to open or close a window, or when the winkers are operating is also known noise.

Because the noise suppression characteristics for these mechanisms correspond to noise generated by their operation, the operating state of each mechanism can be detected from the drive signals output by the air conditioner ECU 102, the wiper drive unit 103, the engine ECU 104, the power window drive unit 105, and the winker drive unit 106.

In this embodiment, the drive signals output from the air conditioner ECU 102, the wiper drive unit 103, the engine ECU 104, the power window drive unit 105, and the winker drive unit 106 are supplied to the voice control switching unit 101. In response to these drive signals, the speech recognition unit 7 extracts the corresponding noise suppression characteristic data from the acoustic model addition DB 123 and adds it to the acoustic model in use at that time. Known noise can thus be suppressed, which makes noise suppression more effective.
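The specification does not say how noise suppression characteristic data is represented or what "adding" it to an acoustic model means numerically. As one possible reading, the sketch below assumes each characteristic is a per-band correction (in dB, over three illustrative frequency bands) that is summed element-wise onto the model in use. The device names, band layout, and values are all made up for illustration.

```python
from typing import Dict, Iterable, List

# Hypothetical stand-in for the acoustic model addition DB 123.
# Each entry is an assumed per-band correction in dB; values are invented.
NOISE_SUPPRESSION_DB: Dict[str, List[float]] = {
    "engine": [6.0, 3.0, 0.0],           # cf. 123a
    "air_conditioner": [2.0, 2.0, 1.0],  # cf. 123b
    "wiper": [0.0, 1.0, 2.0],            # cf. 123c
    "power_window": [1.0, 1.0, 1.0],     # cf. 123d
    "winker": [0.0, 0.0, 1.0],           # cf. 123e
}


def add_noise_suppression(base: List[float], active: Iterable[str]) -> List[float]:
    """Add the profile of every active device to the base model's bands.

    `base` is the assumed per-band representation of the selected acoustic
    model; `active` lists the devices whose drive signals are present.
    """
    result = list(base)  # copy so the stored base model is not modified
    for device in active:
        profile = NOISE_SUPPRESSION_DB[device]
        result = [b + p for b, p in zip(result, profile)]
    return result
```

With the engine running and the wipers on, the two profiles are both applied to whichever acoustic model is currently selected; with no drive signals present, the model is used unchanged.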

(Speech recognition processing)
Next, the speech recognition processing method in the car navigation system equipped with the speech recognition device of this embodiment, configured as described above, will be described. FIG. 3 shows the speech recognition processing method according to this embodiment.

As shown in FIG. 3, in the speech recognition method according to this embodiment, first, in step ST1, the current position information of the vehicle and travel route information are supplied to the voice control switching unit 101 via the in-vehicle LAN (local area network).

Specifically, the current position measurement unit 10 of the navigation system acquires the current position of the vehicle, and information on the road to be traveled beyond the current position, so-called road condition data, which is included in the traffic information data, is acquired from the traffic information database 30b of the traffic information server 30 through the communication unit 11. This provides the information needed to select and retrieve an acoustic model stored in the acoustic model switching DB 122.

Next, in step ST2, operation information on the devices currently running in the vehicle is likewise supplied to the voice control switching unit 101 via the in-vehicle LAN. That is, drive signals are supplied to the voice control switching unit 101 from the air conditioner ECU 102, the wiper drive unit 103, the engine ECU 104, the power window drive unit 105, the winker drive unit 106, and so on.

For example, while the vehicle is running, the engine ECU 104 supplies control signals containing engine speed data to the parts of the engine, and the same control signals containing engine speed data are supplied to the voice control switching unit 101. Likewise, if it starts raining while driving, the wipers are normally turned on, so the wiper drive unit 103 supplies a drive signal to the wipers (not shown) and to the voice control switching unit 101.

Through steps ST1 and ST2 above, the current position of the vehicle is recognized and information on the road the vehicle is about to travel is obtained. Steps ST1 and ST2 may also be executed in the reverse order.

Next, in step ST3, it is determined whether the road or driving state immediately ahead, that is, the driving situation, is about to change. If the road is not about to change, the process proceeds to step ST5, described later. If the road is about to change, the process proceeds to step ST4.

In step ST4, the acoustic model best suited to the noise expected to occur is retrieved on the basis of the travel route information acquired in step ST1, and the speech recognition unit 7 switches to that acoustic model.

As a concrete example, suppose the vehicle is traveling on a general road and the travel route information acquired in step ST1 indicates that it will shortly be on a highway. In step ST4, the acoustic model used for speech recognition is then switched from the general road acoustic model 122b to the highway acoustic model 122a. The switch is timed to coincide with the vehicle's entry onto the highway from the general road.

The process then proceeds to step ST5. According to the drive signals acquired by the speech recognition unit 7 in step ST2, the applicable noise suppression characteristic data (for example, the engine noise suppression characteristic data 123a and the wiper noise suppression characteristic data 123c) is retrieved from the noise suppression characteristic data in the acoustic model addition DB 123. In FIG. 2, this data comprises engine noise suppression characteristic data 123a, air conditioner noise suppression characteristic data 123b, wiper noise suppression characteristic data 123c, window opening/closing noise suppression characteristic data 123d, and winker noise suppression characteristic data 123e.

The noise suppression characteristic data retrieved in this way, such as the engine noise suppression characteristic data 123a, is added to the acoustic model in use at that point, and speech recognition processing continues. For example, if the highway acoustic model 122a is in use after a switch from the general road acoustic model 122b, the engine noise suppression characteristic for suppressing noise corresponding to the engine speed is added to the highway acoustic model 122a, and speech recognition processing continues.

In this way, on the basis of the current position information and information on the road to be traveled next, the noise that will arise from the road conditions is anticipated in advance, the acoustic model best suited to that noise is retrieved, and the speech recognition unit 7 selects that model and executes speech recognition processing.
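Steps ST1 to ST5 described above can be summarized in a short sketch. The state class, the function name, and the use of plain strings for road types and device names are assumptions made for illustration; the specification only defines the order of the steps and the decision at ST3.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class RecognizerState:
    """Assumed container for what the recognizer is currently using."""
    model: str = "general_road"                             # selected acoustic model
    suppression: List[str] = field(default_factory=list)    # active noise profiles


def recognition_step(
    state: RecognizerState,
    upcoming_road: str,        # ST1: road type immediately ahead (hypothetical label)
    drive_signals: Set[str],   # ST2: devices whose drive signals are present
) -> RecognizerState:
    # ST3: does the driving situation change immediately ahead?
    if upcoming_road != state.model:
        # ST4: switch to the model suited to the expected noise.
        state.model = upcoming_road
    # ST5: pick the suppression data matching the active drive signals;
    # this runs whether or not the model was switched.
    state.suppression = sorted(drive_signals)
    return state
```

Note that ST5 is reached on both branches of ST3, which matches the flow in the text: the noise suppression data is refreshed from the drive signals even when the road type is unchanged.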

(Speech synthesis processing)
Next, speech synthesis processing will be described. FIG. 4 shows a flowchart of the speech synthesis processing according to this embodiment.

As shown in FIG. 4, in this embodiment, as in the speech recognition processing, the current position information and travel route information are acquired through the in-vehicle LAN in step ST11, and in step ST12 drive signals are supplied to the voice control switching unit 101 from the air conditioner ECU 102, the wiper drive unit 103, the engine ECU 104, the power window drive unit 105, the winker drive unit 106, and so on.

Next, in step ST13, it is determined whether at least one of the following is about to occur: a change in the driving situation, or a change in the on/off state of a drive unit.

If the determination finds no change in the road and no change in the operation of the drive units, speech synthesis processing is executed with the current settings. If, as in the speech recognition processing, the driving situation is about to change or the operation of any drive unit has changed, the process proceeds to step ST14.

In step ST14, the utterance volume and/or the utterance sound quality is adjusted according to the change of travel route indicated by the location situation data and/or the operation of each drive unit indicated by the drive signals.

For example, suppose position information and location situation data indicating that the route of a vehicle on a general road is about to change to a mountain road are supplied to the voice control switching unit 101. Noise often increases when the route changes from a general road to a mountain road. A signal from the voice control switching unit 101 is therefore supplied to the speech synthesis unit 6, and on the basis of a control signal from the speech synthesis unit 6, the utterance volume / utterance sound quality switching unit 107 raises or lowers the volume output from the speaker 3. Instead of changing the volume, the sound quality may be switched, for example from a male voice to a female voice.

Similarly, if it starts raining while driving and the wipers are turned on, the wiper drive unit 103 outputs a wiper drive signal, which is supplied to the voice control switching unit 101. During speech synthesis processing, this drive signal is passed on to the speech synthesis unit 6, which then supplies a control signal to the utterance volume / utterance sound quality switching unit 107, and that unit adjusts the utterance volume. The utterance sound quality could be switched instead of adjusting the volume in response to such changes in the drive units, but because the state of each drive unit changes over short intervals, adjusting the utterance volume is preferable.
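The decision in step ST13 and the volume adjustment in step ST14 can be sketched as follows. The gain tables, their values, and the idea of summing a road-dependent and a device-dependent offset are assumptions for illustration only; the specification states only that the volume is raised or lowered according to the road change and the drive signals.

```python
from typing import Dict, FrozenSet, Optional, Tuple

# Hypothetical volume offsets in dB; all values are invented.
ROAD_GAIN_DB: Dict[str, int] = {"general_road": 0, "highway": 4, "mountain_road": 3}
DEVICE_GAIN_DB: Dict[str, int] = {"wiper": 2, "air_conditioner": 1, "power_window": 3}

Situation = Tuple[str, FrozenSet[str]]  # (road type, active drive units)


def target_volume_db(road: str, devices: FrozenSet[str]) -> int:
    """Assumed rule: road offset plus the offset of every active device."""
    return ROAD_GAIN_DB.get(road, 0) + sum(DEVICE_GAIN_DB.get(d, 0) for d in devices)


def synthesis_step(current: Situation, upcoming: Situation) -> Optional[int]:
    """Return the new volume offset, or None to keep the current settings."""
    # ST13: has the driving situation or any drive unit's on/off state changed?
    if current == upcoming:
        return None
    # ST14: adjust the utterance volume for the new situation.
    return target_volume_db(*upcoming)
```

A sound quality switch (for example, between voices) could be returned alongside the offset in the same way, though the text above suggests reserving that for road changes rather than short-lived drive unit changes.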

Speech recognition processing and speech synthesis processing according to the present invention are executed as described above. The two kinds of processing may be executed in parallel, or may be switched between as appropriate.

An embodiment of the present invention has been described in detail above, but the present invention is not limited to that embodiment, and various modifications based on the technical idea of the invention are possible.

For example, the acoustic models and noise suppression characteristic data given in the embodiment above are merely examples, and different acoustic models and noise suppression characteristic data may be used as needed.

For example, the embodiment above uses a so-called local speech recognition/synthesis device, in which the speech recognition/synthesis device 100 mounted on the vehicle contains the speech synthesis unit 6 and the speech recognition unit 7, which execute speech synthesis processing and speech recognition processing respectively. As shown in FIG. 5, however, the speech synthesis processing and speech recognition processing according to the present invention can instead be executed by the speech synthesis processing unit 21a and the speech recognition processing unit 21b of the server 20, respectively.

That is, the acoustic model switching database and the acoustic model addition database, which in the embodiment above are stored in the speech recognition/synthesis device of the navigation system mounted on the vehicle, are stored in a predetermined storage unit (not shown) of the speech recognition processing unit 21b of the server 20. The speech recognition processing performed by the speech recognition unit 7 shown in FIG. 2 is then executed by the speech recognition processing unit 21b of the server 20, and the speech synthesis processing performed by the speech synthesis unit 6 is executed by the speech synthesis processing unit 21a.

The input data for speech recognition processing, input from the microphone 2, and the synthesized output data to be output from the speaker 3 can then be exchanged with the server 20 via the communication unit 11 and the communication network 12.

In such a distributed configuration, the vehicle needs only an ordinary navigation device and a portable terminal or mobile phone capable of communicating with the outside, and the actual processing is executed on the server 20. Operation by speech recognition, and driving guidance or event guidance by synthesized speech, can therefore be introduced without greatly changing how ordinary mobile phone calls are made or how the navigation device is operated.

In the embodiment above, the speech synthesis unit and the speech recognition unit are provided in a single device, but they may also be implemented as separate hardware. In that case, a speech recognition device and a speech synthesis device may be installed in the vehicle as separate units, or only one of the two may be installed.

FIG. 1 is a diagram showing the configuration of a communication-type navigation system according to an embodiment of the present invention.
FIG. 2 is a block diagram showing a speech recognition/synthesis device according to an embodiment of the present invention.
FIG. 3 is a flowchart showing a speech recognition processing method according to an embodiment of the present invention.
FIG. 4 is a flowchart showing a speech synthesis processing method according to an embodiment of the present invention.
FIG. 5 is a diagram showing the configuration of a communication-type navigation system according to another example of an embodiment of the present invention.

Explanation of symbols

1 Navigation device
2 Microphone
3 Speaker
4 Input unit
5 Display unit
6 Speech synthesis unit
7 Speech recognition unit
8 Navigation ECU
9 GPS
10 Current position measuring unit
11 Communication unit
12 Communication network
20 Server
21 Voice communication processing unit
21a Speech synthesis processing unit
21b Speech recognition processing unit
22 Data communication processing unit
30 Traffic information server
30a Map information database
30b Traffic information database
30c Event information database
31 Network
100 Speech recognition and synthesis device
101 Voice control switching unit
102 Air conditioner ECU
103 Wiper drive unit
104 Engine ECU
105 Power window drive unit
106 Winker drive unit
107 Utterance sound quality switching unit
121 Acoustic model database
122 Acoustic model switching database (acoustic model switching DB)
122a Expressway acoustic model
122b General road acoustic model
122c Uphill acoustic model
122d Mountain road acoustic model
123 Acoustic model addition database (acoustic model addition DB)
123a Engine noise suppression characteristic data
123b Air conditioner noise suppression characteristic data
123c Noise suppression characteristic data
123d Window opening/closing noise suppression characteristic data
123e Winker noise suppression characteristic data
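
The relationship between the acoustic model switching DB 122 and the acoustic model addition DB 123 can be sketched as two lookups. The following Python is a minimal illustration under stated assumptions: the dictionary keys, the `RecognitionSetup` type, and the `select_setup` function are hypothetical names, the database entries are represented by label strings rather than real model data, and the pairing of each operating unit with an entry is inferred from the entry names. Entry 123c is left out because the list above does not say which operating unit it belongs to.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

# Acoustic model switching DB (122): one model per location situation.
ACOUSTIC_MODEL_SWITCHING_DB = {
    "expressway": "122a expressway acoustic model",
    "general_road": "122b general road acoustic model",
    "uphill": "122c uphill acoustic model",
    "mountain_road": "122d mountain road acoustic model",
}

# Acoustic model addition DB (123): noise suppression data per operating unit.
ACOUSTIC_MODEL_ADDITION_DB = {
    "engine": "123a engine noise suppression characteristic data",
    "air_conditioner": "123b air conditioner noise suppression characteristic data",
    "power_window": "123d window opening/closing noise suppression characteristic data",
    "winker": "123e winker noise suppression characteristic data",
}


@dataclass(frozen=True)
class RecognitionSetup:
    acoustic_model: str
    added_noise_data: Tuple[str, ...]


def select_setup(location_situation: str,
                 active_units: Iterable[str]) -> RecognitionSetup:
    """Pick one acoustic model, then add data for units that are operating."""
    model = ACOUSTIC_MODEL_SWITCHING_DB[location_situation]
    added = tuple(
        ACOUSTIC_MODEL_ADDITION_DB[unit]
        for unit in sorted(set(active_units))   # stable, duplicate-free order
        if unit in ACOUSTIC_MODEL_ADDITION_DB   # ignore units with no entry
    )
    return RecognitionSetup(model, added)


setup = select_setup("expressway", ["winker", "engine"])
```

Exactly one model is selected from the switching DB, while any number of entries from the addition DB may be layered on top, depending on which operating units report that they are running.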

Claims

A speech recognition device comprising:
speech recognition means configured to be capable of recognizing speech;
control switching means configured to be capable of acquiring position information and of switching control;
a switching acoustic model database comprising a plurality of acoustic models; and
acoustic model control switching means for selecting one acoustic model from the plurality of acoustic models,
wherein, based on the acquired position information and the location situation data included in the position information, the acoustic model control switching means selects, from the plurality of acoustic models, an acoustic model suited to the location situation data and the position information, and speech recognition processing can be executed using the selected acoustic model.

The speech recognition device according to claim 1, configured to be capable of receiving output signals output from a plurality of operating units,
further comprising a noise suppression characteristic database storing, based on the output signal of each operating unit, noise suppression characteristic data for noise information specific to that operating unit,
wherein, based on the output signal from an operating unit that is in operation, information from the noise suppression characteristic database is added to the acoustic model selected from the plurality of acoustic models, and the speech recognition processing is executed.

A speech recognition system in which
speech recognition means configured to recognize speech,
control switching means configured to be capable of acquiring position information and of switching control,
a switching acoustic model database comprising a plurality of acoustic models, and
acoustic model control switching means for selecting one acoustic model from the plurality of acoustic models
are configured to transmit and receive data to and from one another,
wherein, based on the acquired position information and the location situation data included in the position information, the acoustic model control switching means selects, from the plurality of acoustic models, an acoustic model suited to the location situation data and the position information.

The speech recognition system according to claim 3, configured to be capable of receiving output signals from a plurality of operating units,
further comprising a noise suppression characteristic database storing, based on the output signal of each operating unit, noise suppression characteristic data for noise information specific to that operating unit,
wherein, based on the output signal from an operating unit that is in operation, information from the noise suppression characteristic database is added to the acoustic model selected from the plurality of acoustic models, and the speech recognition processing is executed.

A speech recognition method using
speech recognition means configured to recognize speech,
a switching acoustic model database comprising a plurality of acoustic models,
acoustic model control switching means for selecting one acoustic model from the switching acoustic model database, and
control switching means configured to be capable of acquiring position information including location situation data and of switching control,
wherein, based on the location situation data, the acoustic model control switching means selects, from the plurality of acoustic models, an acoustic model that matches the location situation data.

The speech recognition method according to claim 5, wherein output signals from a plurality of operating units can be received,
the method further using a noise suppression characteristic database storing, based on the output signal of each operating unit, noise suppression characteristic data for noise information specific to that operating unit,
wherein, based on the output signal from an operating unit that is in operation, information from the noise suppression characteristic database is added to the acoustic model selected from the plurality of acoustic models, and the speech recognition processing is executed.

A speech synthesizer comprising:
speech synthesis means configured to synthesize phonemes and output them as speech data;
utterance volume and sound quality switching means configured to be capable of adjusting the utterance volume and/or switching the utterance sound quality; and
control switching means configured to be capable of acquiring position information and of switching control,
wherein the adjustment of the utterance volume and/or the switching of the utterance sound quality can be executed based on the current position information and the location situation data included in the position information.

A speech synthesis system in which
speech synthesis means configured to synthesize phonemes and output them as speech data,
utterance volume and sound quality switching means configured to be capable of adjusting the utterance volume and/or switching the utterance sound quality, and
control switching means configured to be capable of acquiring position information including location situation data and of executing control switching
are configured to be capable of transmitting and receiving data to and from one another,
wherein the adjustment of the utterance volume and/or the switching of the utterance sound quality can be executed based on the acquired current position information and the location situation data.

A speech synthesis method using
speech synthesis means configured to synthesize phonemes and output them as speech,
utterance volume and sound quality switching means configured to be capable of adjusting the utterance volume and/or switching the utterance sound quality, and
control switching means configured to be capable of acquiring position information including location situation data and of executing control switching,
wherein, when speech is output, the utterance volume is adjusted and/or the utterance sound quality is switched based on the current position information and the location situation data.

A speech recognition and synthesis device comprising:
speech recognition means configured to recognize speech;
a switching acoustic model database comprising a plurality of acoustic models;
acoustic model control switching means for selecting one acoustic model from the plurality of acoustic models;
speech synthesis means configured to synthesize phonemes and output them as speech;
utterance volume and sound quality switching means configured to be capable of adjusting the utterance volume and/or switching the utterance sound quality; and
control switching means configured to be capable of acquiring position information and of executing control switching between the speech recognition means and the speech synthesis means,
wherein, when the speech recognition means is selected by the control switching means, the acoustic model control switching means selects, from the plurality of acoustic models, an acoustic model suited to the location situation data and the position information, based on the acquired position information and the location situation data included in the position information, so that speech recognition processing can be executed, and
wherein, when the speech synthesis means is selected by the control switching means, the adjustment of the utterance volume and/or the switching of the utterance sound quality is executed based on the acquired position information and the location situation data included in the position information, and synthesized speech is output.

A navigation system in which
route information providing means for providing, using a predetermined communication system, position information relating to a route along which a moving body moves,
navigation means mounted on the moving body and configured to be capable of outputting, based on predetermined road map information, the route based on the information provided by the route information providing means,
speech recognition means configured to be capable of recognizing speech,
control switching means configured to be capable of acquiring the position information and of switching control,
a switching acoustic model database comprising a plurality of acoustic models, and
acoustic model control switching means for selecting one acoustic model from the plurality of acoustic models
are configured to be capable of inputting and outputting data among one another,
wherein, based on the acquired current position information and the location situation data included in the position information, the acoustic model control switching means selects, from the plurality of acoustic models, an acoustic model suited to the location situation data and the position information.

The navigation system according to claim 11, configured to be capable of receiving output signals output from a plurality of operating units provided in the moving body, and further configured to be capable of retrieving, from a noise suppression characteristic database, noise suppression characteristic data for noise information specific to each operating unit based on the output signal of that operating unit,
wherein, based on the output signal from an operating unit, the retrieved information of the noise suppression characteristic database is added to the acoustic model selected from the plurality of acoustic models, and the speech recognition processing is executed.

A navigation system in which
route information providing means for providing, using a predetermined communication system, position information relating to a route along which a moving body moves,
navigation means mounted on the moving body and configured to be capable of outputting, based on predetermined road map information, the route based on the information provided by the route information providing means,
speech synthesis means configured to be capable of outputting phonemes as speech,
utterance volume and sound quality switching means configured to be capable of adjusting the utterance volume and/or switching the utterance sound quality, and
control switching means configured to be capable of acquiring position information and of executing control switching
are configured to be capable of inputting and outputting data among one another,
wherein the adjustment of the utterance volume and/or the switching of the utterance sound quality can be executed based on the current position information and the location situation data included in the position information.

A navigation system in which
route information providing means for providing, using a predetermined communication system, position information relating to a route along which a moving body moves,
navigation means mounted on the moving body and configured to be capable of outputting, based on predetermined road map information, the route based on the information provided by the route information providing means,
speech recognition means configured to recognize speech,
a switching acoustic model database comprising a plurality of acoustic models,
acoustic model control switching means for selecting one acoustic model from the plurality of acoustic models,
speech synthesis means configured to synthesize phonemes and output them as speech,
utterance volume and sound quality switching means configured to be capable of adjusting the utterance volume and/or switching the utterance sound quality, and
control switching means configured to be capable of acquiring the position information and of executing control switching between the speech recognition means and the speech synthesis means
are configured to be capable of inputting and outputting data among one another,
wherein, when the speech recognition means is selected by the control switching means, the acoustic model control switching means selects, from the plurality of acoustic models, an acoustic model suited to the location situation data and the position information, based on the acquired position information and the location situation data included in the position information, and speech recognition processing is executed, and
wherein, when the speech synthesis means is selected by the control switching means, the adjustment of the utterance volume and/or the switching of the utterance sound quality is executed based on the acquired position information and the location situation data included in the position information, and synthesized speech is output.

A moving body comprising a navigation system in which
route information providing means for providing, using a predetermined communication system, position information relating to a route of movement,
navigation means configured to be capable of outputting, based on predetermined road map information, the route based on the information provided by the route information providing means,
speech recognition means configured to be capable of recognizing speech,
control switching means configured to be capable of acquiring the position information and of switching control,
a switching acoustic model database comprising a plurality of acoustic models, and
acoustic model control switching means for selecting one acoustic model from the plurality of acoustic models
are configured to be capable of inputting and outputting data among one another,
wherein, based on the acquired position information and the location situation data included in the position information, the acoustic model control switching means selects, from the plurality of acoustic models, an acoustic model suited to the location situation data and the position information.

The moving body according to claim 15, having a plurality of operating units,
configured to be capable of receiving output signals output from the plurality of operating units, and configured to be capable of inputting and outputting data to and from a noise suppression characteristic database that stores, based on the output signal of each operating unit, noise suppression characteristic data for noise information specific to each of the plurality of operating units,
wherein, based on the output signal from an operating unit, information from the noise suppression characteristic database is added to the acoustic model selected from the plurality of acoustic models, so that the speech recognition processing can be executed.

A moving body comprising a navigation system in which
route information providing means for providing, using a predetermined communication system, position information relating to a route along which the moving body moves,
navigation means mounted on the moving body and configured to be capable of outputting, based on predetermined road map information, the route based on the information provided by the route information providing means,
speech synthesis means configured to be capable of outputting phonemes as speech,
utterance volume and sound quality switching means configured to be capable of adjusting the utterance volume and/or switching the utterance sound quality, and
control switching means configured to be capable of acquiring position information and of executing control switching
are configured to be capable of inputting and outputting data among one another,
wherein the adjustment of the utterance volume and/or the switching of the utterance sound quality can be executed based on the current position information and the location situation data included in the position information.

A moving body comprising a navigation system that includes:
route information providing means for providing, using a predetermined communication system, position information relating to a route;
navigation means configured to be capable of outputting, based on predetermined road map information, the route based on the information provided by the route information providing means;
speech recognition means configured to recognize speech;
a switching acoustic model database comprising a plurality of acoustic models;
acoustic model control switching means for selecting one acoustic model from the plurality of acoustic models;
speech synthesis means configured to synthesize phonemes and output them as speech;
utterance volume and sound quality switching means configured to be capable of adjusting the utterance volume and/or switching the utterance sound quality; and
control switching means configured to be capable of acquiring position information and of executing control switching between the speech recognition means and the speech synthesis means,
wherein, when the speech recognition means is selected by the control switching means, the acoustic model control switching means selects, from the plurality of acoustic models, an acoustic model suited to the location situation data and the position information, based on the acquired position information and the location situation data included in the position information, so that speech recognition processing can be executed, and
wherein, when the speech synthesis means is selected by the control switching means, the adjustment of the utterance volume and/or the switching of the utterance sound quality is executed based on the acquired position information and the location situation data included in the position information, and synthesized speech is output.