JP4425718B2 - Voice recognition device for vehicles

Info

Publication number: JP4425718B2
Application number: JP2004175666A
Authority: JP
Inventors: 達哉京光; 俊哉鹿野
Original assignee: Honda Motor Co Ltd
Current assignee: Honda Motor Co Ltd
Priority date: 2004-06-14
Filing date: 2004-06-14
Publication date: 2010-03-03
Anticipated expiration: 2024-06-14
Also published as: JP2005352397A

Description

The present invention relates to a vehicle voice recognition apparatus that recognizes input voice and converts it into information that a machine can understand.

Conventionally, some voice recognition apparatuses that recognize input voice and convert it into machine-understandable information add noise of known character, for example noise resembling the average spectrum of speech, to the input voice, and recognize the voice by collating feature parameters obtained from the noise-added input voice with standard patterns of speech. Specifically, by deliberately adding noise whose properties are well understood (noise with properties close to those of the human voice) to the input voice, the influence of noise whose properties are not well understood, such as environmental noise picked up by the microphone and electrical noise (close to white noise) superimposed on the input system, is reduced. This makes it possible to realize a voice recognition apparatus that achieves a stable, high recognition rate even when various kinds of environmental noise enter the input voice through the microphone or when electrical noise from the input circuit is superimposed (see, for example, Patent Document 1).
Patent Document 1: JP-A-6-43892

The conventional voice recognition apparatus, however, does not take into account the case where the apparatus is mounted on a vehicle such as an automobile. Specifically, when a voice recognition apparatus is mounted on a vehicle, the amount of noise in the vehicle interior differs between when the vehicle is stopped and when it is traveling. While the vehicle is traveling in particular, stationary noise is always present in the vehicle interior, so deliberately adding noise to the input voice at all times, as the conventional apparatus does, has little effect and is largely wasteful.
On the other hand, the occupants of the driver's seat and the front passenger seat may speak at the same time. When the apparatus is trying to recognize the voice of the driver's-seat occupant, for example, the voice of the front-passenger-seat occupant becomes non-stationary noise that interferes with recognition of the driver's voice. Because the vehicle interior is quiet when the vehicle is stopped, there is also a demand to add stationary noise, as in the conventional apparatus, in order to actively cancel out such non-stationary noise.

The present invention has been made in view of the above problems, and its object is to provide a vehicle voice recognition apparatus capable of recognizing voice efficiently and accurately in accordance with the surrounding environment.

In order to solve the above problems, the vehicle voice recognition apparatus according to the invention of claim 1 comprises: voice input means capable of receiving input by voice (for example, the voice input unit 4 of the embodiment described later); voice recognition means that performs voice recognition on the voice input by the voice input means (for example, the voice recognition unit 6 or the voice recognition unit 7 of the embodiment described later); stationary noise amount determination means that determines whether the amount of stationary noise in the vehicle interior is equal to or greater than a predetermined value (for example, the in-vehicle stationary noise determination unit 1 of the embodiment described later); stationary noise addition determination means that, when the stationary noise amount determination means determines that the amount of stationary noise in the vehicle interior is smaller than the predetermined value, determines whether stationary noise needs to be added to the voice input by the voice input means (for example, the noise addition necessity determination unit 2 of the embodiment described later); and stationary noise addition means that, when the stationary noise addition determination means determines that stationary noise needs to be added to the voice input by the voice input means, adds stationary noise to that voice (for example, the stationary noise generation unit 3 and the stationary noise addition unit 5 of the embodiment described later). The stationary noise addition determination means includes occupant detection means that detects occupants in the vehicle interior (for example, the seating sensor of the embodiment described later), and determines, based on the number of occupants in the vehicle interior detected by the occupant detection means or on the positions of the occupants in the vehicle interior, whether stationary noise needs to be added to the voice input by the voice input means.

In the vehicle voice recognition apparatus having the above configuration, when the voice recognition means performs voice recognition on the voice input by the voice input means and the stationary noise amount determination means determines that the amount of stationary noise in the vehicle interior is smaller than the predetermined value, the stationary noise addition determination means determines whether stationary noise needs to be added to the input voice. When the stationary noise addition determination means determines that stationary noise needs to be added, the stationary noise addition means adds stationary noise to the input voice before voice recognition is performed. As a result, when the amount of stationary noise in the vehicle interior is smaller than the predetermined value and stationary noise needs to be added, voice recognition is performed on the voice to which stationary noise has been added; in all other cases, voice recognition is performed on the voice exactly as input by the voice input means.
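The two-stage gating described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `prepare_recognition_input`, the use of plain Python lists of samples, and the two boolean inputs (standing in for the outputs of the two determination means) are all assumptions made for the example.

```python
def prepare_recognition_input(voice, noise, stationary_noise_high, addition_needed):
    """Return (signal_to_recognize, noise_was_added).

    voice, noise: lists of float samples (hypothetical representation).
    stationary_noise_high: result of the stationary noise amount determination.
    addition_needed: result of the stationary noise addition determination.
    """
    # Noise is added only when the cabin is quiet AND addition is judged necessary.
    if not stationary_noise_high and addition_needed:
        if len(noise) < len(voice):
            raise ValueError("noise must be at least as long as voice")
        return [v + n for v, n in zip(voice, noise)], True
    # In every other case the voice is passed to recognition unchanged.
    return list(voice), False
```

Only one of the four combinations of the two flags leads to noise being added, which is the point of the configuration: the addition step is skipped whenever the cabin already contains enough stationary noise.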

The vehicle voice recognition apparatus according to the invention of claim 2 is the vehicle voice recognition apparatus according to claim 1, wherein the stationary noise addition determination means includes occupant detection means that detects occupants in the vehicle interior (for example, the seating sensor of the embodiment described later), and determines, based on the number of occupants in the vehicle interior detected by the occupant detection means or on the positions of the occupants in the vehicle interior, whether stationary noise needs to be added to the voice input by the voice input means.

In addition, because the stationary noise addition determination means determines whether stationary noise needs to be added to the input voice based on the number of occupants in the vehicle interior detected by the occupant detection means or on the positions of those occupants, the apparatus can predict the occurrence and influence of voice uttered by occupants other than the speaker whose voice is to be recognized, that is, non-stationary noise that interferes with recognition of that speaker's voice, and can add stationary noise to the input voice when necessary.

The vehicle voice recognition apparatus according to the invention of claim 2 is the vehicle voice recognition apparatus according to claim 1, wherein the stationary noise addition determination means determines, based on the operating state of an acoustic device provided in the vehicle interior, whether stationary noise needs to be added to the voice input by the voice input means.

In the vehicle voice recognition apparatus having the above configuration, the stationary noise addition determination means determines whether stationary noise needs to be added to the input voice based on the operating state of the acoustic device provided in the vehicle interior. The apparatus can thereby predict the occurrence of sound output by the acoustic device, that is, non-stationary noise that interferes with recognition of the voice of the speaker to be recognized, and can add stationary noise to the input voice when necessary.

The vehicle voice recognition apparatus according to the invention of claim 3 is the vehicle voice recognition apparatus according to claim 1 or 2, wherein the voice recognition means comprises: standard pattern storage means that stores a plurality of standard patterns relating to speech (for example, the standard pattern storage unit 15 of the embodiment described later); standard pattern selection means that selects, from the standard pattern storage means, the standard pattern to be used for voice recognition (for example, the standard pattern selection unit 16 of the embodiment described later); and voice collation means that recognizes the voice by collating the voice input by the voice input means with the standard pattern selected by the standard pattern selection means (for example, the collation unit 14 of the embodiment described later). The standard pattern selection means selects the standard pattern based on the determination results of the stationary noise amount determination means and the stationary noise addition determination means.

In the vehicle voice recognition apparatus having the above configuration, the standard pattern selection means provided in the voice recognition means selects the standard pattern to be used for voice recognition from the standard pattern storage means, which stores a plurality of standard patterns relating to speech, based on the determination results of the stationary noise amount determination means and the stationary noise addition determination means. The voice collation means then recognizes the voice by collating the voice input by the voice input means with the selected standard pattern. In this way, whether stationary noise has been added to the voice is determined from the determination results of the stationary noise amount determination means and the stationary noise addition determination means, and the standard pattern used for voice recognition can be changed to an appropriate one.

According to the vehicle voice recognition apparatus of claim 1, when the amount of stationary noise in the vehicle interior is smaller than the predetermined value and stationary noise needs to be added, voice recognition is performed on the voice to which stationary noise has been added; in all other cases, voice recognition is performed on the voice exactly as input by the voice input means.
Accordingly, stationary noise is added before voice recognition only when the vehicle interior is quiet, stationary noise is low, and non-stationary noise that interferes with recognition of the target speaker's voice is conspicuous. When stationary noise does not need to be added, for example when the vehicle interior is quiet and no non-stationary noise is occurring, or when sufficient stationary noise is already present in the vehicle interior, voice recognition is performed without adding noise unnecessarily. This yields a vehicle voice recognition apparatus that can recognize voice efficiently and accurately in accordance with the environment surrounding the apparatus.

Furthermore, based on the number of occupants in the vehicle interior or the positions of those occupants, the apparatus can predict the occurrence and influence of non-stationary noise that interferes with recognition of the target speaker's voice, and can add stationary noise to the input voice when necessary.
Accordingly, the state of the vehicle interior is judged appropriately, stationary noise is added to the input voice when necessary, and the voice recognition rate is improved.

According to the vehicle voice recognition apparatus of claim 2, the apparatus can predict, based on the operating state of the acoustic device provided in the vehicle interior, the occurrence of non-stationary noise that interferes with recognition of the target speaker's voice, and can add stationary noise to the input voice when necessary.
Accordingly, the state of the vehicle interior is judged appropriately, stationary noise is added to the input voice when necessary, and the voice recognition rate is improved.

According to the vehicle voice recognition apparatus of claim 3, whether stationary noise has been added to the voice is determined from the determination results of the stationary noise amount determination means and the stationary noise addition determination means, and the standard pattern used for voice recognition can be changed to an appropriate one.
Accordingly, a standard pattern suited to the incoming voice and to the environment surrounding the apparatus is used. For example, when the vehicle interior is quiet and no non-stationary noise that interferes with recognition of the target speaker's voice is occurring, the standard pattern intended for quiet conditions is used; when stationary noise has been added to the voice, a standard pattern created with stationary noise added is used. This further improves the voice recognition rate and yields a vehicle voice recognition apparatus that can recognize voice efficiently and accurately.
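A minimal sketch of how such a selection might look is given below. The text above names two cases (a quiet-condition pattern and a pattern created with stationary noise added); the third label, `"high_cabin_noise"`, is purely a placeholder assumption for the case where the cabin's own stationary noise is already high. All names and labels are hypothetical.

```python
def select_standard_pattern_set(stationary_noise_high, noise_added):
    """Pick which stored set of standard patterns to collate against.

    stationary_noise_high: result of the stationary noise amount determination.
    noise_added: True if stationary noise was added to the input voice.
    Returns a hypothetical label identifying the pattern set.
    """
    if noise_added:
        # Patterns that were created with the same stationary noise added.
        return "noise_added"
    if stationary_noise_high:
        # Assumption: not specified in the passage above; placeholder only.
        return "high_cabin_noise"
    # Quiet cabin, no interfering non-stationary noise expected.
    return "quiet"
```

Matching the pattern set to the condition of the input keeps the comparison consistent: voice with added noise is collated against patterns that contain the same noise.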

Embodiments of the present invention will be described below with reference to the drawings.

(Overall configuration)
FIG. 1 is a block diagram showing the configuration of a vehicle voice recognition apparatus according to a first embodiment of the present invention.
In FIG. 1, the vehicle voice recognition apparatus of this embodiment includes an in-vehicle stationary noise determination unit 1 and a noise addition necessity determination unit 2. The in-vehicle stationary noise determination unit 1 determines the amount of stationary noise generated in the interior of the vehicle on which the apparatus is mounted. The noise addition necessity determination unit 2 determines whether stationary noise of known character needs to be added to the input voice, based on the determination result of the in-vehicle stationary noise determination unit 1 and on the presence or absence of non-stationary noise that hinders voice recognition, such as the voice of a person other than the speaker to be recognized, as estimated from the presence and number of fellow passengers.

The vehicle voice recognition apparatus of this embodiment also includes: a stationary noise generation unit 3 that generates stationary noise of known character to be added to the input voice; a voice input unit 4 equipped with a microphone or the like for acquiring voice; a stationary noise addition unit 5 that, based on the determination result of the noise addition necessity determination unit 2, adds the stationary noise generated by the stationary noise generation unit 3 to the voice input through the voice input unit 4; and a voice recognition unit 6 that recognizes the voice output by the stationary noise addition unit 5 by comparing it with standard patterns stored in advance.

Describing the voice recognition unit 6 in more detail, it includes: an analysis unit 11 that analyzes the voice output by the stationary noise addition unit 5 using, for example, linear prediction analysis; a feature parameter extraction unit 12 that obtains, for example, LPC (linear prediction) cepstrum coefficients as feature parameters of the voice from the analysis result output by the analysis unit 11; and a standard pattern storage unit 13 that stores in advance the standard patterns of speech to be compared with the feature parameters output by the feature parameter extraction unit 12.
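As an illustration of what linear prediction analysis followed by LPC cepstrum extraction can involve, the sketch below uses the textbook autocorrelation method, the Levinson-Durbin recursion, and the standard LPC-to-cepstrum recursion. The description does not specify which algorithms the analysis unit 11 and the feature parameter extraction unit 12 use, so these choices, the function names, and the omission of windowing and pre-emphasis are all assumptions.

```python
def autocorrelation(frame, order):
    """Autocorrelation r[0..order] of one frame of samples."""
    n = len(frame)
    return [sum(frame[t] * frame[t - lag] for t in range(lag, n))
            for lag in range(order + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a[1..order] (x[n] ~ sum a_k x[n-k]).

    Returns (coefficients, residual_error).
    """
    a = [0.0] * (order + 1)  # a[0] is unused
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:
            break  # degenerate (e.g. silent) frame; keep remaining coefficients at 0
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= 1.0 - k * k
    return a[1:], error


def lpc_to_cepstrum(a, n_cepstra):
    """Cepstrum c[1..n_cepstra] of the all-pole model 1 / (1 - sum a_k z^-k)."""
    p = len(a)
    c = [0.0] * (n_cepstra + 1)  # c[0] is unused
    for n in range(1, n_cepstra + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


def lpc_cepstrum(frame, order=10, n_cepstra=12):
    """Feature vector for one frame: LPC cepstrum coefficients."""
    coeffs, _ = levinson_durbin(autocorrelation(frame, order), order)
    return lpc_to_cepstrum(coeffs, n_cepstra)
```

Applying `lpc_cepstrum` to successive frames of the input would give the time series of feature parameters that is later collated with the standard patterns; the default order and number of coefficients are illustrative values only.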

The voice recognition unit 6 further includes a collation unit 14 that performs pattern matching between the time-series data of the feature parameters output by the feature parameter extraction unit 12 and the standard patterns of speech output by the standard pattern storage unit 13, and outputs, as the voice recognition result, the speech corresponding to the standard pattern with the greatest similarity to the feature parameters. The standard patterns stored in the standard pattern storage unit 13 are created in advance, for each utterance to be recognized, using data for standard pattern creation.

(Voice recognition processing)
Next, the voice recognition processing of the vehicle voice recognition apparatus of this embodiment will be described with reference to the drawings. FIG. 2 is a flowchart showing the voice recognition processing operation of the vehicle voice recognition apparatus of this embodiment.
In FIG. 2, when the user of the apparatus presses the talk switch in order to speak and inputs voice (step S1), the in-vehicle stationary noise determination unit 1 measures the amount of stationary noise in the vehicle interior and determines whether it is equal to or greater than a predetermined value (step S2). Specifically, for example, the unit measures the amount of stationary noise in the vehicle interior at the time the talk switch is pressed and the user's utterance begins, and determines whether the S/N ratio between the user's voice and the stationary noise in the vehicle interior is equal to or less than a threshold TH1. If the S/N ratio is equal to or less than TH1, the unit determines that the amount of stationary noise in the vehicle interior is equal to or greater than the predetermined value. Alternatively, the determination may be made directly from whether the traveling speed Vs of the vehicle is equal to or greater than a predetermined value TH2.
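The step S2 determination can be sketched as below. The description gives no numerical values for TH1 or TH2 and does not say how power is measured, so the default thresholds (15 dB and 20 km/h), the use of mean power values, and the function names are assumptions chosen only to make the example concrete.

```python
import math


def snr_db(speech_power, noise_power):
    """S/N ratio in decibels from mean speech power and mean noise power."""
    return 10.0 * math.log10(speech_power / noise_power)


def stationary_noise_is_high(speech_power, noise_power, th1_db=15.0):
    """Step S2: noise counts as 'at or above the predetermined value'
    when the S/N ratio is at or below TH1 (th1_db is a hypothetical value)."""
    if noise_power <= 0.0:
        return False  # no measurable stationary noise
    return snr_db(speech_power, noise_power) <= th1_db


def stationary_noise_is_high_by_speed(speed_kmh, th2_kmh=20.0):
    """Alternative check: traveling speed Vs at or above TH2 (hypothetical value)."""
    return speed_kmh >= th2_kmh
```

Note the direction of the comparison: a low S/N ratio means a large amount of stationary noise, so the branch that skips noise addition is taken when the ratio is at or below TH1.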

If, in step S2, the in-vehicle stationary noise determination unit 1 determines that the amount of stationary noise in the vehicle interior is not equal to or greater than the predetermined value (NO in step S2), the noise addition necessity determination unit 2 then determines whether stationary noise of known character needs to be added to the voice input through the voice input unit 4 (step S3). Specifically, the noise addition necessity determination unit 2 detects the number of occupants in the vehicle interior using occupant detection means such as seating sensors installed in the vehicle seats, and determines whether stationary noise of known character needs to be added to the input voice based on the number of occupants (fellow passengers) other than the speaker to be recognized (the user of the apparatus) and on the positions of the occupants in the vehicle interior.

For example, if there is an occupant other than the user of the apparatus, the voice of that other person may be input as non-stationary noise that interferes with recognition of the voice of the speaker to be recognized, so the unit determines that stationary noise of known character needs to be added in order to cancel it out.

Even in such a case, if, for example, the microphone (voice input unit 4) of the apparatus is installed in the instrument panel or center console at the front of the vehicle interior and the occupant other than the speaker to be recognized (the user of the apparatus) is in a rear seat far from the microphone, the influence of non-stationary noise on voice recognition can be presumed to be small, so the noise addition necessity determination unit 2 can determine that stationary noise does not need to be added to the input voice. When the determination is based on the positions of the occupants in the vehicle interior, it is made from the actual installation position of the microphone (voice input unit 4) and the positions of the occupants.

The noise addition necessity determination unit 2 may also determine whether stationary noise needs to be added to the input voice based on the number of speakers whose voices are input to the voice input unit 4 at the same time. That is, even if an occupant other than the speaker to be recognized (the user of the apparatus) is present in the vehicle interior, there is no non-stationary noise interfering with recognition of the target speaker's voice unless that occupant is speaking. Therefore, when the number of speakers whose voices are simultaneously input to the voice input unit 4 can be recognized as one, the noise addition necessity determination unit 2 determines that stationary noise does not need to be added to the input voice. When the number of simultaneous speakers can be recognized as more than one, it determines that stationary noise needs to be added.

In the above description, the source of non-stationary noise that interferes with voice recognition is an occupant other than the speaker to be recognized (the user of the apparatus). However, the output sound of an acoustic device provided in the vehicle interior, such as a television, radio, or audio system, can also be regarded as non-stationary noise that interferes with voice recognition. The noise addition necessity determination unit 2 may therefore determine whether stationary noise needs to be added to the input voice based on the operating state of the acoustic device provided in the vehicle interior. When the acoustic device is not operating, the unit determines that stationary noise does not need to be added to the input voice; when the acoustic device is operating, it determines that stationary noise needs to be added.

The noise addition necessity determination unit 2 may determine whether stationary noise needs to be added to the input voice using only one of the conditions described above, namely the number and positions of occupants other than the speaker to be recognized, the number of speakers whose voices are simultaneously input to the voice input unit 4, or the operating state of the acoustic device provided in the vehicle interior, or it may use a combination of these conditions.
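The step S3 conditions can be sketched as a small decision function. The description allows the conditions to be used singly or in combination but does not say how they are combined; the sketch below assumes a logical OR over the enabled conditions. The seat labels, the `CabinState` structure, and the assumption that only front seats are near the microphone are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Assumption: microphone in the instrument panel, so only front seats are "near".
SEATS_NEAR_MICROPHONE = frozenset({"driver", "front_passenger"})


@dataclass
class CabinState:
    occupied_seats: Set[str] = field(default_factory=set)  # from seating sensors
    talker_seat: str = "driver"                  # seat of the speaker to be recognized
    simultaneous_speakers: Optional[int] = None  # None if not estimated
    audio_device_on: bool = False                # TV, radio, audio system, etc.


def other_occupant_near_microphone(state):
    """True if someone other than the target speaker sits near the microphone."""
    others = set(state.occupied_seats) - {state.talker_seat}
    return bool(others & SEATS_NEAR_MICROPHONE)


def noise_addition_needed(state, use_occupants=True,
                          use_speaker_count=False, use_audio_state=False):
    """Step S3: decide whether stationary noise should be added.

    Enabled conditions are combined with OR (an assumption of this sketch).
    """
    votes = []
    if use_occupants:
        votes.append(other_occupant_near_microphone(state))
    if use_speaker_count and state.simultaneous_speakers is not None:
        votes.append(state.simultaneous_speakers > 1)
    if use_audio_state:
        votes.append(state.audio_device_on)
    return any(votes)
```

For example, `noise_addition_needed(CabinState(occupied_seats={"driver", "rear_left"}))` evaluates to `False`, reflecting the case where the only other occupant sits in a rear seat far from the microphone.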

If, in step S3, the noise addition necessity determination unit 2 determines that stationary noise needs to be added to the voice input through the voice input unit 4 (YES in step S3), the stationary noise addition unit 5 adds the stationary noise of known character generated by the stationary noise generation unit 3 to that voice (step S4). Specifically, as illustrated in FIG. 3, which shows the input voice and the stationary noise to be added, stationary noise such as that shown in FIG. 3(b) is added to the input voice signal shown in FIG. 3(a), thereby reducing the influence on voice recognition of the non-stationary noise in part A or part B of the signal shown in FIG. 3(a).
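The addition in step S4 amounts to a sample-by-sample sum of the input voice and a generated noise signal. The sketch below is a rough stand-in: it produces low-pass filtered Gaussian noise scaled to a target RMS level, which only crudely approximates noise with a speech-like average spectrum. The generator design, parameter values, and function names are assumptions, not the method of the stationary noise generation unit 3.

```python
import math
import random


def make_stationary_noise(n_samples, rms=0.05, smoothing=0.9, seed=0):
    """Generate n_samples of low-pass filtered Gaussian noise at the given RMS.

    A fixed seed makes the noise reproducible, i.e. its character is known.
    """
    rng = random.Random(seed)
    samples, previous = [], 0.0
    for _ in range(n_samples):
        # One-pole low-pass filter to tilt the spectrum toward low frequencies.
        previous = smoothing * previous + (1.0 - smoothing) * rng.gauss(0.0, 1.0)
        samples.append(previous)
    if not samples:
        return []
    current_rms = math.sqrt(sum(v * v for v in samples) / n_samples)
    scale = rms / current_rms if current_rms > 0.0 else 0.0
    return [v * scale for v in samples]


def add_stationary_noise(voice, noise):
    """Step S4: add the generated noise to the input voice, sample by sample."""
    if len(noise) < len(voice):
        raise ValueError("noise must be at least as long as voice")
    return [v + n for v, n in zip(voice, noise)]
```

Because the added noise has a constant, known level, short bursts of interfering sound are partly masked by it, which is the effect described for parts A and B of the signal.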

Once the stationary noise addition unit 5 has added stationary noise to the voice input through the voice input unit 4, the analysis unit 11 performs linear prediction analysis and the feature parameter extraction unit 12 extracts the feature parameters of the input voice (step S5).
Once the feature parameters have been extracted, the collation unit 14 performs pattern matching between the time-series data of the feature parameters output by the feature parameter extraction unit 12 and the standard patterns of speech output by the standard pattern storage unit 13 (step S6), outputs the speech corresponding to the standard pattern with the greatest similarity to the feature parameters as the voice recognition result (step S7), and ends the voice recognition processing.
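Steps S6 and S7 can be illustrated with a simple template matcher. The description does not name the pattern matching algorithm, so dynamic time warping (DTW) with Euclidean frame distance is used here as an assumed stand-in, with the smallest distance treated as the greatest similarity. The function names and the dictionary of standard patterns are hypothetical.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Dynamic time warping distance between two feature-vector sequences."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(features, standard_patterns):
    """Return the label of the standard pattern closest to the input features.

    standard_patterns: dict mapping a word label to its feature sequence.
    """
    return min(standard_patterns,
               key=lambda label: dtw_distance(features, standard_patterns[label]))
```

DTW tolerates differences in speaking rate between the input and the stored pattern, which is why it is a common choice for this kind of template matching.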

When the in-vehicle stationary noise determination unit 1 determines in step S2 that the amount of stationary noise in the vehicle interior is equal to or greater than the predetermined value (YES in step S2), no stationary noise is added to the input voice. Instead, the analysis unit 11 performs linear prediction analysis directly on the voice input through the voice input unit 4, and the feature parameter extraction unit 12 extracts the feature parameters of the input voice (step S5).

Likewise, when the noise addition necessity determination unit 2 determines in step S3 that stationary noise does not need to be added to the voice input through the voice input unit 4 (NO in step S3), no stationary noise is added to the input voice. The analysis unit 11 performs linear prediction analysis directly on the voice input through the voice input unit 4, and the feature parameter extraction unit 12 extracts the feature parameters of the input voice (step S5).
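The branching of steps S2 to S4 described above can be summarized in a short Python sketch. The function and parameter names are hypothetical, and the result of step S3 is passed in as a boolean because its internal criteria are described separately.

```python
def should_add_stationary_noise(cabin_noise_level, threshold, addition_required):
    """Decide whether step S4 (noise addition) is executed.

    cabin_noise_level: measured stationary noise amount in the cabin
    threshold:         the predetermined value used in step S2
    addition_required: result of the step S3 judgment (assumed boolean)
    """
    if cabin_noise_level >= threshold:
        return False  # YES at S2: cabin is already noisy, go straight to S5
    return addition_required  # S3: YES leads to S4, NO goes straight to S5


# Quiet cabin, other talkers present: noise is added before analysis.
add_noise = should_add_stationary_noise(0.2, threshold=1.0, addition_required=True)
```

In every branch the flow then continues with feature extraction in step S5; only the presence of the added noise differs.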

As described above, in the vehicle voice recognition apparatus of this embodiment, when the in-vehicle stationary noise determination unit 1 determines that the amount of stationary noise in the vehicle interior is not equal to or greater than the predetermined value, the noise addition necessity determination unit 2 determines whether stationary noise of known characteristics needs to be added to the voice input through the voice input unit 4. It makes this determination from the number and positions of occupants other than the speaker whose voice is to be recognized, the number of speakers whose voices are input to the voice input unit 4 simultaneously, the operating state of audio equipment installed in the vehicle interior, and the like. If it is determined that stationary noise needs to be added to the input voice, the stationary noise addition unit 5 adds stationary noise of known characteristics, generated by the stationary noise generation unit 3, to the input voice. Next, the analysis unit 11 performs linear prediction analysis, and the feature parameter extraction unit 12 extracts the feature parameters of the input voice. The matching unit 14 then performs pattern matching between the time-series data of the feature parameters output by the feature parameter extraction unit 12 and the voice standard patterns output by the standard pattern storage unit 13, and outputs the voice corresponding to the standard pattern with the highest similarity to the feature parameters as the voice recognition result.

As a result, when the amount of stationary noise in the vehicle interior is smaller than the predetermined value and stationary noise needs to be added, voice recognition is performed on the voice to which stationary noise has been added. In all other cases, voice recognition is performed on the voice exactly as input through the voice input unit 4.
Accordingly, stationary noise is added before recognition only when the vehicle interior is quiet with little stationary noise, for example when the vehicle is stopped, and non-stationary noise that interferes with recognition of the target speaker's voice, such as a fellow passenger's voice, is conspicuous. When stationary noise does not need to be added, for example when the vehicle interior is quiet and no non-stationary noise is occurring, or when sufficient stationary noise is already present because the vehicle is traveling, the voice is recognized without adding noise unnecessarily. This yields a vehicle voice recognition apparatus that can recognize voice efficiently and accurately according to the environment around the apparatus.

In addition, the noise addition necessity determination unit 2 determines whether stationary noise of known characteristics needs to be added to the voice input through the voice input unit 4 from the number and positions of occupants other than the speaker whose voice is to be recognized, the number of speakers whose voices are input to the voice input unit 4 simultaneously, the operating state of audio equipment installed in the vehicle interior, and the like. This makes it possible to predict the occurrence and influence of non-stationary noise that interferes with recognition of the target speaker's voice, such as fellow passengers' voices or sound output by audio equipment in the vehicle interior, or to detect the occurrence of such non-stationary noise directly, and to add stationary noise to the input voice when needed.
Accordingly, the conditions under which stationary noise should be added are judged more accurately, and adding stationary noise to the input voice under those conditions improves the voice recognition rate.
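One way to picture the judgment made by the noise addition necessity determination unit 2 is the Python sketch below. The passage lists the inputs (other occupants, simultaneous speakers, audio equipment state) but not the exact rule that combines them, so the simple "any source of non-stationary noise is present" rule, the CabinState type, and all field names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class CabinState:
    """Hypothetical snapshot of the inputs available to unit 2."""
    other_occupants: int        # occupants other than the target speaker
    simultaneous_speakers: int  # speakers heard at the voice input at once
    audio_device_on: bool       # operating state of in-cabin audio equipment


def addition_required(state):
    """Assumed rule: add noise if any source of non-stationary noise is likely."""
    return (
        state.other_occupants > 0          # passengers may start talking
        or state.simultaneous_speakers > 1  # overlapping speech detected
        or state.audio_device_on            # radio or music is playing
    )


# Driver alone with the radio on: non-stationary noise is expected.
needs_noise = addition_required(CabinState(0, 1, True))
```

The first and third conditions correspond to predicting non-stationary noise, while the second corresponds to detecting it directly.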

Next, a second embodiment of the present invention will be described.
(Overall configuration)
FIG. 4 is a block diagram showing the configuration of the vehicle voice recognition apparatus according to the second embodiment of the present invention.
In FIG. 4, the vehicle voice recognition apparatus of this embodiment includes an in-vehicle stationary noise determination unit 1, a noise addition necessity determination unit 2, a stationary noise generation unit 3, a voice input unit 4, a stationary noise addition unit 5, and a voice recognition unit 7. The in-vehicle stationary noise determination unit 1, the noise addition necessity determination unit 2, the stationary noise generation unit 3, the voice input unit 4, and the stationary noise addition unit 5 are identical to the corresponding components of the vehicle voice recognition apparatus of the first embodiment shown in FIG. 1, so their description is omitted.

The voice recognition unit 7 is described in more detail as follows. The voice recognition unit 7 includes an analysis unit 11 that analyzes the voice output by the stationary noise addition unit 5 using, for example, linear prediction analysis; a feature parameter extraction unit 12 that obtains, for example, LPC (linear prediction) cepstrum coefficients from the analysis result output by the analysis unit 11 as the feature parameters of the voice; and a standard pattern storage unit 15 that stores in advance the voice standard patterns to be compared with the feature parameters output by the feature parameter extraction unit 12.
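As an illustration of what the analysis unit 11 and the feature parameter extraction unit 12 compute, the following Python sketch derives LPC coefficients with the autocorrelation method and the Levinson-Durbin recursion, then converts them to LPC cepstrum coefficients with the standard recursion for an all-pole model. The patent does not specify these details, so the method, the function names, and the test signal are assumptions. The sketch uses the convention that sample x[n] is predicted as the sum of a[k] * x[n-k], and it assumes a frame with non-zero energy.

```python
def autocorrelation(frame, order):
    """Autocorrelation r[0..order] of one analysis frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t + k] for t in range(n - k))
            for k in range(order + 1)]


def lpc(frame, order):
    """LPC predictor coefficients a[1..order] via Levinson-Durbin."""
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]               # prediction error energy (must be non-zero)
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def lpc_cepstrum(a, n_ceps):
    """Cepstrum c[1..n_ceps] of the all-pole model 1 / (1 - sum a_k z^-k)."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] is unused
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


# Test signal: decaying exponential, i.e. a first-order all-pole response.
frame = [0.5 ** t for t in range(40)]
coeffs = lpc(frame, order=1)        # approximately [0.5]
ceps = lpc_cepstrum(coeffs, 3)      # approximately [0.5, 0.125, 0.0417]
```

In a recognizer this would be applied frame by frame, giving the time series of feature vectors that the matching unit 14 compares against the standard patterns.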

Multiple types of voice standard patterns are stored in advance in the standard pattern storage unit 15. One type is created in advance, for each voice to be recognized, from unmodified standard pattern creation data. Another type is created in advance, for each voice to be recognized, from standard pattern creation data to which noise of the same nature as the stationary noise generated by the stationary noise generation unit 3 has been added at a fixed ratio.

The voice recognition unit 7 further includes a standard pattern selection unit 16 and a matching unit 14. Based on the determination results of the in-vehicle stationary noise determination unit 1 and the noise addition necessity determination unit 2, the standard pattern selection unit 16 instructs the standard pattern storage unit 15 which of the multiple types of standard patterns it stores is to be used for voice recognition. The matching unit 14 performs pattern matching between the time-series data of the feature parameters output by the feature parameter extraction unit 12 and the voice standard patterns that the standard pattern storage unit 15 outputs according to the selection made by the standard pattern selection unit 16, and outputs the voice corresponding to the standard pattern with the highest similarity to the feature parameters as the voice recognition result.

(Voice recognition processing)
Next, the voice recognition processing of the vehicle voice recognition apparatus of this embodiment will be described with reference to the drawings. FIG. 5 is a flowchart showing the voice recognition processing operation of the vehicle voice recognition apparatus of this embodiment.
In FIG. 5, when the user of the vehicle voice recognition apparatus of this embodiment presses the talk switch and inputs voice in order to speak (step S11), the in-vehicle stationary noise determination unit 1 measures the amount of stationary noise in the vehicle interior and determines whether it is equal to or greater than a predetermined value (step S12). The method of judging the amount of stationary noise in the vehicle interior in step S12 is the same as the processing of step S2 in the vehicle voice recognition apparatus of the first embodiment.

When the in-vehicle stationary noise determination unit 1 determines in step S12 that the amount of stationary noise in the vehicle interior is not equal to or greater than the predetermined value (NO in step S12), the noise addition necessity determination unit 2 then determines whether stationary noise of known characteristics needs to be added to the voice input through the voice input unit 4 (step S13). The method of judging in step S13 whether stationary noise needs to be added to the input voice is the same as the processing of step S3 in the vehicle voice recognition apparatus of the first embodiment.

When the noise addition necessity determination unit 2 determines in step S13 that stationary noise needs to be added to the voice input through the voice input unit 4 (YES in step S13), the stationary noise addition unit 5 adds stationary noise of known characteristics, generated by the stationary noise generation unit 3, to that voice (step S14), as described for the first embodiment with reference to FIG. 3.
Once the stationary noise addition unit 5 has added the stationary noise to the voice input through the voice input unit 4, the analysis unit 11 performs linear prediction analysis and the feature parameter extraction unit 12 extracts the feature parameters of the input voice (step S15).

When the in-vehicle stationary noise determination unit 1 determines in step S12 that the amount of stationary noise in the vehicle interior is equal to or greater than the predetermined value (YES in step S12), no stationary noise is added to the input voice. The analysis unit 11 performs linear prediction analysis directly on the voice input through the voice input unit 4, and the feature parameter extraction unit 12 extracts the feature parameters of the input voice (step S15).

When stationary noise has been added in step S14 to the voice input through the voice input unit 4, or when it has been determined in step S12 that the amount of stationary noise in the vehicle interior is equal to or greater than the predetermined value, the voice subject to recognition contains stationary noise. The standard pattern selection unit 16 therefore, based on the determination result of the in-vehicle stationary noise determination unit 1 or the noise addition necessity determination unit 2, instructs the standard pattern storage unit 15 to use standard pattern 1 for voice recognition (step S16). Among the multiple types of standard patterns stored in the standard pattern storage unit 15, standard pattern 1 is the one created from standard pattern creation data to which noise of the same nature as the stationary noise generated by the stationary noise generation unit 3 has been added at a fixed ratio.

Once the feature parameters have been extracted and the standard pattern to be used for voice recognition has been specified, the matching unit 14 performs pattern matching between the time-series data of the feature parameters output by the feature parameter extraction unit 12 and standard pattern 1, which the standard pattern storage unit 15 outputs according to the selection made by the standard pattern selection unit 16 (step S17). It then outputs the voice corresponding to the standard pattern with the highest similarity to the feature parameters as the voice recognition result (step S18) and ends the voice recognition process.

When the noise addition necessity determination unit 2 determines in step S13 that stationary noise does not need to be added to the voice input through the voice input unit 4 (NO in step S13), no stationary noise is added to the input voice. The analysis unit 11 performs linear prediction analysis directly on the voice input through the voice input unit 4, and the feature parameter extraction unit 12 extracts the feature parameters of the input voice (step S19).

When it has been determined in step S13 that stationary noise does not need to be added to the voice input through the voice input unit 4, the voice subject to recognition contains no added stationary noise. The standard pattern selection unit 16 therefore, based on the determination result of the noise addition necessity determination unit 2, instructs the standard pattern storage unit 15 to use standard pattern 2 for voice recognition (step S20). Among the multiple types of standard patterns stored in the standard pattern storage unit 15, standard pattern 2 is the one created from unmodified standard pattern creation data.

Once the feature parameters have been extracted and the standard pattern to be used for voice recognition has been specified, the matching unit 14 performs pattern matching between the time-series data of the feature parameters output by the feature parameter extraction unit 12 and standard pattern 2, which the standard pattern storage unit 15 outputs according to the selection made by the standard pattern selection unit 16 (step S21). It then outputs the voice corresponding to the standard pattern with the highest similarity to the feature parameters as the voice recognition result (step S18) and ends the voice recognition process.
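The flow of FIG. 5 from step S12 through the choice of standard pattern can be condensed into the following Python sketch. The function name, the parameter names, and the tuple return value are hypothetical; the pattern numbers 1 and 2 follow the text above.

```python
def plan_recognition(cabin_noise_level, threshold, addition_required):
    """Return (add_noise, pattern_id) for the second embodiment.

    pattern_id 1: standard pattern built from noise-added training data
    pattern_id 2: standard pattern built from unmodified training data
    """
    if cabin_noise_level >= threshold:
        # YES at S12: the cabin already supplies stationary noise,
        # so nothing is added but the noise-added pattern is used (S16).
        return False, 1
    if addition_required:
        # YES at S13: add stationary noise (S14), use pattern 1 (S16).
        return True, 1
    # NO at S13: recognize the voice as input, use pattern 2 (S20).
    return False, 2


add_noise, pattern_id = plan_recognition(0.2, threshold=1.0, addition_required=False)
```

The point of the second embodiment is visible in the return values: the pattern follows whether the analyzed voice contains stationary noise, regardless of whether that noise came from the cabin or from the stationary noise addition unit 5.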

As described above, in the vehicle voice recognition apparatus of this embodiment, as in that of the first embodiment, when the in-vehicle stationary noise determination unit 1 determines that the amount of stationary noise in the vehicle interior is not equal to or greater than the predetermined value, the noise addition necessity determination unit 2 determines whether stationary noise of known characteristics needs to be added to the voice input through the voice input unit 4. If it is determined that stationary noise needs to be added to the input voice, the stationary noise addition unit 5 adds stationary noise of known characteristics, generated by the stationary noise generation unit 3, to the input voice. Next, the analysis unit 11 performs linear prediction analysis, and the feature parameter extraction unit 12 extracts the feature parameters of the input voice. If, on the other hand, it is determined that stationary noise does not need to be added to the input voice, no stationary noise is added; the analysis unit 11 performs linear prediction analysis directly on the input voice, and the feature parameter extraction unit 12 extracts the feature parameters of the input voice.

Furthermore, in the vehicle voice recognition apparatus of this embodiment, when it is determined that stationary noise needs to be added to the input voice, the standard pattern selection unit 16 instructs the standard pattern storage unit 15 to use standard pattern 1 for voice recognition, that is, the pattern created from standard pattern creation data to which noise of the same nature as the stationary noise generated by the stationary noise generation unit 3 has been added at a fixed ratio. When it is determined that stationary noise does not need to be added to the input voice, it instructs the standard pattern storage unit 15 to use standard pattern 2, created from unmodified standard pattern creation data. The matching unit 14 then performs pattern matching between the time-series data of the feature parameters output by the feature parameter extraction unit 12 and the voice standard pattern 1 output by the standard pattern storage unit 15, and outputs the voice corresponding to the standard pattern with the highest similarity to the feature parameters as the voice recognition result.

As a result, when the amount of stationary noise in the vehicle interior is smaller than the predetermined value and stationary noise needs to be added, voice recognition is performed on the voice to which stationary noise has been added; in all other cases, voice recognition is performed on the voice exactly as input through the voice input unit 4. In addition, whether stationary noise is present in the voice is judged from the determination results of the in-vehicle stationary noise determination unit 1 and the noise addition necessity determination unit 2, and the standard pattern used for voice recognition is switched to the appropriate one.
Accordingly, as in the first embodiment, stationary noise is added before recognition when the vehicle interior is quiet and non-stationary noise that interferes with recognition of the target speaker's voice, such as a fellow passenger's voice, is conspicuous. When stationary noise does not need to be added, for example when no non-stationary noise is occurring or when sufficient stationary noise is already present because the vehicle is traveling, the voice is recognized without adding noise unnecessarily. Moreover, when no stationary noise is present in the voice, the standard pattern intended for quiet conditions is used for recognition, and when stationary noise is present in the voice, the standard pattern created with stationary noise added is used. Using a standard pattern suited to the incoming voice and to the environment around the apparatus further improves the voice recognition rate, yielding a vehicle voice recognition apparatus that can recognize voice efficiently and accurately.

FIG. 1 is a block diagram showing the configuration of the vehicle voice recognition apparatus according to the first embodiment of the present invention.
FIG. 2 is a flowchart showing the voice recognition processing operation of the vehicle voice recognition apparatus of the first embodiment.
FIG. 3 is a diagram showing the input voice and the stationary noise to be added.
FIG. 4 is a block diagram showing the configuration of the vehicle voice recognition apparatus according to the second embodiment of the present invention.
FIG. 5 is a flowchart showing the voice recognition processing operation of the vehicle voice recognition apparatus of the second embodiment.

Explanation of symbols

1 In-vehicle stationary noise determination unit (stationary noise amount determination means)
2 Noise addition necessity determination unit (stationary noise addition determination means)
3 Stationary noise generation unit (stationary noise addition means)
4 Voice input unit (voice input means)
5 Stationary noise addition unit (stationary noise addition means)
6, 7 Voice recognition unit (voice recognition means)
14 Matching unit (voice matching means)
15 Standard pattern storage unit (standard pattern storage means)
16 Standard pattern selection unit (standard pattern selection means)

Claims

A vehicle voice recognition apparatus comprising:
voice input means capable of receiving voice input;
voice recognition means for performing voice recognition on the voice input through the voice input means;
stationary noise amount determination means for determining whether the amount of stationary noise in the vehicle interior is equal to or greater than a predetermined value;
stationary noise addition determination means for determining, when the stationary noise amount determination means determines that the amount of stationary noise in the vehicle interior is smaller than the predetermined value, whether stationary noise needs to be added to the voice input through the voice input means; and
stationary noise addition means for adding stationary noise to the voice input through the voice input means when the stationary noise addition determination means determines that stationary noise needs to be added to that voice,
wherein the stationary noise addition determination means
comprises occupant detection means for detecting occupants in the vehicle interior, and
determines whether stationary noise needs to be added to the voice input through the voice input means based on the number of occupants in the vehicle interior detected by the occupant detection means or on the positions of the occupants in the vehicle interior.

The vehicle voice recognition apparatus according to claim 1, wherein the stationary noise addition determination means determines whether stationary noise needs to be added to the voice input through the voice input means based on the operating state of audio equipment installed in the vehicle interior.

The vehicle voice recognition apparatus according to claim 1 or 2, wherein the voice recognition means comprises:
standard pattern storage means for storing a plurality of standard patterns relating to voice;
standard pattern selection means for selecting, from the standard pattern storage means, a standard pattern to be used for voice recognition; and
voice matching means for recognizing voice by matching the voice input through the voice input means against the standard pattern selected by the standard pattern selection means,
and wherein the standard pattern selection means selects the standard pattern based on the determination results of the stationary noise amount determination means and the stationary noise addition determination means.