JP5106889B2 - Audio output device

Info

Publication number: JP5106889B2
Application number: JP2007061361A
Authority: JP
Inventors: 洋平薮田; 徹丸本
Original assignee: Alpine Electronics Inc
Current assignee: Alpine Electronics Inc
Priority date: 2006-05-23
Filing date: 2007-03-12
Publication date: 2012-12-26
Anticipated expiration: 2027-03-12
Also published as: JP2008003562A

Abstract

PROBLEM TO BE SOLVED: To provide an audio output device that adjusts the gain of output audio according to ambient noise without depending on the hardware configuration of the audio output stage.
SOLUTION: Audio data D is divided into divided audio data, each covering a time length Ts. At each interval of Ts, an output control section 21 of an audio output application section 2 takes the next divided audio data in order, has a gain adjustment section apply gain adjustment according to the current volume level of the ambient noise, and generates the resulting audio data PD. It then notifies a sound driver (output) 11 of an HDR indicating the address and size of the PD, thereby requesting output of the audio that the generated audio data represents. The sound driver (output) 11 sequentially outputs the audio represented by the audio data indicated by each notified HDR from a speaker 6 via a sound output device 5.
COPYRIGHT: (C)2008,JPO&INPIT

Description

本発明は、周囲騒音の変化に応じて、出力する音声のゲインを、ユーザの音声に対する聴感が良好に維持されるように動的に変化させる音声出力装置に関するものである。 The present invention relates to an audio output device that dynamically changes an output audio gain according to a change in ambient noise so that a user's audio perception is well maintained.

周囲騒音の変化に応じて、出力する音声の周波数ゲイン特性を、ユーザの音声に対する聴感が良好に維持されるように動的に変化させる音声出力装置としては、周囲騒音に応じて、出力する音声のゲイン特性を調整する装置が知られている（たとえば、特許文献１）。
特開2004-23481号公報 As an audio output device that dynamically changes the frequency gain characteristics of the output audio in accordance with changes in the ambient noise so that the user's perception of the audio is well maintained, a device that adjusts the gain characteristics of the output audio according to the ambient noise is known (for example, Patent Document 1).
JP 2004-23481 A

前述したような音声出力装置における周囲騒音に応じた出力音声のゲイン特性の調整は、音声を出力しつつ、当該出力する音声に対して、リアルタイムに周囲騒音に応じたゲイン調整を行うことにより行われている。
ここで、このように音声を出力しつつリアルタイムなゲイン調整を行うためには、音声の出力段において、当該ゲイン調整をハードウエアまたはソフトウエアによって行う必要がある。しかしながら、周囲騒音に応じたゲイン調整を行う特段の素子や回路を備えていない装置では、当該ゲイン調整をハードウエアによって行うことはできない。一方で、ソフトウエアによって当該ゲイン調整を実現するためには、音声出力段のハードウエアの種類毎に、当該ハードウエアの構成に応じた処理によって周囲騒音に応じたゲイン調整を行うソフトウエアを用意する必要が生じる。 In an audio output device as described above, the gain characteristics of the output audio are adjusted according to the ambient noise by applying gain adjustment in real time to the audio while it is being output.
To perform real-time gain adjustment while outputting audio in this way, the gain adjustment must be performed by hardware or software at the audio output stage. However, a device that has no dedicated element or circuit for gain adjustment according to ambient noise cannot perform the adjustment in hardware. To realize the adjustment in software, on the other hand, software that performs the gain adjustment by processing suited to the hardware configuration must be prepared for each type of audio output stage hardware.

そこで、本発明は、周囲騒音に応じた出力音声のゲイン調整を、音声出力段のハードウエア構成にかかわらず実現できる音声出力装置の構成を提供することを課題とする。 Therefore, an object of the present invention is to provide a configuration of an audio output device that can realize gain adjustment of an output audio according to ambient noise regardless of the hardware configuration of the audio output stage.

前記課題達成のために、本発明は、音声を出力する音声出力装置を、メモリと、発行された音声出力要求で出力音声データとして指定された、前記メモリに格納されている音声データを読み出して、読み出した音声データが表す音声を出力する音声出力部と、周囲騒音の音量レベルを検出する騒音レベル検出手段と、出力すべき音声を表す音声データである対象音声データを、所定時間長の時間区間毎の音声を表す音声データである分割音声データに分割し、各分割音声データを、前記所定時間毎に、順次、対象分割音声データとし、当該対象分割音声データが表す音声を、当該時点において前記騒音レベル検出手段が検出している周囲騒音の音量レベルに応じたゲインで増幅した音声を表すゲイン調整後音声データを生成し、生成したゲイン調整後音声データを前記メモリに格納すると共に、格納したゲイン調整後音声データを前記出力音声データとして指定した前記音声出力要求を前記音声出力部に発行する音声出力処理手段とを備えて構成したものである。 To achieve the above object, the present invention configures an audio output device that outputs audio to comprise: a memory; an audio output unit that reads out audio data stored in the memory and designated as output audio data in an issued audio output request, and outputs the audio represented by the read audio data; noise level detection means that detects the volume level of ambient noise; and audio output processing means that divides target audio data, which is audio data representing the audio to be output, into divided audio data each representing the audio of a time interval of a predetermined time length, sequentially takes each divided audio data as the target divided audio data at every predetermined time, generates gain-adjusted audio data representing the audio of the target divided audio data amplified with a gain corresponding to the volume level of ambient noise that the noise level detection means is detecting at that time, stores the generated gain-adjusted audio data in the memory, and issues to the audio output unit the audio output request designating the stored gain-adjusted audio data as the output audio data.

このような音声出力装置によれば、音声出力処理手段によって、音声を出力する音声出力部に供給する音声データ自体を、周囲騒音に応じたゲイン調整が施されたものとしているので、音声出力部の構成に関わらずに、周囲騒音に応じた出力音声のゲイン調整を実現することができる。 According to such an audio output device, the audio output processing means makes the audio data supplied to the audio output unit already gain-adjusted according to the ambient noise, so gain adjustment of the output audio according to ambient noise can be realized regardless of the configuration of the audio output unit.

ここで、このように、音声を出力する音声出力部に供給する音声データ自体を、周囲騒音に応じたゲイン調整を施したものとした場合、音声データにゲイン調整を施した時点と、当該音声データが表す音声が出力される時点との間に生じる時間差が問題となる。出力される音声に施されたゲイン調整において考慮された周囲騒音が、当該時間差分過去のものとなってしまうからである。 Here, when the audio data itself supplied to the audio output unit is gain-adjusted according to ambient noise in this way, the time difference between the moment the gain adjustment is applied to the audio data and the moment the audio represented by that data is actually output becomes a problem. This is because the ambient noise considered in the gain adjustment applied to the output audio is older than the output by that time difference.

しかしながら、本音声出力装置によれば、音声出力処理手段において、音声データのゲイン調整と音声データの音声出力部への供給を、出力すべき音声を表す音声データである対象音声データを分割した分割音声データを単位として行うことができるので、このような時間差を短縮化して、ほぼ現在の周囲騒音の状況に応じたゲイン調整が施された音声を出力することができるようになる。なお、近接する時間の周囲騒音は近似していると考えられるので、このように分割音声データ単位にゲイン調整を施した音声は、分割音声データが表す音声の時間長を充分に短くとることにより、実用上、現在の周囲騒音の状況に応じたゲイン調整が施された音声として用いることができる。 However, according to this audio output device, the audio output processing means can perform the gain adjustment of the audio data and the supply of the audio data to the audio output unit in units of divided audio data obtained by dividing the target audio data, which is the audio data representing the audio to be output. This shortens the time difference, so that audio gain-adjusted according to nearly the current ambient noise conditions can be output. Since ambient noise at nearby times can be considered similar, audio gain-adjusted per divided audio data in this way can, by making the time length of each divided audio data sufficiently short, be used in practice as audio gain-adjusted according to the current ambient noise conditions.

ここで、より具体的には、このような音声出力装置は、マイクロフォンと、前記マイクロフォンを用いてピックアップした周囲の音声を表す周囲音声データを出力するサウンド入力装置と、メモリと、音声を出力するサウンド出力装置と、発行された音声出力要求を受け入れ、受け入れた音声出力要求で出力音声データとして指定された、前記メモリに格納されている音声データを読み出して、読み出した音声データが表す音声を前記サウンド出力装置に出力させるサウンドドライバと、前記サウンド入力装置が出力する周囲音声データが表す周囲騒音の音量レベルを検出する騒音レベル検出手段と、出力すべき音声を表す音声データである対象音声データを、所定時間長の時間区間毎の音声を表す音声データである分割音声データに分割し、各分割音声データを、前記所定時間毎に、順次、対象分割音声データとし、当該対象分割音声データが表す音声を、当該時点において前記騒音レベル検出手段が検出している周囲騒音の音量レベルに応じたゲインで増幅した音声を表すゲイン調整後音声データを生成し、生成したゲイン調整後音声データを前記メモリに格納すると共に、格納したゲイン調整後音声データを前記出力音声データとして指定した前記音声出力要求を前記サウンドドライバに発行する音声出力処理手段とを備えた音声出力装置として構成するようにしてよい。 More specifically, such an audio output device may be configured as an audio output device comprising: a microphone; a sound input device that outputs ambient audio data representing the ambient sound picked up by the microphone; a memory; a sound output device that outputs audio; a sound driver that accepts an issued audio output request, reads out the audio data stored in the memory and designated as output audio data in the accepted request, and causes the sound output device to output the audio represented by the read audio data; noise level detection means that detects the volume level of the ambient noise represented by the ambient audio data output by the sound input device; and audio output processing means that divides target audio data, which is audio data representing the audio to be output, into divided audio data each representing the audio of a time interval of a predetermined time length, sequentially takes each divided audio data as the target divided audio data at every predetermined time, generates gain-adjusted audio data representing the audio of the target divided audio data amplified with a gain corresponding to the volume level of ambient noise that the noise level detection means is detecting at that time, stores the generated gain-adjusted audio data in the memory, and issues to the sound driver the audio output request designating the stored gain-adjusted audio data as the output audio data.

このように構成した場合には、サウンド出力装置やサウンドドライバによらずに、ほぼ現在の周囲騒音に応じた出力音声のゲイン調整を実現することができる。
ここで、以上の各音声出力装置は、前記騒音レベル検出手段を、少なくとも前記音声出力要求が発行されるまで、常時、直近過去の前記所定時間長分の周囲音声を表す周囲音声データを保持する周囲騒音保持手段と、前記音声出力要求が発行されたならば、当該時点で、前記周囲騒音保持手段に保持されている前記直近過去の前記所定時間長分の周囲音声データの音量レベルを、前記検出する周囲騒音の音量レベルとして算定すると共に、以降、前記所定時間長の時間区間の経過毎に、当該直近に経過した前記所定時間長の時間区間の周囲音声の音量レベルを、前記検出する周囲騒音の音量レベルとして算定する音量レベル算定手段とより構成し、前記音声出力処理手段において、前記音声出力要求が発行されたならば、前記音量レベル算定手段が、前記周囲騒音の音量レベルを算定する度に、各分割音声データを、順次、対象分割音声データとし、前記対象分割音声データが表す音声を、当該算定された周囲騒音の音量レベルに応じたゲインで増幅した音声を表すゲイン調整後音声データを生成するようにしてもよい。 When configured in this way, it is possible to realize gain adjustment of the output sound almost in accordance with the current ambient noise, regardless of the sound output device or the sound driver.
Here, in each of the above audio output devices, the noise level detection means may comprise: ambient noise holding means that, at least until the audio output request is issued, always holds ambient audio data representing the ambient sound of the most recent predetermined time length; and volume level calculation means that, once the audio output request is issued, calculates the volume level of the ambient audio data of the most recent predetermined time length held in the ambient noise holding means at that moment as the detected volume level of ambient noise, and thereafter, each time a time interval of the predetermined time length elapses, calculates the volume level of the ambient sound in the interval that has just elapsed as the detected volume level of ambient noise. In that case, once the audio output request is issued, each time the volume level calculation means calculates the volume level of ambient noise, the audio output processing means sequentially takes each divided audio data as the target divided audio data and generates gain-adjusted audio data representing the audio of the target divided audio data amplified with a gain corresponding to the calculated volume level.

このようにすることにより、音声出力要求が発行されしだい、即座に、騒音レベル検出手段において、常時保持するようにした直近過去の前記所定時間長分の周囲音声データを用いて、周囲騒音の音量レベルを検出することができる。したがって、音声出力要求の発生直後から、前記音声出力処理手段において、ゲイン調整後音声データの生成格納や音声出力要求の発行を行って、すみやかに音声を出力することができるようになる。 In this way, as soon as an audio output request is issued, the noise level detection means can immediately detect the volume level of the ambient noise using the ambient audio data of the most recent predetermined time length, which it holds at all times. Accordingly, right after the audio output request occurs, the audio output processing means can generate and store the gain-adjusted audio data and issue the audio output request, so that audio is output promptly.

なお、以上の音声出力装置は、前記騒音レベル検出手段において、周波数帯域毎に、周囲騒音の音量レベルを検出し、前記音声出力処理手段において、前記対象分割音声データが表す音声を、周波数帯域毎に、当該時点において前記騒音レベル検出手段が検出している周囲騒音の各周波数帯域の音量レベルに応じたゲインで増幅した音声を表すゲイン調整後音声データを生成するように構成してもよい。 In the above sound output device, the noise level detection means detects the volume level of ambient noise for each frequency band, and the sound output processing means detects the sound represented by the target divided sound data for each frequency band. In addition, it may be configured to generate gain-adjusted sound data representing sound amplified by a gain corresponding to the volume level of each frequency band of ambient noise detected by the noise level detection means at the time.

また、以上の当該音声出力装置は、自動車に搭載されるものであってよい。 The audio output device described above may be mounted in an automobile.

以上のように、本発明によれば、周囲騒音に応じた出力音声のゲイン調整を、音声出力段のハードウエア構成にかかわらず実現できる。 As described above, according to the present invention, the gain adjustment of the output sound according to the ambient noise can be realized regardless of the hardware configuration of the sound output stage.

以下、本発明に係る音声出力装置の実施形態について、車載の音声出力装置への適用を例にとり説明する。
まず、第１の実施形態について説明する。
図１に、音声出力装置の構成を示す。
図示するように、本音声出力装置は、オペレーティングシステム１、音声出力アプリケーション２、サウンド入力装置３、マイクロフォン４、サウンド出力装置５、スピーカ６、音声データメモリ７、出力用バッファメモリ８とを備えている。
但し、以上のような音声出力装置は、ハードウエア構成としては、ＣＰＵやメモリや外部記憶装置などを備えた一般的な電子計算機の構成を有している。また、当該電子計算機の音声入出力用のハードウエアとして、以上のサウンド入力装置３やマイクロフォン４やサウンド出力装置５やスピーカ６を備えているものである。そして、前述したオペレーションシステムや音声出力アプリケーション２や音声データメモリ７や出力用バッファメモリ８は、ＣＰＵが予め用意されたプログラムを実行することにより、当該電子計算機上に、プロセスや記憶資源として具現化されるものである。 Hereinafter, embodiments of the audio output device according to the present invention will be described taking application to an in-vehicle audio output device as an example.
First, the first embodiment will be described.
FIG. 1 shows the configuration of the audio output device.
As shown in the figure, the audio output device includes an operating system 1, an audio output application 2, a sound input device 3, a microphone 4, a sound output device 5, a speaker 6, an audio data memory 7, and an output buffer memory 8.
In terms of hardware, this audio output device has the configuration of a general-purpose computer with a CPU, memory, an external storage device, and the like. The sound input device 3, the microphone 4, the sound output device 5, and the speaker 6 are provided as the computer's audio input/output hardware. The operating system 1, the audio output application 2, the audio data memory 7, and the output buffer memory 8 are realized on the computer as processes and storage resources when the CPU executes programs prepared in advance.

さて、このような音声出力装置の構成において、オペレーティングシステム１は、サウンドドライバ（入力）１２を含んでいる。そして、サウンドドライバ（入力）１２は、サウンド入力装置３を介して取り込んだ、マイクロフォン４がピックアップした周囲騒音を表す音声データを生成する。 In the configuration of such an audio output device, the operating system 1 includes a sound driver (input) 12. Then, the sound driver (input) 12 generates sound data representing ambient noise picked up by the microphone 4 and taken in via the sound input device 3.

また、オペレーティングシステム１は、サウンド出力装置５を制御して音声をスピーカ６に出力するサウンドドライバ（出力）１１を含んでいる。ここで、サウンドドライバ（出力）１１は、出力する音声の音声データのアドレスとサイズを表すＨＤＲを格納するＨＤＲキューを備えている。そして、サウンドドライバ（出力）１１は、ＨＤＲキューに格納されたＨＤＲを順次取り出し、取り出したＨＤＲが示すアドレスから当該ＨＤＲが示すサイズ分の音声データを読み出し、音声データが表す音声をサウンド出力装置５を介してスピーカ６に出力する処理を行うものである。また、サウンドドライバ（出力）１１は、以上のようにＨＤＲが示す音声データが表す音声の出力を終了したならば、ＨＤＲキューに格納したＨＤＲの発行元に再生終了通知を通知するものである。 The operating system 1 also includes a sound driver (output) 11 that controls the sound output device 5 to output audio to the speaker 6. The sound driver (output) 11 has an HDR queue that stores HDRs, each representing the address and size of audio data to be output. The sound driver (output) 11 takes the HDRs out of the HDR queue in order, reads audio data of the size indicated by each HDR from the address it indicates, and outputs the audio represented by that data to the speaker 6 via the sound output device 5. When it finishes outputting the audio represented by the audio data indicated by an HDR, the sound driver (output) 11 sends a playback end notification to the issuer of that HDR.
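The HDR queue behaviour described above can be sketched roughly as follows. This is a minimal model rather than the actual driver: the names `Hdr`, `OutputDriverModel`, `submit`, and `service_one` are hypothetical, memory is modelled as a `bytearray`, and "playing" simply appends the bytes to a buffer.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Hdr:
    address: int  # offset of the audio data in memory (assumed byte offset)
    size: int     # number of bytes to play


class OutputDriverModel:
    """Hypothetical stand-in for sound driver (output) 11."""

    def __init__(self, memory, on_done):
        self.memory = memory      # shared memory holding the audio data
        self.on_done = on_done    # callback used as the playback end notification
        self.queue = deque()      # the HDR queue
        self.played = bytearray() # everything "output" so far, in order

    def submit(self, hdr):
        """Accept an output request by queuing its HDR."""
        self.queue.append(hdr)

    def service_one(self):
        """Play the oldest queued HDR; return False if the queue is empty."""
        if not self.queue:
            return False
        hdr = self.queue.popleft()
        self.played += self.memory[hdr.address:hdr.address + hdr.size]
        self.on_done(hdr)         # notify the issuer that playback finished
        return True
```

The point of the model is that the driver only ever sees an address and a size, so whatever the application wrote at that address, including gain-adjusted data, is what gets played.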

そして、音声出力アプリケーション２は、出力制御部２１と、ゲイン調整部２２とを備えている。
以下、このような音声出力アプリケーション２の動作について説明する。
音声出力アプリケーション２の出力制御部２１は、音声データメモリ７に格納されている、スピーカ６から出力すべき音声を表す音声データＤが音声データメモリ７上に発生すると、当該音声データＤを対象として音声出力処理を行う。ここで、音声データＤは、予め音声データメモリ７に格納されている音声データであってもよいし、音声合成処理などにより新たに生成されたものであってもよい。また、音声データＤは、たとえば、ユーザに対してガイダンスを行う音声を表すものであり、この場合、音声データＤが表す音声の時間長は数秒から数十秒となる。 The audio output application 2 includes an output control unit 21 and a gain adjustment unit 22.
Hereinafter, the operation of the audio output application 2 will be described.
When the audio data D representing the audio to be output from the speaker 6 stored in the audio data memory 7 is generated on the audio data memory 7, the output control unit 21 of the audio output application 2 targets the audio data D. Perform audio output processing. Here, the voice data D may be voice data stored in the voice data memory 7 in advance, or may be newly generated by voice synthesis processing or the like. The voice data D represents, for example, voice for guidance to the user. In this case, the time length of the voice represented by the voice data D is several seconds to several tens of seconds.

図２に、音声出力アプリケーション２の出力制御部２１が行う音声出力処理の手順を示す。
図示するように、この処理では、まず、サウンドドライバ（出力）１１に対して、デバイス（サウンド出力装置５）の使用開始の宣言や、出力する音声データのフォーマットの宣言などの各種前処理を行う（ステップ２０２）。
そして、次に、音声データＤを分割した音声データ分割数ｎ分の各分割音声データＤ（ｊ）について、ｊの小さいものより、順次、以下の処理を行う（ステップ２０４、２２０、２２８）。ここで、音声データＤの分割は、ｔｓを予め定めた時間長として、各分割音声データＤ（ｊ）が表す音声が、音声データＤが表す音声の内の、ｊ×ｔｓから（ｊ+1）×ｔｓまでの間の期間の音声を表すように行う。但し、音声データＤが表す音声の時間長をＬとし、音声データ分割数ｎはステップ２０４においてＬ≦ｎ×ｔｓを満たす最小の整数として求められ、ｊは、０≦ｊ＜ｎ満たす整数である。 FIG. 2 shows a procedure of audio output processing performed by the output control unit 21 of the audio output application 2.
As shown in the figure, in this processing, first, various preprocessing such as declaration of start of use of the device (sound output device 5) and declaration of format of audio data to be output is performed on the sound driver (output) 11. (Step 202).
Next, for each of the n divided audio data D(j) obtained by dividing the audio data D, the following processing is performed in order, starting from the smallest j (steps 204, 220, and 228). Here, with ts as a predetermined time length, the audio data D is divided so that each divided audio data D(j) represents the portion of the audio represented by D between j × ts and (j + 1) × ts. With L denoting the time length of the audio represented by the audio data D, the audio data division number n is obtained in step 204 as the smallest integer satisfying L ≦ n × ts, and j is an integer satisfying 0 ≦ j < n.
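As an illustration of the division rule, the sketch below splits a list of samples into segments of ts seconds each. It assumes a fixed sample rate so that ts corresponds to a whole number of samples; the function name and its parameters are hypothetical.

```python
def split_into_segments(samples, sample_rate, ts):
    """Split samples into n segments D(0)..D(n-1), each covering ts seconds.

    n is the smallest integer with L <= n * ts, so the last segment may be short.
    """
    seg_len = int(round(sample_rate * ts))  # samples per segment
    if seg_len < 1:
        raise ValueError("ts is shorter than one sample")
    n = -(-len(samples) // seg_len)         # ceiling division without floats
    return [samples[j * seg_len:(j + 1) * seg_len] for j in range(n)]
```

For example, 10 samples at 4 samples per second with ts = 1 s give n = 3 segments, the last holding only 2 samples.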

次に、ｊの２を法数とする剰余をｉとして求め（ステップ２０６）、ｊが２未満かどうかを調べ（ステップ２０８）、２未満であれば、ゲイン調整部２２に、分割音声データＤ（ｊ）に現時点における周囲騒音に応じたゲイン調整を施させ、分割音声データＤ（ｊ）にゲイン調整を施した音声データを出力用バッファメモリ８にＰＤ（ｉ）として格納する（ステップ２１４）。ただし、ｊが１である場合には、前回ステップ２１４の実行を開始してから、前記した時間長ｔｓ経過後に、今回のステップ２１４の実行を開始することが好ましい。 Next, the remainder of j modulo 2 is obtained as i (step 206), and it is checked whether j is less than 2 (step 208). If it is, the gain adjustment unit 22 is made to apply gain adjustment according to the current ambient noise to the divided audio data D(j), and the gain-adjusted audio data is stored in the output buffer memory 8 as PD(i) (step 214). When j is 1, however, it is preferable to start this execution of step 214 only after the time length ts has elapsed since the previous execution of step 214 started.

そして、ＰＤ（ｉ）のアドレスやサイズの属性をＨＤＲ（ｉ）に設定し（ステップ２１６）、ＨＤＲ（ｉ）をＨＤＲとしてサウンドドライバ（出力）１１に発行し、ＨＤＲキューに格納する（ステップ２１８）。ここで、音声出力アプリケーション２のゲイン調整部２２は、出力制御部２１の要求に応じて、サウンドドライバ（入力）１２から周囲騒音の音声データを取り込み、取り込んだ音声データが表す周囲騒音の音量レベルに応じたゲイン調整を分割音声データＤ（ｊ）に施す。すなわち、分割音声データＤ（ｊ）が表す音声を、周囲騒音の音量レベルに応じた増幅率で増幅した音声を表す音声データを、出力用バッファメモリ８にＰＤ（ｉ）として格納する、分割音声データＤ（ｊ）にゲイン調整を施した音声データとして生成する。なお、このゲイン調整部２２のゲイン調整は、周囲騒音の音声データが表す周囲騒音の音量レベルを周波数帯域毎に求めると共に、求めた周囲騒音の各周波数帯域の音量レベルに応じたゲイン調整を周波数帯域毎に分割音声データＤ（ｊ）に施すことによって行うようにしてもよい。 Then, the address and size attributes of PD(i) are set in HDR(i) (step 216), and HDR(i) is issued as an HDR to the sound driver (output) 11 and stored in the HDR queue (step 218). Here, in response to a request from the output control unit 21, the gain adjustment unit 22 of the audio output application 2 takes in audio data of the ambient noise from the sound driver (input) 12 and applies gain adjustment to the divided audio data D(j) according to the volume level of the ambient noise that the data represents. That is, as the gain-adjusted version of D(j) to be stored in the output buffer memory 8 as PD(i), it generates audio data representing the audio of D(j) amplified by an amplification factor corresponding to the volume level of the ambient noise. The gain adjustment unit 22 may also obtain the volume level of the ambient noise for each frequency band and apply to D(j), band by band, a gain adjustment corresponding to the level obtained for that band.
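The text does not specify how the noise volume level maps to an amplification factor, so the following is only a sketch under assumptions: the level is taken as the RMS of the noise samples, the gain grows linearly above a hypothetical quiet threshold and is capped, and samples are floats in the range -1.0 to 1.0. The names `rms_level`, `gain_for_noise`, `adjust_segment`, `quiet_rms`, and `max_gain` are all illustrative.

```python
import math


def rms_level(samples):
    """Root-mean-square level of a block of samples (0.0 for an empty block)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def gain_for_noise(noise_rms, quiet_rms=0.01, max_gain=4.0):
    """Assumed mapping: unity gain in quiet, proportional above that, capped."""
    if noise_rms <= quiet_rms:
        return 1.0
    return min(max_gain, noise_rms / quiet_rms)


def adjust_segment(segment, noise_block):
    """Return the segment amplified for the given noise block, clipped to [-1, 1]."""
    g = gain_for_noise(rms_level(noise_block))
    return [max(-1.0, min(1.0, s * g)) for s in segment]
```

A per-band variant would apply the same idea separately to each frequency band of the segment, using the noise level measured in that band.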

次に、ｊが２未満でない場合には（ステップ２０８）、サウンドドライバ（出力）１１からの再生終了通知を待って（ステップ２１０）、ＨＤＲ（ｉ）をクリア（ステップ２１２）した上で、分割音声データＤ（ｊ）に現時点における周囲騒音に応じたゲイン調整を施させ、分割音声データＤ（ｊ）にゲイン調整を施した音声データを出力用バッファメモリ８にＰＤ（ｉ）として格納する（ステップ２１４）。 Next, if j is not less than 2 (step 208), the process waits for a reproduction end notification from the sound driver (output) 11 (step 210), clears HDR (i) (step 212), and then divides. The audio data D (j) is subjected to gain adjustment according to the ambient noise at the present time, and the audio data obtained by adjusting the gain of the divided audio data D (j) is stored as PD (i) in the output buffer memory 8 ( Step 214).

そして、ＰＤ（ｉ）のアドレスやサイズの属性をＨＤＲ（ｉ）に設定し（ステップ２１６）、ＨＤＲ（ｉ）をＨＤＲとしてサウンドドライバ（出力）１１に発行し、ＨＤＲキューに格納する（ステップ２１８）。
そして、音声データＤを分割した各分割音声データＤ（ｊ）の全てについて以上の処理を終了したならば、サウンドドライバ（出力）１１の再生終了を待って（ステップ２２２）、ＨＤＲ（０）とＨＤＲ（１）をクリアし（ステップ２２４）、サウンドドライバ（出力）１１に対して、デバイス（サウンド出力装置５）の使用終了の宣言などの後処理を行って（ステップ２２６）、音声出力処理を終了する。 Then, the address and size attributes of PD (i) are set to HDR (i) (step 216), HDR (i) is issued as HDR to the sound driver (output) 11 and stored in the HDR queue (step 218). ).
When the above processing has been completed for all the divided audio data D(j) obtained by dividing the audio data D, the process waits for the sound driver (output) 11 to finish playback (step 222), clears HDR(0) and HDR(1) (step 224), performs post-processing on the sound driver (output) 11 such as declaring the end of use of the device (sound output device 5) (step 226), and ends the audio output processing.
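The control flow of steps 202 to 228 can be sketched as below. This is an illustrative model only: `StubDriver` is a hypothetical synchronous stand-in for the sound driver (output) 11 that "plays" the oldest queued buffer at the moment the caller waits, whereas a real driver plays asynchronously, and `adjust` is whatever gain adjustment function the caller supplies.

```python
class StubDriver:
    """Hypothetical synchronous stand-in for sound driver (output) 11."""

    def __init__(self):
        self.buffers = [None, None]  # shared memory holding PD(0) and PD(1)
        self.queue = []              # pending HDRs as (slot, size)
        self.output = []             # samples "played", in order
        self.log = []

    def open(self):
        self.log.append("open")

    def close(self):
        self.log.append("close")

    def submit(self, hdr):
        self.queue.append(hdr)

    def wait_done(self):
        # Play the oldest queued buffer, reading it at play time.
        slot, size = self.queue.pop(0)
        self.output.extend(self.buffers[slot][:size])

    def wait_all_done(self):
        while self.queue:
            self.wait_done()


def run_output(segments, adjust, driver):
    pd = driver.buffers                    # PD(0), PD(1)
    hdr = [None, None]                     # HDR(0), HDR(1)
    driver.open()                          # step 202
    for j, seg in enumerate(segments):     # steps 204, 220, 228
        i = j % 2                          # step 206
        if j >= 2:                         # step 208
            driver.wait_done()             # step 210: segment j-2 has finished
            hdr[i] = None                  # step 212
        pd[i] = adjust(seg)                # step 214
        hdr[i] = (i, len(pd[i]))           # step 216
        driver.submit(hdr[i])              # step 218
    driver.wait_all_done()                 # step 222
    hdr[0] = hdr[1] = None                 # step 224
    driver.close()                         # step 226
```

Because slot i is rewritten only after the playback end notification for segment j-2 arrives, and that segment is the one that occupied slot i, a buffer is never overwritten while it is still queued.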

図３に、このような音声出力処理の処理例を示す。
いま、図３ａに示すように、音声データＤが６×ｔｓの時間長分の音声を表すものであった場合、音声データＤは、Ｄ（０）からＤ（５）の分割音声データに分割される。
そして、この場合には、図３ｂに示すように、時間長ｔｓ毎の時間区間ｔ０からｔ６の最初の時間区間ｔ０において、音声データＤの最初の分割音声データＤ（０）を、その時点の周囲騒音Ｎの音量レベルに応じてゲイン調整した音声データＧＤ（０）がＰＤ（０）に設定され、ＰＤ（０）のアドレスとサイズがＨＤＲ（０）に設定されると共に、ＨＤＲ（０）がＨＤＲとしてサウンド出力装置５のＨＤＲキューに追加される。 FIG. 3 shows an example of such audio output processing.
Now, as shown in FIG. 3a, when the audio data D represents the audio for a time length of 6 × ts, the audio data D is divided into divided audio data from D (0) to D (5). Is done.
In this case, as shown in FIG. 3b, in the first time interval t0 of the time intervals t0 to t6, each of time length ts, audio data GD(0), obtained by gain-adjusting the first divided audio data D(0) of the audio data D according to the volume level of the ambient noise N at that time, is set in PD(0); the address and size of PD(0) are set in HDR(0); and HDR(0) is added as an HDR to the HDR queue of the sound output device 5.

次に、時間区間ｔ１では、音声データＤの２番目の分割音声データＤ（１）を、その時点の周囲騒音Ｎの音量レベルに応じてゲイン調整した音声データＧＤ（１）がＰＤ（１）に設定され、ＰＤ（１）のアドレスとサイズがＨＤＲ（１）に設定されると共に、ＨＤＲ（１）がＨＤＲとしてサウンド出力装置５のＨＤＲキューに追加される。また、Ｑ｛ＰＤ（０）｝として示すように、サウンドドライバ（出力）１１、サウンド出力装置５によって、ＰＤ（０）に設定されたＧＤ（０）が表す音声が再生出力される。ただし、この例は、ｊが１である場合に、先に行った分割音声データＤ（０）についてのゲイン調整（ステップ２１４）の実行を開始してから、前記した時間長ｔｓ経過後に、今回の分割音声データＤ（１）のゲイン調整（ステップ２１４）の実行を開始するようにした場合についてのものである。 Next, in the time interval t1, the audio data GD (1) obtained by adjusting the gain of the second divided audio data D (1) of the audio data D according to the volume level of the ambient noise N at that time is PD (1). And the address and size of PD (1) are set to HDR (1), and HDR (1) is added to the HDR queue of the sound output device 5 as HDR. Further, as indicated by Q {PD (0)}, the sound represented by GD (0) set in PD (0) is reproduced and output by the sound driver (output) 11 and the sound output device 5. However, in this example, when j is 1, this time after the time length ts has elapsed since the execution of the gain adjustment (step 214) for the divided audio data D (0) performed previously is started. This is for the case where the execution of the gain adjustment (step 214) of the divided audio data D (1) is started.

そして、その次の、時間区間ｔ２では、音声データＤの３番目の分割音声データＤ（２）を、その時点の周囲騒音Ｎの音量レベルに応じてゲイン調整した音声データＧＤ（２）がＰＤ（０）に設定され、ＰＤ（０）のアドレスとサイズがＨＤＲ（０）に設定されると共に、ＨＤＲ（０）がＨＤＲとしてサウンド出力装置５のＨＤＲキューに追加される。また、Ｑ｛ＰＤ（１）｝として示すように、サウンドドライバ（出力）１１、サウンド出力装置５によって、ＰＤ（１）に設定されたＧＤ（１）が表す音声が、再生出力される。 Then, in the next time interval t2, the audio data GD (2) obtained by adjusting the gain of the third divided audio data D (2) of the audio data D according to the volume level of the ambient noise N at that time is PD. It is set to (0), the address and size of PD (0) are set to HDR (0), and HDR (0) is added to the HDR queue of the sound output device 5 as HDR. Further, as indicated by Q {PD (1)}, the sound represented by GD (1) set in PD (1) is reproduced and output by the sound driver (output) 11 and the sound output device 5.

また、次の、時間区間ｔ３では、音声データＤの４番目の分割音声データＤ（３）を、その時点の周囲騒音Ｎの音量レベルに応じてゲイン調整した音声データＧＤ（３）がＰＤ（１）に設定され、ＰＤ（１）のアドレスとサイズがＨＤＲ（１）に設定されると共に、ＨＤＲ（１）がＨＤＲとしてサウンド出力装置５のＨＤＲキューに追加される。また、Ｑ｛ＰＤ（０）｝として示すように、サウンドドライバ（出力）１１、サウンド出力装置５によって、ＰＤ（０）に設定されたＧＤ（２）が表す音声が、再生出力される。 In the next time interval t3, the audio data GD (3) obtained by adjusting the gain of the fourth divided audio data D (3) of the audio data D according to the volume level of the ambient noise N at that time is PD ( 1), the address and size of the PD (1) are set to HDR (1), and HDR (1) is added to the HDR queue of the sound output device 5 as HDR. Further, as indicated by Q {PD (0)}, the sound represented by GD (2) set to PD (0) is reproduced and output by the sound driver (output) 11 and the sound output device 5.

以降、同様にＰＤ（０）及びＨＤＲ（０）と、ＰＤ（１）及びＨＤＲ（１）を交互に用いながら、時間区間ｔｍでは、音声データＤのｍ+１番目の分割音声データＤ（ｍ）のゲイン調整と、ゲイン調整した音声データＧＤ（ｍ-１）が表す音声のサウンドドライバ（出力）１１、サウンド出力装置５による出力が行われる。 Thereafter, PD(0) with HDR(0) and PD(1) with HDR(1) are used alternately in the same way: in time interval tm, the (m + 1)-th divided audio data D(m) of the audio data D is gain-adjusted, and the audio represented by the gain-adjusted audio data GD(m-1) is output by the sound driver (output) 11 and the sound output device 5.
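The interval-by-interval pattern of the worked example can be tabulated with a short sketch. The function name `schedule` and the tuple layout are hypothetical; the sketch only restates the pattern above, in which interval tm adjusts D(m) into buffer slot m mod 2 while GD(m-1) is played from the other slot.

```python
def schedule(n):
    """Return one row per time interval t0..tn for n divided audio data.

    Each row is (m, adjust, play), where adjust and play are
    (segment index, buffer slot) pairs, or None when nothing happens.
    """
    rows = []
    for m in range(n + 1):
        adjust = (m, m % 2) if m < n else None            # D(m) -> PD(m mod 2)
        play = (m - 1, (m - 1) % 2) if m >= 1 else None   # GD(m-1) from its slot
        rows.append((m, adjust, play))
    return rows
```

With n = 6 this yields seven intervals, t0 to t6: nothing is played in t0, and nothing is adjusted in t6.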

結果、Ｑとして示すように音声データＤをゲイン調整した音声がサウンド出力装置５によってスピーカ６から出力されることになる。そして、出力される音声の各部分は、約ｔｓ時間前の周囲騒音の音量レベルに応じてゲイン調整されたものとなる。
以上、本発明の第１実施形態について説明した。
このように本第１実施形態によれば、音声出力アプリケーション２によって、サウンドドライバ（出力）１１に供給する音声データ（ＰＤ）自体を、周囲騒音に応じたゲイン調整が施されたものとしているので、サウンドドライバ（出力）１１に関わらずに、周囲騒音に応じた出力音声のゲイン調整を実現することができる。なお、一般的に、サウンドドライバ（出力）１１及びサウンドドライバ（入力）１２とアプリケーションとのインタフェースＡＰＩは、サウンドドライバ（出力）１１及びサウンドドライバ（入力）１２によらず共通化されている。 As a result, as indicated by Q, the audio obtained by gain-adjusting the audio data D is output from the speaker 6 by the sound output device 5. Each part of the output audio has been gain-adjusted according to the volume level of the ambient noise about ts earlier.
The first embodiment of the present invention has been described above.
As described above, according to the first embodiment, the audio data (PD) itself supplied to the sound driver (output) 11 by the audio output application 2 is subjected to gain adjustment according to the ambient noise. Regardless of the sound driver (output) 11, the gain adjustment of the output sound according to the ambient noise can be realized. In general, the interface API between the sound driver (output) 11 and the sound driver (input) 12 and the application is shared regardless of the sound driver (output) 11 and the sound driver (input) 12.

ここで、このように、サウンドドライバ（出力）１１に供給する音声データ（ＰＤ）自体を、周囲騒音に応じたゲイン調整を施したものとした場合、音声データ（ＰＤ）にゲイン調整を施した時点と、当該音声データが表す音声が実際に出力される時点との間に生じる時間差が問題となる。出力される音声が、当該時間差分過去の周囲騒音に応じたゲイン調整が施されたものとなるからである。 Here, when the audio data (PD) itself supplied to the sound driver (output) 11 is subjected to gain adjustment according to ambient noise, the audio data (PD) is subjected to gain adjustment. A time difference between the time point and the time point when the sound represented by the sound data is actually output becomes a problem. This is because the output voice is gain-adjusted according to ambient noise in the past of the time difference.

しかしながら、本第１実施形態によれば、音声データ（ＰＤ）のゲイン調整と音声データ（ＰＤ）のサウンドドライバ（出力）１１への供給を、出力すべき音声を表す音声データを分割した分割音声データを単位として行うことができるので、このような時間差を短縮化して、ほぼ現在の周囲騒音の状況に応じたゲイン調整が施された音声を出力することができるようになる。なお、近接する時間の周囲騒音は近似していると考えられるので、分割音声データが表す音声の時間長を充分に短くとることにより、このようにゲイン調整を施した音声は、実用上、現在の周囲騒音の状況に応じたゲイン調整が施された音声として用いることができる。 However, according to the first embodiment, the divided sound obtained by dividing the sound data representing the sound to be output by adjusting the gain of the sound data (PD) and supplying the sound data (PD) to the sound driver (output) 11. Since it can be performed in units of data, such a time difference can be shortened, and a sound that has been gain-adjusted according to the current ambient noise situation can be output. In addition, since it is considered that the ambient noise in the adjacent time is approximated, the sound subjected to gain adjustment in this way is practically presently used by sufficiently shortening the time length of the sound represented by the divided sound data. Can be used as a sound that has been gain-adjusted according to the ambient noise situation.

以下、本発明の第２の実施形態について説明する。
本第２実施形態は、前記第１実施形態の音声出力処理のステップ２１４における出力制御部２１の指示に応じて、ゲイン調整部２２が行うゲイン調整を、より速やかに実行できるようにしたものである。
図４に、本第２実施形態に係る音声出力装置の構成を示す。
図示するように、本第２実施形態に係る音声出力装置は、図１に示した音声出力装置に、騒音データバッファ９を追加すると共に、音声出力アプリケーションに騒音データ取得制御部２３を設けたものである。
また、本第２実施形態では、図２に示した出力制御部２１が行う出力制御処理のステップ２０４において、算出した音声データ分割数ｎと音声出力処理開始とを騒音データ取得制御部２３とゲイン調整部に通知するようにする。 Hereinafter, a second embodiment of the present invention will be described.
In the second embodiment, the gain adjustment performed by the gain adjustment unit 22 can be executed more promptly in response to an instruction from the output control unit 21 in step 214 of the sound output processing of the first embodiment. is there.
FIG. 4 shows the configuration of the audio output device according to the second embodiment.
As illustrated, the audio output device according to the second embodiment adds a noise data buffer 9 to the audio output device shown in FIG. 1 and provides a noise data acquisition control unit 23 in the audio output application.
In the second embodiment, in step 204 of the output control process performed by the output control unit 21 shown in FIG. 2, the calculated audio data division number n and the start of the audio output processing are notified to the noise data acquisition control unit 23 and the gain adjustment unit.

The noise data acquisition control unit 23 performs the noise data acquisition processing shown in FIG. 5a, and the gain adjustment unit 22 performs the gain adjustment processing shown in FIG. 5b. Together these carry out the gain adjustment requested by the output control unit 21 in step 214 of the audio output processing of the first embodiment.
The noise data acquisition processing performed by the noise data acquisition control unit 23 is described first.
As shown in FIG. 5a, until the output control unit 21 notifies the start of the audio output processing (step 504), this processing keeps capturing ambient noise audio data from the sound driver (input) 12, so that NDPR of the noise data buffer 9 always holds the ambient noise audio data for the most recent past period of length ts (step 502). This can be realized, for example, by configuring NDPR of the noise data buffer 9 as a FIFO that holds ts worth of audio data and sequentially storing in NDPR the ambient noise audio data captured from the sound driver (input) 12. Here, ts is the time length of the divided audio data D(j) described in the first embodiment.
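
The FIFO arrangement described above might be sketched in Python as follows. This is only an illustration under stated assumptions: the names `make_ndpr`, `ts_seconds`, and `sample_rate` are hypothetical and do not appear in the patent, and a simple integer loop stands in for the samples delivered by the sound driver (input) 12. A `collections.deque` with `maxlen` discards its oldest element on overflow, so the buffer always holds the most recent ts worth of input.

```python
from collections import deque

def make_ndpr(ts_seconds: float, sample_rate: int) -> deque:
    """Return a FIFO that keeps only the most recent ts seconds of samples.

    ts_seconds and sample_rate are illustrative parameters, not from the patent.
    """
    return deque(maxlen=int(ts_seconds * sample_rate))

ndpr = make_ndpr(ts_seconds=0.5, sample_rate=8)  # capacity: 4 samples
for sample in range(10):      # stand-in for samples from the input driver
    ndpr.append(sample)       # the oldest sample is dropped once the FIFO is full
latest = list(ndpr)           # the most recent ts worth of samples
```

With this arrangement the gain adjustment side can read the buffer at any moment without first waiting for a full ts interval to elapse.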

Once the output control unit 21 has notified the start of the audio output processing (step 504), ambient noise audio data captured from the sound driver (input) 12 is stored, ts at a time, alternately in ND(0) and ND(1) of the noise data buffer 9, n-1 times in total (steps 506-514). After the n-1 stores, the processing returns to step 502.
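
The alternating use of ND(0) and ND(1) can be illustrated with a short Python sketch. It assumes that each captured ts-long block is available as one list element; the function name `fill_slots` and the string labels are hypothetical and serve only to show the ordering.

```python
def fill_slots(chunks, n):
    """Store n-1 successive ts-long noise chunks alternately in ND(0) and ND(1).

    Returns the final slot contents and the order in which slots were written.
    """
    nd = [None, None]
    history = []
    for m, chunk in zip(range(n - 1), chunks):
        slot = m % 2              # 0, 1, 0, 1, ...
        nd[slot] = chunk          # overwrite the older of the two slots
        history.append((slot, chunk))
    return nd, history

# Noise blocks for the intervals t1..t5 when the audio is split into n = 6 pieces.
nd, history = fill_slots(["N(t1)", "N(t2)", "N(t3)", "N(t4)", "N(t5)"], n=6)
```

While one slot is being filled, the other still holds the previous block, which is what lets the gain adjustment unit read a complete block without racing the capture.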

Next, the gain adjustment processing performed by the gain adjustment unit 22 is described.
As shown in FIG. 5b, the gain adjustment processing waits for the output control unit 21 to notify the start of the audio output processing (step 552). When notified, it first applies to the divided audio data D(0) (steps 554, 556) a gain adjustment corresponding to the volume level of the ambient noise audio data read from NDPR of the noise data buffer 9 (steps 558, 572), and stores the gain-adjusted audio data in the output buffer memory 8 as PD(0) (steps 554, 566).

Thereafter, for each j from j = 1 to j = n-1 in turn (steps 568, 570), the processing acquires the divided audio data D(j) (step 556), waits until ambient noise audio data of time length ts has been stored in ND(k) of the noise data buffer 9 (step 562), reads the ambient noise audio data from ND(k) (step 564), applies to D(j) a gain adjustment corresponding to the volume level of that ambient noise audio data, and stores the gain-adjusted audio data in the output buffer memory 8 as PD(i) (step 566). Here, i is the remainder of j modulo 2, and k is the remainder of j-1 modulo 2 (step 560).
When the processing up to j = n-1 has finished (step 568), the processing returns to step 552.
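
The index bookkeeping of the gain adjustment processing (i = j mod 2, k = (j-1) mod 2) might look like the following Python sketch. Several things here are assumptions rather than part of the patent: the volume level is taken to be the root mean square of the block (`rms`), the mapping from level to gain (`gain_for`) is a made-up placeholder, and waiting for ND(k) to fill is replaced by reading from a prepared list.

```python
import math

def rms(samples):
    """Volume level of a block, taken here as its root mean square (an assumption)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples)) if samples else 0.0

def gain_for(level):
    """Hypothetical level-to-gain mapping; this passage does not specify one."""
    return 1.0 + level

def adjust_gain(d, ndpr, noise_chunks):
    """d: the n divided blocks D(0)..D(n-1); ndpr: noise for t0;
    noise_chunks: noise for t1..t(n-1), standing in for the ND(k) fills."""
    nd, pd, trace = [None, None], [None, None], []
    g = gain_for(rms(ndpr))               # D(0) uses the pre-recorded noise
    pd[0] = [g * s for s in d[0]]
    trace.append((0, "NDPR", 0, pd[0]))
    for j in range(1, len(d)):
        i, k = j % 2, (j - 1) % 2         # output slot PD(i), noise slot ND(k)
        nd[k] = noise_chunks[j - 1]       # stands in for waiting until ND(k) is full
        g = gain_for(rms(nd[k]))
        pd[i] = [g * s for s in d[j]]
        trace.append((j, f"ND({k})", i, pd[i]))
    return trace

d = [[1.0, -1.0]] * 6                              # six identical blocks
noise = [[float(m)] * 2 for m in range(1, 6)]      # noise level m in interval t_m
trace = adjust_gain(d, [0.0, 0.0], noise)
```

For n = 6 the trace pairs D(0) with NDPR and PD(0), then D(1) with ND(0) and PD(1), D(2) with ND(1) and PD(0), and so on, alternating both the noise slot and the output slot.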

FIG. 6 shows an example of the gain adjustment processing described above.
Suppose, as shown in FIG. 6a, that the audio data D represents audio of time length 6 × ts and is divided into divided audio data D(0) to D(5). Also, as shown in FIG. 6b, let N(t0) to N(t6) denote the ambient noise N in the time intervals t0 to t6, each of time length ts. Here, the time interval t0 is the interval of length ts immediately preceding the time point S at which the audio output processing starts.

In this case, when the audio output processing starts at time point S, the noise data acquisition control unit 23 has already stored in NDPR of the noise data buffer 9 the ambient noise audio data N(t0) of the most recent past time interval t0 of length ts. The gain adjustment unit 22 therefore immediately applies to the first divided audio data D(0) of the audio data D a gain adjustment corresponding to the volume level of the ambient noise N(t0) stored in NDPR, generates audio data GD(0), and sets it in PD(0) of the output buffer memory 8.

Next, as soon as the noise data acquisition control unit 23 has stored the ambient noise audio data N(t1) of time interval t1 in ND(0), the gain adjustment unit 22 adjusts the gain of the second divided audio data D(1) according to the volume level of the ambient noise N(t1) stored in ND(0), generates audio data GD(1), and sets it in PD(1) of the output buffer memory 8.

Then, as soon as the noise data acquisition control unit 23 has stored the ambient noise audio data N(t2) of time interval t2 in ND(1), the gain adjustment unit 22 adjusts the gain of the third divided audio data D(2) according to the volume level of the ambient noise N(t2) stored in ND(1), generates audio data GD(2), and sets it in PD(0) of the output buffer memory 8.

Next, as soon as the noise data acquisition control unit 23 has stored the ambient noise audio data N(t3) of time interval t3 in ND(0), the gain adjustment unit 22 adjusts the gain of the fourth divided audio data D(3) according to the volume level of the ambient noise N(t3) stored in ND(0), generates audio data GD(3), and sets it in PD(1) of the output buffer memory 8.

Thereafter, in the same way, gain adjustment is applied to the divided audio data D(4) and D(5) while alternately using the ambient noise audio data stored in ND(1) and ND(0), and the results are set alternately in PD(0) and PD(1).
Thus, according to the second embodiment, when an audio output request occurs, gain adjustment of the audio data can be started promptly using the constantly held ambient noise audio data of the most recent past, and audio output can therefore also be started promptly.

FIG. 1 is a block diagram showing the configuration of the audio output device according to the first embodiment of the present invention.
FIG. 2 is a flowchart showing the audio output processing according to the first embodiment of the present invention.
FIG. 3 is a diagram showing an example of the audio output processing according to the first embodiment of the present invention.
FIG. 4 is a block diagram showing the configuration of the audio output device according to the second embodiment of the present invention.
FIG. 5 is a flowchart showing the noise data acquisition processing and the gain adjustment processing according to the second embodiment of the present invention.
FIG. 6 is a diagram showing an example of the noise data acquisition processing and the gain adjustment processing according to the second embodiment of the present invention.

Explanation of symbols

1 ... operating system, 2 ... audio output application, 3 ... sound input device, 4 ... microphone, 5 ... sound output device, 6 ... speaker, 7 ... audio data memory, 8 ... output buffer memory, 9 ... noise data buffer, 11 ... sound driver (output), 12 ... sound driver (input), 21 ... output control unit, 22 ... gain adjustment unit, 23 ... noise data acquisition control unit.

Claims

An output audio gain adjustment device applied to an audio output system that includes:
a microphone;
a sound input device that generates ambient audio data representing ambient audio picked up using the microphone;
a memory storing original audio data;
a buffer for storing audio data;
a sound output device for outputting audio to a speaker; and
a sound driver that accepts an issued audio output request, reads the audio data stored in the buffer and designated as output audio data in the accepted audio output request, and causes the sound output device to output the audio represented by the read audio data to the speaker,
the output audio gain adjustment device comprising:
noise level detection means for detecting the volume level of ambient noise represented by the ambient audio data generated by the sound input device; and
audio output processing means for performing, when output of the original audio data stored in the memory is requested, audio output processing that divides the original audio data into pieces of divided audio data each representing audio of a predetermined time length, takes the pieces of divided audio data in turn as target divided audio data at intervals of the predetermined time length, generates audio data representing the audio of the target divided audio data amplified by a gain corresponding to the volume level of ambient noise detected by the noise level detection means at that time, stores the generated audio data in the buffer, and issues to the sound driver an audio output request designating the stored audio data as the output audio data.

The output audio gain adjustment device according to claim 1, wherein
the noise level detection means includes:
ambient noise holding means for holding, at least until the start of the audio output processing by the audio output processing means, the ambient audio data generated by the sound input device over the most recent past period of the predetermined time length; and
volume level calculating means for calculating, at the start of the audio output processing by the audio output processing means, the volume level of the ambient audio data for the most recent past period of the predetermined time length held in the ambient noise holding means as the volume level of the ambient noise to be detected at that time, and thereafter calculating, each time a time interval of the predetermined time length elapses, the volume level of the ambient audio data generated by the sound input device during the most recently elapsed time interval as the volume level of the ambient noise to be detected, and
the audio output processing means, in the audio output processing, each time the volume level calculating means calculates the volume level of the ambient noise, takes the pieces of divided audio data in turn as the target divided audio data and generates the audio data representing the audio of the target divided audio data amplified by a gain corresponding to the calculated volume level of the ambient noise.

The output audio gain adjustment device according to claim 1 or 2, wherein
the noise level detection means detects the volume level of the ambient noise for each frequency band, and
the audio output processing means generates the audio data representing audio obtained by amplifying, for each frequency band, the audio represented by the target divided audio data by a gain corresponding to the volume level of that frequency band of the ambient noise detected by the noise level detection means at each time point.

The output audio gain adjustment device according to claim 1, 2, or 3,
characterized in that it is mounted on an automobile.

A computer program read and executed by a computer that includes:
a microphone;
a sound input device that generates ambient audio data representing ambient audio picked up using the microphone;
a memory storing original audio data;
a buffer for storing audio data;
a sound output device for outputting audio to a speaker; and
a sound driver that accepts an issued audio output request, reads the audio data stored in the buffer and designated as output audio data in the accepted audio output request, and causes the sound output device to output the audio represented by the read audio data to the speaker,
the computer program causing the computer to function as:
noise level detection means for detecting the volume level of ambient noise represented by the ambient audio data generated by the sound input device; and
audio output processing means for performing, when output of the original audio data stored in the memory is requested, audio output processing that divides the original audio data into pieces of divided audio data each representing audio of a predetermined time length, takes the pieces of divided audio data in turn as target divided audio data at intervals of the predetermined time length, generates audio data representing the audio of the target divided audio data amplified by a gain corresponding to the volume level of ambient noise detected by the noise level detection means at that time, stores the generated audio data in the buffer, and issues to the sound driver an audio output request designating the stored audio data as the output audio data.

The computer program according to claim 5, wherein
the noise level detection means includes:
ambient noise holding means for holding, at least until the start of the audio output processing by the audio output processing means, the ambient audio data generated by the sound input device over the most recent past period of the predetermined time length; and
volume level calculating means for calculating, at the start of the audio output processing by the audio output processing means, the volume level of the ambient audio data for the most recent past period of the predetermined time length held in the ambient noise holding means as the volume level of the ambient noise to be detected at that time, and thereafter calculating, each time a time interval of the predetermined time length elapses, the volume level of the ambient audio data generated by the sound input device during the most recently elapsed time interval as the volume level of the ambient noise to be detected, and
the audio output processing means, in the audio output processing, each time the volume level calculating means calculates the volume level of the ambient noise, takes the pieces of divided audio data in turn as the target divided audio data and generates the audio data representing the audio of the target divided audio data amplified by a gain corresponding to the calculated volume level of the ambient noise.