JP7487060B2 - Audio device and audio control method

Info

Publication number: JP7487060B2
Application number: JP2020162215A
Authority: JP
Inventors: 浩二阪本
Original assignee: Denso Ten Ltd
Current assignee: Denso Ten Ltd
Priority date: 2020-09-28
Filing date: 2020-09-28
Publication date: 2024-05-20
Anticipated expiration: 2040-09-28
Also published as: JP2022054933A

Description

本発明は、音響装置および音響制御方法に関する。 The present invention relates to an audio device and an audio control method.

従来、例えば、ステレオ音源から疑似的に残響音の信号を生成し、音源とともに残響音を出力することで、複数のチャンネルでサラウンド再生する技術がある（例えば、特許文献１参照）。 Conventionally, for example, there is a technology that generates pseudo-reverberant sound signals from a stereo sound source and outputs the reverberant sound together with the sound source to reproduce surround sound on multiple channels (see, for example, Patent Document 1).

特許第５３７２１４２号公報 Japanese Patent No. 5372142

しかしながら、従来技術では、より高音質なサラウンド再生を行う点で改善の余地があった。 However, conventional technology leaves room for improvement when it comes to providing higher quality surround sound.

本発明は、上記に鑑みてなされたものであって、高音質なサラウンド再生を行うことができる音響装置および音響制御方法を提供することを目的とする。 The present invention has been made in consideration of the above, and aims to provide an audio device and an audio control method that can perform high-quality surround sound reproduction.

上述した課題を解決し、目的を達成するために、本発明に係る音響装置は、音源信号に基づいて、前記音源信号に含まれる音響成分の特徴である特徴情報を検出し、前記特徴情報に基づいて、前記音源信号から疑似的な残響音を生成するためのフィルタのゲインを決定し、前記フィルタのゲインとして、前記決定されたゲインを設定する。 To solve the above-mentioned problem and achieve the object, the audio device according to the present invention detects, based on a sound source signal, feature information that characterizes the acoustic components contained in the sound source signal, determines, based on the feature information, the gain of a filter for generating pseudo-reverberation sound from the sound source signal, and sets the determined gain as the gain of the filter.

本発明によれば、高音質なサラウンド再生を行うことができる。 The present invention allows for high quality surround sound reproduction.

図１Ａは、実施形態に係る音響制御方法の概要を示す図である。 FIG. 1A is a diagram showing an overview of an acoustic control method according to an embodiment.
図１Ｂは、実施形態に係る音響制御方法の概要を示す図である。 FIG. 1B is a diagram showing an overview of the acoustic control method according to the embodiment.
図２は、実施形態に係る音響装置の構成例を示すブロック図である。 FIG. 2 is a block diagram illustrating an example of the configuration of the audio device according to the embodiment.
図３は、決定部の決定処理を示す図である。 FIG. 3 is a diagram illustrating a determination process performed by the determination unit.
図４は、実施形態に係る音響装置によって実行される処理の処理手順を示すフローチャートである。 FIG. 4 is a flowchart showing the processing procedure of the processing executed by the audio device according to the embodiment.

以下、添付図面を参照して、本願の開示する音響装置および音響制御方法の実施形態を詳細に説明する。なお、以下に示す実施形態により本発明が限定されるものではない。 Embodiments of the audio device and audio control method disclosed in this application will be described in detail below with reference to the attached drawings. Note that the present invention is not limited to the embodiments described below.

まず、図１Ａおよび図１Ｂを用いて、実施形態に係る音響制御方法の概要について説明する。図１Ａおよび図１Ｂは、実施形態に係る音響制御方法の概要を示す図である。実施形態に係る音響制御方法は、例えば、図１Ａに示す音響装置１によって実行される。 First, an overview of the acoustic control method according to the embodiment will be described with reference to Figs. 1A and 1B. Figs. 1A and 1B are diagrams showing an overview of the acoustic control method according to the embodiment. The acoustic control method according to the embodiment is executed, for example, by the acoustic device 1 shown in Fig. 1A.

図１Ａでは、車内空間において、前方の左右に配置された２つのスピーカＦＲ，ＦＬから音源信号である直接音および実際の残響音を出力し、後方の左右に配置された２つのスピーカＲＬ，ＲＲから疑似的な残響音（以下、疑似残響音）を出力することでサラウンド再生する場合を示している。 FIG. 1A shows a case in which surround sound is reproduced in a vehicle interior by outputting the direct sound and the actual reverberant sound, which make up the sound source signal, from two speakers FR and FL located at the front left and right, and outputting pseudo reverberant sound (hereinafter, "pseudo-reverberation sound") from two speakers RL and RR located at the rear left and right.

ここで、音源信号は、２つのスピーカＦＲ，ＦＬそれぞれから異なる音を出力することで空間的な広がり（音像幅）をもった音源の信号である。つまり、音源信号は、２つのチャンネル（スピーカＦＲ，ＦＬ）でステレオ再生されるステレオ信号である。 The sound source signal here is a signal from a sound source that has a spatial spread (sound image width) by outputting different sounds from each of the two speakers FR and FL. In other words, the sound source signal is a stereo signal that is reproduced in stereo on two channels (speakers FR and FL).

また、音源信号は、クラシック音楽やオペラ等のような複数の楽器音や音声（ボーカル）が混在した音、すなわち、複数の音源が混在する音の信号であってもよく、ピアノのみやバイオリンのみといった単一の楽器（音源）の音の信号であってもよい。 The sound source signal may be a signal of a mixture of sounds from multiple musical instruments and voices (vocals), such as classical music or opera, i.e., a signal of a mixture of sounds from multiple sound sources, or a signal of the sound of a single instrument (sound source), such as only a piano or only a violin.

実施形態に係る音響制御方法では、疑似残響音を生成するフィルタのゲインを音源（楽器や音声）毎に変えることで、１曲の音源信号においてより自然な疑似残響音を出力できるようになるため、時間とともに音源が変化するような信号が好適である。なお、フィルタは、例えば、ＦＩＲ（Finite Impulse Response）フィルタや、ＩＩＲ（Infinite Impulse Response）フィルタ等のインパルス応答性のフィルタである。 In the acoustic control method according to the embodiment, the gain of the filter that generates the artificial reverberation sound is changed for each sound source (musical instrument or voice), so that a more natural artificial reverberation sound can be output from the sound source signal of a single piece of music. Therefore, a signal in which the sound source changes over time is preferable. The filter is, for example, an impulse response filter such as an FIR (Finite Impulse Response) filter or an IIR (Infinite Impulse Response) filter.

具体的には、実施形態に係る音響制御方法では、音響装置１が音源信号に含まれる音源毎の特徴に基づいて音源毎にゲインが最適化されたフィルタを用いて疑似残響音を生成することで、高音質なサラウンド再生を実現する。 Specifically, in the acoustic control method according to the embodiment, the acoustic device 1 generates pseudo-reverberation sound using a filter whose gain is optimized for each sound source based on the characteristics of each sound source contained in the sound source signal, thereby achieving high-quality surround sound reproduction.

ここで、図１Ａに示すように、空間において音源Ｓからの音響を受聴するリスナＬは、２種類の空間印象を知覚できることが知られている。一方の空間印象は、直接音と時間的にも空間的にも融合して知覚される「みかけの音源の幅」と定義される音像幅ＡＳＷであり、他方の空間印象は「みかけの音源以外の音源によって聴き手のまわりが満たされている感じ」と定義される包まれ感ＬＥＶである。つまり、音像幅ＡＳＷは、音源信号の初期成分である直接音および初期反射音成分に由来した音像であり、包まれ感ＬＥＶは、音源信号の後期成分である残響音成分に由来した音像である。 As shown in FIG. 1A, it is known that a listener L who hears sound from a sound source S in a space can perceive two types of spatial impressions. One spatial impression is the sound image width ASW, which is defined as the "width of the apparent sound source" that is perceived as blending with the direct sound in both time and space, and the other spatial impression is the sense of envelopment LEV, which is defined as the "feeling that the listener is surrounded by sound sources other than the apparent sound source." In other words, the sound image width ASW is a sound image derived from the direct sound and early reflected sound components, which are the early components of the sound source signal, and the sense of envelopment LEV is a sound image derived from the reverberation sound components, which are the later components of the sound source signal.

これら音像幅ＡＳＷと包まれ感ＬＥＶを設計および評価するにあたっては、いわゆる「第一波面の法則」を用いた指標を利用する場合がある。かかる指標では、図１Ｂに示すように、２つの閾値ＴＨ１，ＴＨ２によって区画された２つの領域Ｒ１，Ｒ２（閾値範囲）が定義される。 When designing and evaluating the sound image width ASW and the sense of envelopment LEV, an index using the so-called "law of the first wave front" may be used. In this index, as shown in Figure 1B, two regions R1 and R2 (threshold ranges) are defined, which are separated by two thresholds TH1 and TH2.

領域Ｒ１は、音源信号に含まれる成分のうち、主として直接音を含む成分（初期成分）が含まれる領域である。例えば、領域Ｒ１の初期成分が大きいと音像幅ＡＳＷが大きくなるため、聴感上、拡散されているとリスナＬが感じることで音源によっては音質が悪い（不明瞭である）と評価される。なお、直接音とは、例えば、音声（ボーカル）や楽器等から直接録音した音であり、壁等で反射した音を含まない音である。 Region R1 is a region that contains, among the components contained in the sound source signal, components (initial components) that mainly contain direct sound. For example, if the initial components of region R1 are large, the sound image width ASW becomes large, and the listener L may perceive the sound as being diffused, and depending on the sound source, the sound quality may be evaluated as poor (unclear). Note that direct sound is, for example, sound recorded directly from voice (vocals) or musical instruments, and does not include sound reflected by walls, etc.

また、領域Ｒ２は、音源信号に含まれる成分のうち、主として残響音を含む成分（後期成分）が含まれる領域である。例えば、領域Ｒ２の残響音の成分が大きいと包まれ感ＬＥＶが大きくなるため、聴感上、拡散されているとリスナＬが感じることで包まれ感が充実すると評価される。なお、残響音とは、例えば、音声や楽器等の音が壁等で反射した音を録音した音であり、直接音から時間的に遅れた音である。 Region R2 is a region that contains, among the components contained in the sound source signal, components that mainly contain reverberation sounds (late components). For example, if the reverberation sound components in region R2 are large, the sense of envelopment LEV increases, and the listener L will perceive the sound as being diffused, which will result in a rich sense of envelopment. Note that reverberation sounds are, for example, recorded sounds of voices, musical instruments, etc. that are reflected by walls, etc., and are sounds that are delayed in time from the direct sound.

つまり、第１波面の法則において、領域Ｒ１に含まれる初期成分と領域Ｒ２に含まれる後期成分とが明確に分離できた場合に、音像の明確化と包まれ感の充実化とを両立させることができる。 In other words, in the first wave front law, if the early components contained in region R1 and the later components contained in region R2 can be clearly separated, it is possible to achieve both a clear sound image and a rich sense of envelopment.

ここで、従来、音源のＬＲチャンネルの無相関成分を抽出することで、残響音を付加する信号処理が用いられる場合があり、その場合、ＬＲチャンネルの相関成分であるボーカル等のみ明瞭に、その他の楽器等は不明瞭に、という不自然な聴こえ方となる場合がある。このように、従来は、高音質なサラウンド再生を行う点で改善の余地があった。 Conventionally, signal processing has sometimes been used that adds reverberation by extracting the uncorrelated components of the left and right channels of the sound source. In such cases, only the vocals and similar sounds, which are correlated components of the left and right channels, are heard clearly, while the other instruments are heard indistinctly, which can make the playback sound unnatural. The conventional technology therefore left room for improvement in achieving high-quality surround reproduction.

実施形態に係る音響制御方法では、第１波面の法則における閾値範囲を音源の特徴に応じて変化させ、フィルタで生成される疑似残響音がかかる閾値範囲の領域Ｒ２に収まるようなフィルタのゲインを決定する。 In the acoustic control method according to the embodiment, the threshold range in the first wave front law is changed according to the characteristics of the sound source, and the gain of the filter is determined so that the pseudo-reverberation sound generated by the filter falls within region R2 of the threshold range.

具体的には、実施形態に係る音響制御方法では、まず、音源信号に基づいて、音源信号に含まれる音響成分（直接音成分および残響音成分）の特徴である特徴情報を検出する。なお、特徴情報の詳細については後述する。 Specifically, in the acoustic control method according to the embodiment, first, feature information that is characteristic of the acoustic components (direct sound components and reverberation sound components) contained in the sound source signal is detected based on the sound source signal. Details of the feature information will be described later.

つづいて、実施形態に係る音響制御方法では、特徴情報に基づいて、疑似残響音を生成するためのフィルタのゲインを決定する。そして、実施形態に係る音響制御方法では、決定したゲインが設定されたフィルタを用いて、疑似残響音を生成する。 Next, in the acoustic control method according to the embodiment, the gain of a filter for generating artificial reverberation sound is determined based on the feature information. Then, in the acoustic control method according to the embodiment, the artificial reverberation sound is generated using a filter to which the determined gain is set.

これにより、例えば、音声や、打楽器、クラシック音楽等のように、音源の残響音の特徴が異なる場合であっても、音源毎に最適な疑似残響音を生成でき、音源の特徴に応じて最適なサラウンド再生が可能となる。すなわち、実施形態に係る音響制御方法によれば、高音質なサラウンド再生を行うことができる。 As a result, even if the characteristics of the reverberation sound of the sound source are different, such as for voice, percussion instruments, classical music, etc., it is possible to generate an optimal pseudo-reverberation sound for each sound source, and optimal surround reproduction according to the characteristics of the sound source is possible. In other words, the acoustic control method according to the embodiment makes it possible to perform high-quality surround reproduction.

次に、図２を用いて、実施形態に係る音響装置１の構成例について説明する。図２は、実施形態に係る音響装置１の構成例を示すブロック図である。図２に示す音響装置１は、音源装置１００と、複数のスピーカＦＬ，ＦＲ，ＲＬ，ＲＲとに接続される。また、図２に示すように、音響装置１は、制御部２と、記憶部３とを備える。 Next, a configuration example of the acoustic device 1 according to the embodiment will be described with reference to FIG. 2. FIG. 2 is a block diagram showing a configuration example of the acoustic device 1 according to the embodiment. The acoustic device 1 shown in FIG. 2 is connected to a sound source device 100 and a plurality of speakers FL, FR, RL, and RR. As shown in FIG. 2, the acoustic device 1 also includes a control unit 2 and a storage unit 3.

音源装置１００は、音源信号を音響装置１へ出力する。音源信号は、例えば、ステレオ信号であり、２つのチャンネルである２つのスピーカＦＬ，ＦＲからそれぞれ異なる信号が出力されることで空間的な広がりをもった音源Ｓとなる。 The sound source device 100 outputs a sound source signal to the audio device 1. The sound source signal is, for example, a stereo signal, and a sound source S with spatial expansion is generated by outputting different signals from two speakers FL and FR, which are two channels.

複数のスピーカＦＬ，ＦＲ，ＲＬ，ＲＲは、音響装置１が出力される信号を音として出力する。具体的には、スピーカＦＬ，ＦＲは、音源信号である直接音を出力し、スピーカＲＬ，ＲＲは、音源信号から生成された疑似残響音を出力する。 The multiple speakers FL, FR, RL, and RR output the signal output by the audio device 1 as sound. Specifically, the speakers FL and FR output direct sound, which is a sound source signal, and the speakers RL and RR output pseudo-reverberation sound generated from the sound source signal.

ここで、音響装置１は、たとえば、ＣＰＵ（Central Processing Unit）、ＲＯＭ（Read Only Memory）、ＲＡＭ（Random Access Memory）、フラッシュメモリ、入出力ポートなどを有するコンピュータや各種の回路を含む。 Here, the audio device 1 includes, for example, a computer having a CPU (Central Processing Unit), ROM (Read Only Memory), RAM (Random Access Memory), flash memory, input/output ports, and various other circuits.

コンピュータのＣＰＵは、たとえば、ＲＯＭに記憶されたプログラムを読み出して実行することによって、制御部２の取得部２１、検出部２２、決定部２３、生成部２４および出力部２５として機能する。 The computer's CPU functions as the acquisition unit 21, detection unit 22, determination unit 23, generation unit 24 and output unit 25 of the control unit 2, for example, by reading and executing a program stored in the ROM.

また制御部２の取得部２１、検出部２２、決定部２３、生成部２４および出力部２５の少なくともいずれか一つまたは全部をＡＳＩＣ（Application Specific Integrated Circuit）やＦＰＧＡ（Field Programmable Gate Array）等のハードウェアで構成することもできる。 In addition, at least one or all of the acquisition unit 21, detection unit 22, decision unit 23, generation unit 24 and output unit 25 of the control unit 2 can be configured with hardware such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).

また、記憶部３は、ＲＡＭやフラッシュメモリに対応する。ＲＡＭやフラッシュメモリは、各種プログラムの情報等を記憶することができる。なお、音響装置１は、有線や無線のネットワークで接続された他のコンピュータや可搬型記録媒体を介して上記したプログラムや各種情報を取得することとしてもよい。 The storage unit 3 corresponds to the RAM or the flash memory. The RAM or the flash memory can store information such as various programs. The audio device 1 may also acquire the above-mentioned programs and various information from another computer connected over a wired or wireless network, or from a portable recording medium.

次に、制御部２の各機能（取得部２１、検出部２２、決定部２３、生成部２４および出力部２５）について詳細に説明する。 Next, each function of the control unit 2 (acquisition unit 21, detection unit 22, determination unit 23, generation unit 24, and output unit 25) will be described in detail.

取得部２１は、各種情報や信号を取得する。例えば、取得部２１は、音源装置１００から音源信号を取得する。例えば、取得部２１は、ステレオ信号である音源信号を取得する。具体的には、取得部２１は、２つのチャンネルである２つのスピーカＦＬ，ＦＲそれぞれから出力される音源信号を取得する。 The acquisition unit 21 acquires various information and signals. For example, the acquisition unit 21 acquires a sound source signal from the sound source device 100, and this sound source signal is a stereo signal. Specifically, the acquisition unit 21 acquires the sound source signals that are to be output from each of the two speakers FL and FR, which correspond to the two channels.

また、取得部２１は、音源信号に関する音源情報を取得する。音源情報は、例えば、音源信号の種別（ジャンル）や、録音環境に関する情報を取得する。音源信号の種別は、例えば、音声や、楽器（打楽器や管楽器等）、クラシック音楽等である。録音環境は、例えば、レコーディングスタジオや、コンサートホール等の録音した場所の情報である。 The acquisition unit 21 also acquires sound source information related to the sound source signal. The sound source information acquired includes, for example, the type (genre) of the sound source signal and information related to the recording environment. The type of sound source signal includes, for example, voice, musical instruments (percussion instruments, wind instruments, etc.), classical music, etc. The recording environment includes, for example, information on the location where the sound was recorded, such as a recording studio or concert hall.

取得部２１は、例えば、リスナＬ等のユーザによる入力により音源情報を取得したり、インターネットを介してサーバ等から音源情報を取得する。また、取得部２１は、取得した音源信号を解析して音源情報を取得してもよい。 The acquisition unit 21 acquires the sound source information, for example, through an input by a user such as a listener L, or acquires the sound source information from a server or the like via the Internet. The acquisition unit 21 may also acquire the sound source information by analyzing the acquired sound source signal.

検出部２２は、音源信号に基づいて、音源信号に含まれる音響成分の特徴である特徴情報を検出する。 The detection unit 22 detects feature information, which is a feature of the acoustic components contained in the sound source signal, based on the sound source signal.

例えば、検出部２２は、音響成分の特徴に関する特徴情報として、２つのチャンネルそれぞれで再生される音源信号の関係性に関するチャンネル関係情報を検出する。具体的には、チャンネル関係情報は、ＬＲチャンネル差やＬＲチャンネル相関の情報を含む。 For example, the detection unit 22 detects channel relationship information related to the relationship between the sound source signals reproduced on each of the two channels as feature information related to the features of the acoustic components. Specifically, the channel relationship information includes information on the LR channel difference and the LR channel correlation.

ＬＲチャンネル差とは、２つのチャンネルでステレオ再生される２つの音源信号の差に関する情報である。具体的には、ＬＲチャンネル差は、チャンネル間レベル差（ＩＣＬＤ：Inter-channel Level Difference）や、チャンネル間時間差（Inter-channel Time Difference）である。 The LR channel difference is information about the difference between the two sound source signals reproduced in stereo on the two channels. Specifically, the LR channel difference is the inter-channel level difference (ICLD) or the inter-channel time difference (ICTD).

ＬＲチャンネル相関とは、２つのチャンネルでステレオ再生される２つの音源信号の相関成分に関する情報である。具体的には、ＬＲチャンネル相関は、チャンネル間の相互相関（ＩＣＣ：Inter-channel Cross Correlation）である。 The LR channel correlation is information about the correlation components of two sound source signals reproduced in stereo on two channels. Specifically, the LR channel correlation is the inter-channel cross correlation (ICC) between the channels.

このように、検出部２２は、特徴情報として、チャンネル関係情報を検出することで、音響成分を高精度に検出することができるため、後段の生成部２４におけるフィルタ処理により最適な疑似残響音を生成することができる。 In this way, the detection unit 22 can detect acoustic components with high accuracy by detecting channel relationship information as feature information, and therefore can generate optimal pseudo-reverberation sounds through filtering in the downstream generation unit 24.
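The channel relationship information described above can be illustrated with a minimal sketch. The text does not give formulas, so the definitions below are assumptions based on common usage: ICLD as a power ratio in dB, ICC as the peak of the normalized cross-correlation, and ICTD as the lag (in samples) at which that peak occurs. The function names `icld_db`, `cross_correlation`, and `icc_and_ictd` are hypothetical, and both channels are assumed to have the same length.

```python
import math


def icld_db(left, right, eps=1e-12):
    """Inter-channel level difference in dB (left relative to right)."""
    power_left = sum(x * x for x in left) / len(left)
    power_right = sum(x * x for x in right) / len(right)
    return 10.0 * math.log10((power_left + eps) / (power_right + eps))


def cross_correlation(left, right, lag):
    """Normalized cross-correlation; a positive lag means right lags left."""
    n = len(left)
    # Only sum over indices where both left[i] and right[i + lag] exist.
    num = sum(left[i] * right[i + lag]
              for i in range(max(0, -lag), min(n, n - lag)))
    den = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    return num / den if den > 0 else 0.0


def icc_and_ictd(left, right, max_lag):
    """Return (ICC, ICTD in samples) searched over lags in [-max_lag, max_lag]."""
    best_lag = max(range(-max_lag, max_lag + 1),
                   key=lambda lag: abs(cross_correlation(left, right, lag)))
    return abs(cross_correlation(left, right, best_lag)), best_lag
```

For a right channel that is a delayed and attenuated copy of the left channel, the ICC stays close to 1 while the ICLD and ICTD are non-zero, so the three quantities capture different aspects of the stereo relationship.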

そして、検出部２２は、チャンネル関係情報を一定間隔で連続して検出して時系列に並べ、検出した時系列のチャンネル関係情報について移動平均を算出することで、かかる移動平均を特徴情報として検出する。これにより、チャンネル関係情報の急峻な変化を平滑化できるため、後段の決定部２３によって決定されるゲインの急峻な変化を平滑化できる。この結果、生成部２４によって生成される疑似残響音の変化を滑らかにすることができるため、より自然なサラウンド再生を実現できる。 The detection unit 22 then continuously detects the channel relationship information at regular intervals, arranges the information in a time series, and calculates a moving average of the detected time series channel relationship information, thereby detecting the moving average as feature information. This makes it possible to smooth out any sudden changes in the channel relationship information, and therefore to smooth out any sudden changes in the gain determined by the determination unit 23 in the subsequent stage. As a result, it is possible to smooth out the changes in the pseudo reverberation sound generated by the generation unit 24, thereby achieving more natural surround playback.

なお、検出部２２は、時系列のチャンネル関係情報のうち、音源信号の音圧レベル（振幅）が所定値未満の区間についてはチャンネル関係情報を最大化した後移動平均する。つまり、検出部２２は、時系列のチャンネル関係情報のうち、音源信号の音圧レベルが所定値未満の区間はサラウンドを出さず所定値未満の区間周辺はサラウンドを出さない方向へ制御するよう移動平均を算出する。 For sections of the time-series channel relationship information in which the sound pressure level (amplitude) of the sound source signal is less than a predetermined value, the detection unit 22 first sets the channel relationship information to its maximum and then calculates the moving average. In other words, the detection unit 22 calculates the moving average so that no surround sound is output in sections where the sound pressure level of the sound source signal is less than the predetermined value, and so that the output around those sections is steered toward producing no surround sound.

これにより、打楽器のソロ演奏のように打点間に無音が存在するような場合に第一波面の法則が成立する上限が低くなるため、打楽器に対しフィルタゲインを下げて適したサラウンドレベルへ制御される。 As a result, when there is silence between beats, as in a solo percussion performance, the upper limit at which the law of the first wave front holds becomes lower, so the filter gain for the percussion instrument is lowered and the surround level is controlled to a suitable value.
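The smoothing step can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify the averaging window, the level threshold, or whether the average is centered or trailing, so a trailing (causal) moving average is used, and the function name `smooth_channel_relation` and the parameters `level_floor`, `max_value`, and `window` are hypothetical.

```python
def smooth_channel_relation(values, levels, level_floor, max_value, window):
    """Trailing moving average of per-frame channel relationship values.

    values: channel relationship information detected at regular intervals.
    levels: sound pressure level of the source signal in the same frames.
    Frames whose level is below level_floor are first replaced by max_value,
    as described in the text, and the result is then averaged.
    """
    held = [max_value if level < level_floor else value
            for value, level in zip(values, levels)]
    smoothed = []
    running_sum = 0.0
    for i, value in enumerate(held):
        running_sum += value
        if i >= window:
            running_sum -= held[i - window]  # drop the frame leaving the window
        smoothed.append(running_sum / min(i + 1, window))
    return smoothed
```

Because the replaced frames take part in the average, their influence also spreads to the neighbouring frames inside the window, which matches the description that the sections around a low-level section are affected as well.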

また、検出部２２は、音響成分の継続時間や周波数に関する情報を特徴情報として検出する。 The detection unit 22 also detects information related to the duration and frequency of the acoustic components as feature information.

継続時間とは、音源信号が継続する時間である。具体的には、検出部２２は、音源信号の包絡線を算出し、算出した包絡線を微分後、再度包絡線を算出し、得られた包絡線の傾きから継続時間を検出する。 The duration is the time that the sound source signal continues. Specifically, the detection unit 22 calculates the envelope of the sound source signal, differentiates the calculated envelope, and then calculates the envelope again, and detects the duration from the slope of the resulting envelope.

周波数とは、音源信号の周波数成分に関する情報である。具体的には、検出部２２は、時間で表現される音源信号をフーリエ変換することで、周波数で表現された音源信号の周波数の重心を検出する。 The frequency is information about the frequency components of the sound source signal. Specifically, the detection unit 22 applies a Fourier transform to the time-domain sound source signal and detects the center of gravity of the frequency (the spectral centroid) of the resulting frequency-domain signal.

このように、検出部２２は、特徴情報として、継続時間や周波数の情報を検出することで、音響成分を高精度に検出することができるため、後段の生成部２４におけるフィルタ処理により最適な疑似残響音を生成することができる。 In this way, the detection unit 22 can detect acoustic components with high accuracy by detecting duration and frequency information as feature information, and therefore can generate optimal pseudo-reverberation sounds through filtering in the downstream generation unit 24.
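The frequency feature can be illustrated with a small sketch that computes the center of gravity of the magnitude spectrum. The text only says that a Fourier transform is used, so the magnitude weighting, the direct O(N²) DFT (chosen to keep the example dependency-free), and the function name `spectral_centroid` are assumptions made here for illustration.

```python
import cmath
import math


def spectral_centroid(signal, sample_rate):
    """Magnitude-weighted mean frequency (Hz) over bins 0..N/2 of a DFT."""
    n = len(signal)
    weighted_sum = 0.0
    magnitude_sum = 0.0
    for k in range(n // 2 + 1):
        # Direct DFT of bin k; a real implementation would use an FFT.
        bin_value = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                        for t in range(n))
        magnitude = abs(bin_value)
        weighted_sum += (k * sample_rate / n) * magnitude
        magnitude_sum += magnitude
    return weighted_sum / magnitude_sum if magnitude_sum > 0 else 0.0
```

A signal dominated by low-pitched content yields a low centroid and a bright, high-pitched signal yields a high one, which gives the single number per frame that the later threshold correction can act on.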

決定部２３は、後述する生成部２４が用いるフィルタのゲインを決定する。具体的には、決定部２３は、検出部２２が検出した特徴情報および取得部２１によって取得された音源情報の少なくとも一方に基づいて、フィルタのゲインを決定する。 The determination unit 23 determines the gain of a filter used by the generation unit 24, which will be described later. Specifically, the determination unit 23 determines the gain of the filter based on at least one of the feature information detected by the detection unit 22 and the sound source information acquired by the acquisition unit 21.

具体的には、決定部２３は、特徴情報および音源情報の少なくとも一方に基づいて、第１波面の法則から求めた閾値範囲を補正し、補正後の閾値範囲に基づいてゲインを決定する。ここで、第１波面の法則を元にしたゲインの決定処理について図３を用いて説明する。 Specifically, the determination unit 23 corrects the threshold range determined from the first wave front law based on at least one of the feature information and the sound source information, and determines the gain based on the corrected threshold range. Here, the process of determining the gain based on the first wave front law will be described with reference to FIG. 3.

図３は、決定部２３によるゲインの決定処理を示す図である。図３に示すように、決定部２３は、まず、第１波面の法則から求めた閾値範囲を特徴情報（チャンネル関係情報、継続時間および周波数）および音源情報に基づいて補正する。 Figure 3 is a diagram showing the gain determination process performed by the determination unit 23. As shown in Figure 3, the determination unit 23 first corrects the threshold range obtained from the first wave front law based on the feature information (channel relationship information, duration, and frequency) and sound source information.

例えば、図３の中段のグラフに示す閾値範囲である領域Ｒ１および領域Ｒ２を基準範囲とする。かかる場合、決定部２３は、特徴情報および音源情報に基づいて、閾値ＴＨ１および閾値ＴＨ２を時間および振幅の２軸で補正することで、領域Ｒ１および領域Ｒ２を補正する。 For example, the threshold ranges shown in the graph in the middle of FIG. 3, regions R1 and R2, are set as the reference ranges. In this case, the determination unit 23 corrects regions R1 and R2 by correcting thresholds TH1 and TH2 on the two axes of time and amplitude based on the feature information and sound source information.

例えば、決定部２３は、チャンネル関係情報が大きい程、閾値ＴＨ１および閾値ＴＨ２を時間および振幅が大きくなる方向に補正する。つまり、決定部２３は、チャンネル関係情報が大きい程、領域Ｒ１および領域Ｒ２が大きくなるように補正する。 For example, the larger the channel relationship information, the more the determination unit 23 corrects the thresholds TH1 and TH2 so that the time and amplitude become larger. In other words, the larger the channel relationship information, the more the determination unit 23 corrects the regions R1 and R2 so that they become larger.

一方、決定部２３は、チャンネル関係情報が小さい程、閾値ＴＨ１および閾値ＴＨ２を時間および振幅が小さくなる方向に補正する。つまり、決定部２３は、チャンネル関係情報が小さい程、領域Ｒ１および領域Ｒ２が小さくなるように補正する。 On the other hand, the smaller the channel relationship information, the more the determination unit 23 corrects the thresholds TH1 and TH2 in the direction of decreasing the time and amplitude. In other words, the smaller the channel relationship information, the more the determination unit 23 corrects the regions R1 and R2 to become smaller.

なお、チャンネル関係情報であるチャンネル間レベル差が大きい程、または、チャンネル間時間差が大きい程、相互相関が低い程（無相関成分が多い程）、チャンネル関係情報が大きくなる。 The channel relationship information becomes larger the greater the inter-channel level difference, or the greater the inter-channel time difference, or the lower the cross-correlation (the more uncorrelated components there are).

また、決定部２３は、音響成分の継続時間が長い程、閾値ＴＨ１および閾値ＴＨ２を時間および振幅が大きくなる方向に補正する。つまり、決定部２３は、音響成分の継続時間が長い程、領域Ｒ１および領域Ｒ２が大きくなるように補正する。 In addition, the determination unit 23 corrects the thresholds TH1 and TH2 in the direction of increasing the time and amplitude as the duration of the sound component becomes longer. In other words, the determination unit 23 corrects the regions R1 and R2 so that they become larger as the duration of the sound component becomes longer.

一方、決定部２３は、音響成分の継続時間が短い程、閾値ＴＨ１および閾値ＴＨ２を時間および振幅が小さくなる方向に補正する。つまり、決定部２３は、音響成分の継続時間が短い程、領域Ｒ１および領域Ｒ２が小さくなるように補正する。 On the other hand, the determination unit 23 corrects the thresholds TH1 and TH2 in a direction that reduces the time and amplitude as the duration of the sound component becomes shorter. In other words, the determination unit 23 corrects the regions R1 and R2 so that they become smaller as the duration of the sound component becomes shorter.

また、決定部２３は、音響成分の周波数（重心）が低い程、閾値ＴＨ１および閾値ＴＨ２を時間および振幅が大きくなる方向に補正する。つまり、決定部２３は、音響成分の周波数（重心）が低い程、領域Ｒ１および領域Ｒ２が大きくなるように補正する。 The determination unit 23 also corrects the thresholds TH1 and TH2 in a direction in which the time and amplitude increase as the frequency (center of gravity) of the sound component decreases. In other words, the determination unit 23 corrects the regions R1 and R2 to become larger as the frequency (center of gravity) of the sound component decreases.

一方、決定部２３は、音響成分の周波数（重心）が高い程、閾値ＴＨ１および閾値ＴＨ２を時間および振幅が小さくなる方向に補正する。つまり、決定部２３は、音響成分の周波数（重心）が高い程、領域Ｒ１および領域Ｒ２が小さくなるように補正する。 On the other hand, the determination unit 23 corrects the thresholds TH1 and TH2 in a direction that reduces the time and amplitude as the frequency (center of gravity) of the sound component increases. In other words, the determination unit 23 corrects the regions R1 and R2 to be smaller as the frequency (center of gravity) of the sound component increases.

また、決定部２３は、音源情報から音源信号がクラシックである場合には、閾値ＴＨ１および閾値ＴＨ２を時間および振幅が大きくなる方向に補正する。つまり、決定部２３は、音源信号がクラシックである場合には、領域Ｒ１および領域Ｒ２が大きくなるように補正する。 Furthermore, when the sound source signal is determined to be classical based on the sound source information, the determination unit 23 corrects the thresholds TH1 and TH2 in the direction of increasing the time and amplitude. In other words, when the sound source signal is classical, the determination unit 23 corrects the regions R1 and R2 to become larger.

一方、決定部２３は、音源情報から音源信号が音声である場合には、閾値ＴＨ１および閾値ＴＨ２を時間および振幅が小さくなる方向に補正する。つまり、決定部２３は、音源信号が音声である場合には、領域Ｒ１および領域Ｒ２が小さくなるように補正する。 On the other hand, when the sound source information indicates that the sound source signal is voice, the determination unit 23 corrects the thresholds TH1 and TH2 in a direction that reduces the time and amplitude. In other words, when the sound source signal is voice, the determination unit 23 corrects the regions R1 and R2 to be smaller.
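The correction rules above fix only the direction of each adjustment, not its size. The sketch below shows one way such monotonic rules could be combined. Everything numeric in it is an assumption for illustration: the multiplicative form, the clamping ranges, the reference duration and centroid, the genre factors, and the names `Threshold`, `correction_factor`, and `correct_threshold` do not come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    time_ms: float    # position on the time axis
    amplitude: float  # position on the amplitude axis


# Hypothetical genre factors: classical enlarges, voice shrinks the regions.
GENRE_FACTOR = {"classical": 1.25, "voice": 0.8}


def _clamp(value, low, high):
    return max(low, min(high, value))


def correction_factor(channel_relation, duration_s, centroid_hz, genre=None,
                      ref_duration_s=0.5, ref_centroid_hz=1000.0):
    """Scale factor > 1 enlarges regions R1/R2, < 1 shrinks them."""
    factor = 0.5 + _clamp(channel_relation, 0.0, 1.0)            # larger -> bigger
    factor *= _clamp(duration_s / ref_duration_s, 0.5, 2.0)      # longer -> bigger
    factor *= _clamp(ref_centroid_hz / max(centroid_hz, 1e-9), 0.5, 2.0)  # lower -> bigger
    factor *= GENRE_FACTOR.get(genre, 1.0)
    return factor


def correct_threshold(base, factor):
    """Move a threshold along both the time and amplitude axes."""
    return Threshold(base.time_ms * factor, base.amplitude * factor)
```

Applying the same factor to TH1 and TH2 moves both thresholds in the same direction on both axes, which is the behaviour described in the text; a real design could of course weight the two axes or the two thresholds differently.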

そして、決定部２３は、補正後の閾値範囲に基づいて、後段の生成部２４で用いられるフィルタのゲインを決定する。具体的には、決定部２３は、フィルタにより生成される疑似残響音が補正後の閾値範囲に収まるようにゲインを決定する。 Then, the determination unit 23 determines the gain of the filter to be used in the subsequent generation unit 24 based on the corrected threshold range. Specifically, the determination unit 23 determines the gain so that the pseudo reverberation sound generated by the filter falls within the corrected threshold range.

つまり、決定部２３は、領域Ｒ１および領域Ｒ２が大きくなるように補正された場合には、疑似残響音の時間および振幅が大きくなるようにゲインを決定する。一方、決定部２３は、領域Ｒ１および領域Ｒ２が小さくなるように補正された場合には、疑似残響音の時間および振幅が小さくなるようにゲインを決定する。 In other words, when regions R1 and R2 are corrected to be larger, the determination unit 23 determines the gain so that the time and amplitude of the artificial reverberation sound are larger. On the other hand, when regions R1 and R2 are corrected to be smaller, the determination unit 23 determines the gain so that the time and amplitude of the artificial reverberation sound are smaller.

このように、決定部２３は、特徴情報および音源情報に基づいて補正した第１波面の法則の閾値範囲からゲインを決定することで、最適な疑似残響音を生成するためのゲインを決定することができる。 In this way, the determination unit 23 can determine the gain for generating the optimal pseudo-reverberation sound by determining the gain from the threshold range of the first wave front law corrected based on the feature information and sound source information.
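As a rough illustration of choosing a gain so that the pseudo-reverberation stays inside the corrected range, the sketch below scales the filter so that its largest tap does not exceed an amplitude ceiling. This is a simplification and an assumption: the text does not define how the fit to region R2 is evaluated, and the function name `gain_for_region` and the parameters `amplitude_ceiling` and `max_gain` are hypothetical.

```python
def gain_for_region(taps, amplitude_ceiling, max_gain=1.0):
    """Largest gain (capped at max_gain) keeping every scaled tap under the ceiling.

    taps: unscaled impulse response of the reverberation filter.
    amplitude_ceiling: assumed upper amplitude bound of the corrected region.
    """
    peak = max((abs(tap) for tap in taps), default=0.0)
    if peak == 0.0:
        return max_gain  # a silent filter fits any region
    return min(max_gain, amplitude_ceiling / peak)
```

With this rule a larger corrected region (a higher ceiling) yields a larger gain and a smaller region yields a smaller gain, in line with the behaviour described for the determination unit 23.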

生成部２４は、決定部２３によって決定されたゲインが設定されたフィルタを用いて、音源信号から疑似残響音を示す残響信号を生成する。フィルタは、例えば、ＦＩＲ（Finite Impulse Response）フィルタや、ＩＩＲ（Infinite Impulse Response）フィルタ等のインパルス応答性のフィルタを用いることができる。 The generator 24 generates a reverberation signal representing a pseudo-reverberation sound from the sound source signal using a filter having a gain determined by the determiner 23. The filter may be, for example, an impulse response filter such as an FIR (Finite Impulse Response) filter or an IIR (Infinite Impulse Response) filter.
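The generation step can be sketched with a direct-form FIR filter whose output is scaled by the determined gain. The tap values themselves are not given in the text, so any taps passed in are placeholders, and the function name `generate_reverb` is hypothetical.

```python
def generate_reverb(source, taps, gain):
    """Convolve one channel with FIR taps and scale the result by gain."""
    output = [0.0] * len(source)
    for n in range(len(source)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * source[n - k]  # delayed copy of the input
        output[n] = gain * acc
    return output
```

Feeding a unit impulse through the function returns the taps multiplied by the gain, which makes it easy to check that a gain change scales the whole reverberation tail uniformly.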

出力部２５は、取得部２１から入力される音源信号および生成部２４から入力される残響信号に所定の処理を施してスピーカＦＬ，ＦＲ，ＲＬ，ＲＲから出力する。 The output unit 25 performs predetermined processing on the sound source signal input from the acquisition unit 21 and the reverberation signal input from the generation unit 24, and outputs the results from the speakers FL, FR, RL, and RR.

具体的には、出力部２５は、音源信号をＤ／Ａ変換し、Ｄ／Ａ変換後の音源信号を増幅してスピーカＦＬ，ＦＲから出力する。また、出力部２５は、残響信号をＤ／Ａ変換し、Ｄ／Ａ変換後の残響信号を増幅してスピーカＲＬ，ＲＲから出力する。 Specifically, the output unit 25 performs D/A conversion on the sound source signal, amplifies the D/A converted sound source signal, and outputs it from the speakers FL and FR. The output unit 25 also performs D/A conversion on the reverberation signal, amplifies the D/A converted reverberation signal, and outputs it from the speakers RL and RR.

次に、図４を用いて、実施形態に係る音響装置１において実行される処理の手順について説明する。図４は、実施形態に係る音響装置１によって実行される処理の処理手順を示すフローチャートである。 Next, the procedure of the process executed by the audio device 1 according to the embodiment will be described with reference to FIG. 4. FIG. 4 is a flowchart showing the procedure of the process executed by the audio device 1 according to the embodiment.

図４に示すように、まず、取得部２１は、音源装置１００から音源信号を取得する（ステップＳ１０１）。つづいて、取得部２１は、音源信号に関する音源情報を取得する（ステップＳ１０２）。 As shown in FIG. 4, first, the acquisition unit 21 acquires a sound source signal from the sound source device 100 (step S101). Next, the acquisition unit 21 acquires sound source information related to the sound source signal (step S102).

つづいて、検出部２２は、取得部２１によって取得された音源信号および音源情報に基づいて、音源信号に含まれる音響成分の特徴である特徴情報を検出する（ステップＳ１０３）。 Next, the detection unit 22 detects feature information that is a feature of the acoustic components contained in the sound source signal based on the sound source signal and sound source information acquired by the acquisition unit 21 (step S103).

つづいて、決定部２３は、検出部２２によって検出された特徴情報に基づいて、第１波面の法則における閾値範囲を決定する（ステップＳ１０４）。 Next, the determination unit 23 determines the threshold range under the law of the first wavefront based on the feature information detected by the detection unit 22 (step S104).

つづいて、決定部２３は、決定した閾値範囲に基づいて、疑似残響音が閾値範囲に収まるようフィルタ（生成部２４）のゲインを決定する（ステップＳ１０５）。 Next, the determination unit 23 determines the gain of the filter (generation unit 24) based on the determined threshold range so that the pseudo reverberation sound falls within the threshold range (step S105).
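A minimal sketch of step S105 follows, under the assumption that the threshold range is expressed as a lower and an upper bound, in decibels, on the level of the pseudo reverberation sound relative to the sound source. The function `determine_gain` and its parameters are hypothetical; the embodiment does not state the units or the calculation.

```python
import math
from typing import Tuple


def determine_gain(source_rms: float, reverb_rms_at_unit_gain: float,
                   range_db: Tuple[float, float]) -> float:
    """Return the linear gain that brings the relative reverberation level into range_db."""
    lo_db, hi_db = range_db
    # Relative level of the reverberation when the filter gain is 1.0.
    rel_db = 20.0 * math.log10(reverb_rms_at_unit_gain / source_rms)
    # Clamp the level into the threshold range; leave it unchanged if already inside.
    target_db = min(max(rel_db, lo_db), hi_db)
    return 10.0 ** ((target_db - rel_db) / 20.0)
```

If the reverberation is as loud as the source (0 dB) and the range is -12 dB to -6 dB, the function returns the gain that attenuates it by 6 dB; if the level already lies inside the range, the gain is 1.0.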

つづいて、生成部２４は、決定したゲインが設定されたフィルタを用いて、疑似残響音を示す残響信号を生成する（ステップＳ１０６）。 Next, the generation unit 24 generates a reverberation signal representing the pseudo reverberation sound, using the filter to which the determined gain has been set (step S106).

つづいて、出力部２５は、生成部２４によって生成された残響信号および取得部２１によって取得された音源信号を複数のスピーカＦＬ，ＦＲ，ＲＬ，ＲＲを介して出力することでサラウンド再生し（ステップＳ１０７）、処理を終了する。 Then, the output unit 25 performs surround playback by outputting the reverberation signal generated by the generation unit 24 and the sound source signal acquired by the acquisition unit 21 through the multiple speakers FL, FR, RL, and RR (step S107), and ends the process.
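The order of steps S101 to S107 can be summarized in the following skeleton. Each step is passed in as a callable so that the sketch fixes only the sequence and the data handed from one step to the next; all names, and the stub values used in the example call, are assumptions for illustration.

```python
from typing import Any, Callable, List, Tuple

Signal = List[float]


def run_surround_process(
    acquire_signal: Callable[[], Signal],
    acquire_info: Callable[[], Any],
    detect_features: Callable[[Signal, Any], Any],
    decide_range: Callable[[Any], Tuple[float, float]],
    decide_gain: Callable[[Tuple[float, float]], float],
    generate_reverb: Callable[[Signal, float], Signal],
    output: Callable[[Signal, Signal], Any],
) -> Any:
    source = acquire_signal()                 # S101: acquire the sound source signal
    info = acquire_info()                     # S102: acquire the sound source information
    features = detect_features(source, info)  # S103: detect the feature information
    threshold = decide_range(features)        # S104: determine the threshold range
    gain = decide_gain(threshold)             # S105: determine the filter gain
    reverb = generate_reverb(source, gain)    # S106: generate the reverberation signal
    return output(source, reverb)             # S107: output for surround reproduction


# Example call with placeholder steps (all values hypothetical).
result = run_surround_process(
    acquire_signal=lambda: [1.0, 0.5],
    acquire_info=lambda: {"genre": "unknown"},
    detect_features=lambda sig, info: {"peak": max(abs(s) for s in sig)},
    decide_range=lambda feat: (-12.0, -6.0),
    decide_gain=lambda rng: 10.0 ** (rng[1] / 20.0),
    generate_reverb=lambda sig, g: [g * s for s in sig],
    output=lambda sig, rev: {"front": sig, "rear": rev},
)
```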

上述してきたように、実施形態に係る音響装置１は、検出部２２と、決定部２３と、生成部２４とを備える。検出部２２は、音源信号に基づいて、音源信号に含まれる音響成分の特徴である特徴情報を検出する。決定部２３は、特徴情報に基づいて、音源信号から疑似的な残響音を生成するためのフィルタのゲインを決定する。生成部２４は、決定部２３によって決定されたゲインが設定されたフィルタを用いて、音源信号から残響音を示す残響信号を生成する。これにより、高音質なサラウンド再生を行うことができる。 As described above, the audio device 1 according to the embodiment includes a detection unit 22, a determination unit 23, and a generation unit 24. The detection unit 22 detects, based on the sound source signal, feature information that is a feature of the acoustic components contained in the sound source signal. The determination unit 23 determines, based on the feature information, the gain of a filter for generating a pseudo reverberation sound from the sound source signal. The generation unit 24 generates a reverberation signal representing the reverberation sound from the sound source signal, using the filter to which the gain determined by the determination unit 23 has been set. This enables high-quality surround reproduction.

さらなる効果や変形例は、当業者によって容易に導き出すことができる。このため、本発明のより広範な態様は、以上のように表しかつ記述した特定の詳細および代表的な実施形態に限定されるものではない。したがって、添付の特許請求の範囲およびその均等物によって定義される総括的な発明の概念の精神または範囲から逸脱することなく、様々な変更が可能である。 Further advantages and modifications may readily occur to those skilled in the art. Therefore, the invention in its broader aspects is not limited to the specific details and representative embodiments shown and described above. Accordingly, various modifications may be made without departing from the spirit or scope of the general inventive concept as defined by the appended claims and equivalents thereof.

１音響装置
２制御部
３記憶部
２１取得部
２２検出部
２３決定部
２４生成部
２５出力部
１００音源装置
ＦＬ、ＦＲ、ＲＬ、ＲＲスピーカ
Ｌリスナ
Ｓ音源

Reference Signs List
1 Audio device
2 Control unit
3 Storage unit
21 Acquisition unit
22 Detection unit
23 Determination unit
24 Generation unit
25 Output unit
100 Sound source device
FL, FR, RL, RR Speaker
L Listener
S Sound source

Claims

1. An audio device that:
detects, based on a sound source signal, feature information that is a feature of an acoustic component included in the sound source signal;
corrects, based on the feature information, a threshold range obtained from the law of the first wavefront;
determines, based on the corrected threshold range, a gain of a filter for generating a pseudo reverberation sound from the sound source signal; and
sets the determined gain as the gain of the filter.

2. The audio device according to claim 1, wherein the sound source signal is a stereo signal consisting of reproduction signals for two channels, and channel relationship information relating to a relationship between the sound source signals reproduced on the two channels is detected as the feature information.

3. The audio device according to claim 2, wherein the channel relationship information is detected continuously at regular intervals, and a moving average calculated over the time series of detected channel relationship information is detected as the feature information.

4. The audio device according to claim 3, wherein the moving average is calculated with the channel relationship information set to its maximum value in those sections of the time series in which the sound pressure level of the sound source signal is less than a predetermined value.

5. The audio device according to claim 1, wherein information regarding a duration and a frequency of the acoustic component is detected as the feature information.

6. The audio device according to claim 1, wherein sound source information relating to the sound source signal is acquired, and the gain of the filter is determined based on the feature information and the sound source information.

7. An audio control method comprising:
a detection step of detecting, based on a sound source signal, feature information that is a feature of an acoustic component included in the sound source signal;
a correction step of correcting, based on the feature information, a threshold range obtained from the law of the first wavefront;
a determination step of determining, based on the corrected threshold range, a gain of a filter for generating a pseudo reverberation sound from the sound source signal;
a setting step of setting the gain determined in the determination step as the gain of the filter; and
a generation step of generating a reverberation signal representing the reverberation sound from the sound source signal, using the filter whose gain has been set in the setting step.