JP6314803B2

JP6314803B2 - Signal processing apparatus, signal processing method, and program

Info

Publication number: JP6314803B2
Application number: JP2014239051A
Authority: JP
Inventors: 中村　理; 理中村; 金章藤下
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2014-11-26
Filing date: 2014-11-26
Publication date: 2018-04-25
Anticipated expiration: 2034-11-26
Also published as: JP2016099606A

Description

本開示は、信号処理装置、信号処理方法及びプログラムに関する。 The present disclosure relates to a signal processing device, a signal processing method, and a program.

近年、音響信号から特定の音を抑制するための信号処理装置が開発されている。一例として、楽曲からボーカルを抑制して再生する、いわゆるカラオケ機能を実現する信号処理装置が多く開発されている。ボーカルの抑制技術においては、ボーカルが定位する位置が一般的に中央であることに着目してボーカルを抑制することが基本的な方針となっている。詳しくは、楽曲の多くはボーカルが中央に定位するように作成されているので、結果的に、ボーカルが左チャネルと右チャネルに同じように録音されている。このため、ステレオ信号の両チャネルの信号で差分をとると、両チャネルに同じように録音されているボーカルが抑制される。ただし、このようなボーカル抑制技術においては、聴覚上のノイズが生じる場合があるため、ノイズを低減するための技術が求められている。 In recent years, signal processing apparatuses for suppressing specific sounds from acoustic signals have been developed. As an example, many signal processing apparatuses that realize a so-called karaoke function that reproduces music while suppressing vocals have been developed. In the vocal suppression technology, it is a basic policy to suppress vocals by focusing on the fact that the position where the vocals are localized is generally in the center. Specifically, since many of the music pieces are created so that the vocal is localized in the center, as a result, the vocal is recorded in the left channel and the right channel in the same way. For this reason, if a difference is taken between the signals of both channels of the stereo signal, vocals recorded in the same way on both channels are suppressed. However, in such vocal suppression technology, there is a case where auditory noise may occur, and thus a technology for reducing noise is required.

例えば、下記特許文献１では、一旦音響信号を周波数領域で表現した上で、ボーカルを抑制するための差分計算を周波数領域で行い、信号レベルが低い周波数帯域を元の音響信号により補完する技術が開示されている。 For example, Patent Document 1 below discloses a technique in which an acoustic signal is first expressed in the frequency domain, the difference calculation for suppressing vocals is performed in the frequency domain, and frequency bands with a low signal level are complemented with the original acoustic signal.

特許第５３６５３８０号公報Japanese Patent No. 5365380

しかし、上記特許文献１に記載された技術では、ノイズを低減する代償としてボーカルを抑制する性能が低下していた。詳しくは、信号レベルが低い周波数帯域が、ボーカルを含む元の音響信号により補完されてしまっていた。 However, in the technique described in Patent Document 1, the performance of suppressing vocals has been reduced as a price for reducing noise. Specifically, the frequency band with a low signal level has been supplemented by the original acoustic signal including vocals.

そこで、本開示では、音響信号から特定の音を抑制することと聴覚上のノイズを低減することとを両立することが可能な、新規かつ改良された信号処理装置、信号処理方法及びプログラムを提案する。 Therefore, the present disclosure proposes a new and improved signal processing apparatus, signal processing method, and program capable of both suppressing a specific sound in an acoustic signal and reducing auditory noise.

本開示によれば、入力された音響信号を形成する第１のチャネルの音響信号及び第２のチャネルの音響信号の差分信号を計算する差分信号計算部と、前記差分信号計算部により計算された前記差分信号に前記差分信号を処理した信号を加算する処理部と、を備える信号処理装置が提供される。 According to the present disclosure, there is provided a signal processing apparatus including: a difference signal calculation unit that calculates a difference signal between a first-channel acoustic signal and a second-channel acoustic signal that form an input acoustic signal; and a processing unit that adds, to the difference signal calculated by the difference signal calculation unit, a signal obtained by processing the difference signal.

また、本開示によれば、入力された音響信号を形成する第１のチャネルの音響信号及び第２のチャネルの音響信号の差分信号を計算することと、計算された前記差分信号に前記差分信号を処理した信号をプロセッサにより加算することと、を含む信号処理方法が提供される。 In addition, according to the present disclosure, there is provided a signal processing method including: calculating a difference signal between a first-channel acoustic signal and a second-channel acoustic signal that form an input acoustic signal; and adding, by a processor, a signal obtained by processing the difference signal to the calculated difference signal.

また、本開示によれば、コンピュータを、入力された音響信号を形成する第１のチャネルの音響信号及び第２のチャネルの音響信号の差分信号を計算する差分信号計算部と、前記差分信号計算部により計算された前記差分信号に前記差分信号を処理した信号を加算する処理部と、として機能させるためのプログラムが提供される。 In addition, according to the present disclosure, there is provided a program for causing a computer to function as: a difference signal calculation unit that calculates a difference signal between a first-channel acoustic signal and a second-channel acoustic signal that form an input acoustic signal; and a processing unit that adds, to the difference signal calculated by the difference signal calculation unit, a signal obtained by processing the difference signal.

以上説明したように本開示によれば、音響信号から特定の音を抑制することと聴覚上のノイズを低減することとを両立することが可能である。なお、上記の効果は必ずしも限定的なものではなく、上記の効果とともに、または上記の効果に代えて、本明細書に示されたいずれかの効果、または本明細書から把握され得る他の効果が奏されてもよい。 As described above, according to the present disclosure, it is possible to both suppress a specific sound in an acoustic signal and reduce auditory noise. Note that the above effect is not necessarily limiting; together with or in place of the above effect, any of the effects described in this specification, or other effects that can be understood from this specification, may be achieved.

第１の実施形態に係る信号処理装置の論理的な構成の一例を示すブロック図である。It is a block diagram which shows an example of the logical structure of the signal processing apparatus which concerns on 1st Embodiment. 第１の実施形態に係る差分信号計算部のシグナルフローの一例を示す図である。It is a figure which shows an example of the signal flow of the difference signal calculation part which concerns on 1st Embodiment. 第１の実施形態に係る差分信号計算部のシグナルフローの一例を示す図である。It is a figure which shows an example of the signal flow of the difference signal calculation part which concerns on 1st Embodiment. 第１の実施形態に係る差分信号計算部のシグナルフローの一例を示す図である。It is a figure which shows an example of the signal flow of the difference signal calculation part which concerns on 1st Embodiment. 第１の実施形態に係る差分信号計算部のシグナルフローの一例を示す図である。It is a figure which shows an example of the signal flow of the difference signal calculation part which concerns on 1st Embodiment. 第１の実施形態に係るぼかし処理部のシグナルフローの一例を示す図である。It is a figure which shows an example of the signal flow of the blurring process part which concerns on 1st Embodiment. 第１の実施形態に係るぼかし処理部のシグナルフローの一例を示す図である。It is a figure which shows an example of the signal flow of the blurring process part which concerns on 1st Embodiment. 第１の実施形態に係る信号処理装置において実行される信号処理の流れの一例を示すフローチャートである。It is a flowchart which shows an example of the flow of the signal processing performed in the signal processing apparatus which concerns on 1st Embodiment. 第１の実施形態に係るぼかし処理部において実行される遅延バッファＤＢの更新処理の流れの一例を示すフローチャートである。It is a flowchart which shows an example of the flow of the update process of delay buffer DB performed in the blurring process part which concerns on 1st Embodiment. 第１の比較例に係る信号処理を説明するための図である。It is a figure for demonstrating the signal processing which concerns on a 1st comparative example. 第１の比較例に係る信号処理を説明するための図である。It is a figure for demonstrating the signal processing which concerns on a 1st comparative example. 第１の比較例に係る信号処理を説明するための図である。It is a figure for demonstrating the signal processing which concerns on a 1st comparative example. 
第２の比較例に係る信号処理を説明するための図である。It is a figure for demonstrating the signal processing which concerns on a 2nd comparative example. 第２の比較例に係る信号処理を説明するための図である。It is a figure for demonstrating the signal processing which concerns on a 2nd comparative example. 第１の実施形態に係る信号処理装置の効果を説明するための図である。It is a figure for demonstrating the effect of the signal processing apparatus which concerns on 1st Embodiment. 第１の実施形態に係る信号処理装置の論理的な構成の一例を示すブロック図である。It is a block diagram which shows an example of the logical structure of the signal processing apparatus which concerns on 1st Embodiment. 第１の実施形態に係る信号処理装置の論理的な構成の一例を示すブロック図である。It is a block diagram which shows an example of the logical structure of the signal processing apparatus which concerns on 1st Embodiment. 入力された音響信号がモノラルに近い度合を説明するための図である。It is a figure for demonstrating the degree to which the input acoustic signal is near monaural. 第１の実施形態に係る信号処理装置の論理的な構成の一例を示すブロック図である。It is a block diagram which shows an example of the logical structure of the signal processing apparatus which concerns on 1st Embodiment. 第２の実施形態に係る信号処理装置の論理的な構成の一例を示すブロック図である。It is a block diagram which shows an example of the logical structure of the signal processing apparatus which concerns on 2nd Embodiment. 第２の実施形態に係る信号処理装置において実行される信号処理の流れの一例を示すフローチャートである。It is a flowchart which shows an example of the flow of the signal processing performed in the signal processing apparatus which concerns on 2nd Embodiment. 第２の実施形態に係る信号処理装置の効果を説明するための図である。It is a figure for demonstrating the effect of the signal processing apparatus which concerns on 2nd Embodiment. 第２の実施形態に係る信号処理装置の論理的な構成の一例を示すブロック図である。It is a block diagram which shows an example of the logical structure of the signal processing apparatus which concerns on 2nd Embodiment. 第３の実施形態に係る信号処理装置の論理的な構成の一例を示すブロック図である。It is a block diagram which shows an example of the logical structure of the signal processing apparatus which concerns on 3rd Embodiment. 
第３の実施形態に係る信号処理装置において実行される信号処理の流れの一例を示すフローチャートである。It is a flowchart which shows an example of the flow of the signal processing performed in the signal processing apparatus which concerns on 3rd Embodiment. 第４の実施形態に係る信号処理装置の論理的な構成の一例を示すブロック図である。It is a block diagram which shows an example of the logical structure of the signal processing apparatus which concerns on 4th Embodiment. 第４の実施形態に係るぼかし処理部のシグナルフローの一例を示す図である。It is a figure which shows an example of the signal flow of the blurring process part which concerns on 4th Embodiment. 本実施形態に係る情報処理装置のハードウェア構成の一例を示すブロック図である。It is a block diagram which shows an example of the hardware constitutions of the information processing apparatus which concerns on this embodiment.

以下に添付図面を参照しながら、本開示の好適な実施の形態について詳細に説明する。なお、本明細書及び図面において、実質的に同一の機能構成を有する構成要素については、同一の符号を付することにより重複説明を省略する。 Hereinafter, preferred embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. In this specification and the drawings, components having substantially the same functional configuration are denoted by the same reference numerals, and redundant description of them is omitted.

なお、説明は以下の順序で行うものとする。
１．概要
２．第１の実施形態
２−１．第１の構成例
２−２．動作処理例
２−３．効果
２−４．第２の構成例
２−５．第３の構成例
２−６．第４の構成例
３．第２の実施形態
３−１．第１の構成例
３−２．動作処理例
３−３．効果
３−４．第２の構成例
４．第３の実施形態
４−１．構成例
４−２．動作処理例
５．第４の実施形態
６．ハードウェア構成
７．まとめ The description will be made in the following order.
1. Overview
2. First embodiment
2-1. First configuration example
2-2. Operation processing example
2-3. Effect
2-4. Second configuration example
2-5. Third configuration example
2-6. Fourth configuration example
3. Second embodiment
3-1. First configuration example
3-2. Operation processing example
3-3. Effect
3-4. Second configuration example
4. Third embodiment
4-1. Configuration example
4-2. Operation processing example
5. Fourth embodiment
6. Hardware configuration
7. Summary

＜１．概要＞
まず、本開示の一実施形態に係る信号処理装置の概要について説明する。 <1. Overview>
First, an overview of a signal processing device according to an embodiment of the present disclosure will be described.

本実施形態に係る信号処理装置は、入力された音響信号から特定音を抑制する信号処理を行う。抑制される特定音は、例えば音響信号の中央に定位する音であってもよい。そのような特定音としては、例えばボーカルや、ベース系の音が挙げられる。以下では、一例として、本実施形態に係る信号処理装置１００が抑制する特定音はボーカルであるものとして説明する。また、特定音を抑制する処理を、以下ではぼかし（blur）処理とも称する。 The signal processing apparatus according to the present embodiment performs signal processing that suppresses a specific sound from an input acoustic signal. The specific sound to be suppressed may be, for example, a sound localized at the center of the acoustic signal. Examples of such specific sounds include vocals and bass sounds. Hereinafter, as an example, the specific sound suppressed by the signal processing apparatus 100 according to the present embodiment will be described as being vocal. In addition, the process for suppressing the specific sound is hereinafter referred to as a blur process.

本実施形態に係る信号処理装置は、まず、差分信号を生成することで、音響信号の中央に定位する特定音を抑制する。続いて、本実施形態に係る信号処理装置は、ぼかし処理を行うことにより、差分信号の生成過程で生じた聴覚ノイズを低減する。 The signal processing apparatus according to the present embodiment first suppresses a specific sound localized at the center of the acoustic signal by generating a differential signal. Subsequently, the signal processing apparatus according to the present embodiment reduces the auditory noise generated in the process of generating the difference signal by performing the blurring process.

以下、図１〜図２８を参照して、本実施形態について詳細に説明する。 Hereinafter, this embodiment will be described in detail with reference to FIGS. 1 to 28.

＜２．第１の実施形態＞
［２−１．第１の構成例］
図１は、本実施形態に係る信号処理装置１００の論理的な構成の一例を示すブロック図である。図１に示す構成例を、以下では第１の構成例とも称する。図１に示すように、本構成例に係る信号処理装置１００は、差分信号計算部１１０及びぼかし処理部１２０を有する。信号処理装置１００は、入力されたオーディオ信号（音響信号）に信号処理を施して、処理後の音響信号を出力する。 <2. First Embodiment>
[2-1. First Configuration Example]
FIG. 1 is a block diagram illustrating an example of a logical configuration of the signal processing apparatus 100 according to the present embodiment. Hereinafter, the configuration example illustrated in FIG. 1 is also referred to as a first configuration example. As illustrated in FIG. 1, the signal processing device 100 according to this configuration example includes a differential signal calculation unit 110 and a blur processing unit 120. The signal processing apparatus 100 performs signal processing on the input audio signal (acoustic signal) and outputs the processed acoustic signal.

（１）差分信号計算部１１０
差分信号計算部１１０は、入力された音響信号を形成する第１のチャネルの音響信号及び第２のチャネルの音響信号の差分信号を計算する機能を有する。例えば、入力された音響信号はステレオ信号であり、第１のチャネルの音響信号は左チャネルの音響信号であり、第２のチャネルの音響信号は右チャネルの音響信号である。以下では、左チャネルの音響信号をＬｃｈとも称し、右チャネルの音響信号をＲｃｈとも称する。差分信号計算部１１０から出力される音響信号は、ステレオ信号であってもよい。以下では、出力される左チャネルの音響信号をＬ´ｃｈとも称し、出力される右チャネルの音響信号をＲ´ｃｈとも称する。 (1) Difference signal calculation unit 110
The differential signal calculation unit 110 has a function of calculating a differential signal between the acoustic signal of the first channel and the acoustic signal of the second channel that form the input acoustic signal. For example, the input acoustic signal is a stereo signal, the first channel acoustic signal is a left channel acoustic signal, and the second channel acoustic signal is a right channel acoustic signal. Hereinafter, the acoustic signal of the left channel is also referred to as Lch, and the acoustic signal of the right channel is also referred to as Rch. The acoustic signal output from the difference signal calculation unit 110 may be a stereo signal. Hereinafter, the output left channel acoustic signal is also referred to as L'ch, and the output right channel acoustic signal is also referred to as R'ch.

差分信号計算部１１０は、時間領域で差分信号を計算する。例えば、差分信号計算部１１０は、時間領域の信号であるＲｃｈの信号とＬｃｈの信号との差分をとることで、差分信号を計算する。以下、図２〜図５を参照して、時間領域で差分信号を計算するための、差分信号計算部１１０のシグナルフローの一例を説明する。 The difference signal calculation unit 110 calculates the difference signal in the time domain. For example, the difference signal calculation unit 110 calculates the difference signal by taking the difference between the Rch signal and the Lch signal, both of which are time domain signals. Hereinafter, examples of the signal flow of the difference signal calculation unit 110 for calculating the difference signal in the time domain will be described with reference to FIGS. 2 to 5.

図２は、本実施形態に係る差分信号計算部１１０のシグナルフローの一例を示す図である。図２に示す例では、差分信号計算部１１０は、ＬｃｈからＲｃｈを減算し、０．５倍することで、差分信号Ｓ（ｉ）を得る。本シグナルフローは、以下の数式で表現される。
Ｓ（ｉ）＝（Ｌ（ｉ）−Ｒ（ｉ））×０．５（数式１） FIG. 2 is a diagram illustrating an example of a signal flow of the difference signal calculation unit 110 according to the present embodiment. In the example illustrated in FIG. 2, the difference signal calculation unit 110 obtains the difference signal S (i) by subtracting Rch from Lch and multiplying by 0.5. This signal flow is expressed by the following mathematical formula.
S (i) = (L (i) −R (i)) × 0.5 (Formula 1)

ここで、Ｌ（ｉ）はＬｃｈの信号であり、Ｒ（ｉ）はＲｃｈの信号である。ｉはサンプル時刻を表す。差分信号計算部１１０は、処理前後の信号レベルを保つ目的で減算後の信号を０．５倍している。 Here, L (i) is an Lch signal, and R (i) is an Rch signal. i represents the sample time. The difference signal calculation unit 110 multiplies the signal after subtraction by 0.5 for the purpose of maintaining the signal level before and after processing.
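Formula 1 can be illustrated with a minimal Python sketch. The function name `difference_signal` and the toy sample values are hypothetical and chosen only for illustration; the sketch assumes both channels are plain lists of float samples of equal length.

```python
def difference_signal(left, right):
    """Formula 1: S(i) = (L(i) - R(i)) * 0.5 for every sample index i."""
    if len(left) != len(right):
        raise ValueError("left and right must have the same number of samples")
    return [(l - r) * 0.5 for l, r in zip(left, right)]


# Hypothetical toy input: a "vocal" recorded identically on both channels,
# plus an "instrument" that exists only on the left channel.
vocal = [0.5, -0.25, 0.125]
instrument = [0.25, 0.5, -0.5]
left = [v + g for v, g in zip(vocal, instrument)]
right = list(vocal)

s = difference_signal(left, right)
# The centre-localized vocal cancels; only the instrument (scaled by 0.5) remains.
print(s)
```

In this toy case the component common to both channels disappears entirely, which is the behaviour the text attributes to sounds localized at the center.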

図３は、本実施形態に係る差分信号計算部１１０のシグナルフローの一例を示す図である。図３に示す例では、差分信号計算部１１０は、図２に示した例と同様にして差分信号Ｓ（ｉ）を計算する。そして、差分信号計算部１１０はＬ´ｃｈ及びＲ´ｃｈで同じ差分信号Ｓ（ｉ）を出力する。本シグナルフローによる出力信号は、実質的にモノラル信号と同等である。 FIG. 3 is a diagram illustrating an example of a signal flow of the difference signal calculation unit 110 according to the present embodiment. In the example shown in FIG. 3, the difference signal calculation unit 110 calculates the difference signal S (i) in the same manner as in the example shown in FIG. 2. Then, the difference signal calculation unit 110 outputs the same difference signal S (i) on both L'ch and R'ch. The output signal of this signal flow is substantially equivalent to a monaural signal.

図４は、本実施形態に係る差分信号計算部１１０のシグナルフローの一例を示す図である。図４に示す例では、差分信号計算部１１０は、図２に示した例と同様にして差分信号Ｓ（ｉ）を計算する。そして、差分信号計算部１１０は、Ｌ’ｃｈの位相を反転したものをＲ’ｃｈとして出力する。本シグナルフローによる出力信号は、図３に示した例と比較して、ユーザに広がり感を感じさせることが可能である。ただし、本シグナルフローによる出力信号は、本位相が反転したことに起因する違和感をユーザに与え得る。 FIG. 4 is a diagram illustrating an example of a signal flow of the difference signal calculation unit 110 according to the present embodiment. In the example shown in FIG. 4, the difference signal calculation unit 110 calculates the difference signal S (i) in the same manner as in the example shown in FIG. 2. Then, the difference signal calculation unit 110 outputs, as R'ch, a signal obtained by inverting the phase of L'ch. Compared with the example shown in FIG. 3, the output signal of this signal flow can give the user a sense of spaciousness. However, it may also give the user a sense of unnaturalness caused by the phase inversion.

図５は、本実施形態に係る差分信号計算部１１０のシグナルフローの一例を示す図である。図５に示す例では、差分信号計算部１１０は、まず、入力信号のＬｃｈとＲｃｈを加算してモノラル化することで中央に定位するボーカルを抽出する。次に、差分信号計算部１１０は、モノラル化した信号を０．５倍して信号レベルを保ち、Ｌｃｈ及びＲｃｈの各々から減算することで、Ｌ’ｃｈ及びＲ’ｃｈを得る。本シグナルフローによる出力信号は、図４に示した例と同様である。 FIG. 5 is a diagram illustrating an example of a signal flow of the difference signal calculation unit 110 according to the present embodiment. In the example shown in FIG. 5, the differential signal calculation unit 110 first extracts a vocal localized at the center by adding the Lch and Rch of the input signal to make it monaural. Next, the difference signal calculation unit 110 multiplies the monaural signal by 0.5, maintains the signal level, and subtracts from each of Lch and Rch to obtain L′ ch and R′ch. The output signal by this signal flow is the same as the example shown in FIG.
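The relationship between the signal flows of FIGS. 3 to 5 can be checked numerically with a short Python sketch. The function names are hypothetical, and the sketch assumes that L'ch in FIG. 4 carries S (i) itself, which the text implies but does not state explicitly.

```python
def outputs_fig3(left, right):
    """FIG. 3 style: the same difference signal on both output channels."""
    s = [(l - r) * 0.5 for l, r in zip(left, right)]
    return s, list(s)


def outputs_fig4(left, right):
    """FIG. 4 style (assumed): L'ch = S(i), R'ch = phase-inverted L'ch."""
    s = [(l - r) * 0.5 for l, r in zip(left, right)]
    return s, [-x for x in s]


def outputs_fig5(left, right):
    """FIG. 5 style: subtract the 0.5-scaled mono sum from each input channel."""
    mono = [(l + r) * 0.5 for l, r in zip(left, right)]
    out_l = [l - m for l, m in zip(left, mono)]
    out_r = [r - m for r, m in zip(right, mono)]
    return out_l, out_r


# Hypothetical sample values, exactly representable in binary floating point.
left = [0.75, 0.25, -0.375]
right = [0.5, -0.25, 0.125]

print(outputs_fig4(left, right))
print(outputs_fig5(left, right))
```

Algebraically, L − 0.5 (L + R) = 0.5 (L − R) and R − 0.5 (L + R) = −0.5 (L − R), which is why the FIG. 5 flow yields the same pair of outputs as the FIG. 4 flow.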

（２）ぼかし処理部１２０
ぼかし処理部１２０は、ぼかし処理を行う。詳しくは、ぼかし処理部１２０は、差分信号計算部１１０により計算された差分信号に、当該差分信号を処理した信号を加算する処理部としての機能を有する。差分信号を処理した信号は多様に考えられる。本実施形態に係るぼかし処理部１２０は、差分信号を処理した信号として、差分信号を遅延させた遅延信号を生成する。そして、ぼかし処理部１２０は、差分信号に生成した遅延信号を加算することで、出力信号を得る。なお、差分信号に遅延信号を加算する処理は、単純な加算であってもよいし、重み付け加算であってもよいし、いずれか一方の符号を反転させた上での加算（即ち、減算）であってもよい。以下では、ぼかし処理部１２０からの出力信号を、ぼかし信号Ｆ（ｉ）とも称する。 (2) Blur processing unit 120
The blur processing unit 120 performs blur processing. Specifically, the blur processing unit 120 functions as a processing unit that adds, to the difference signal calculated by the difference signal calculation unit 110, a signal obtained by processing that difference signal. The signal obtained by processing the difference signal can take various forms. The blur processing unit 120 according to the present embodiment generates, as the signal obtained by processing the difference signal, a delay signal obtained by delaying the difference signal. The blur processing unit 120 then obtains an output signal by adding the generated delay signal to the difference signal. Note that the process of adding the delay signal to the difference signal may be a simple addition, a weighted addition, or an addition performed after inverting the sign of either signal (that is, a subtraction). Hereinafter, the output signal from the blur processing unit 120 is also referred to as a blur signal F (i).

ぼかし処理部１２０は、ＩＩＲ（Infinite impulse response）フィルタを用いて遅延信号を生成してもよい。ここで、図６を参照して、ＩＩＲフィルタを用いて遅延信号を生成して、ぼかし信号Ｆ（ｉ）を得るためのシグナルフローを説明する。 The blurring processing unit 120 may generate a delay signal using an IIR (Infinite impulse response) filter. Here, with reference to FIG. 6, a signal flow for generating a delay signal using an IIR filter and obtaining a blur signal F (i) will be described.

図６は、本実施形態に係るぼかし処理部１２０のシグナルフローの一例を示す図である。図６に示すように、ぼかし処理部１２０は、遅延バッファＤＢ１２１に蓄積された遅延信号Ｄ（ｉ）を差分信号Ｓ（ｉ）に加算することで、ぼかし信号Ｆ（ｉ）を得る。遅延信号Ｄ（ｉ）は、ぼかし信号Ｆ（ｉ）がｎサンプル遅延した信号である。加算の際、ぼかし処理部１２０は、加算に係る重み付け係数ｒを用いて、差分信号Ｓ（ｉ）と遅延信号Ｄ（ｉ）とを重み付け加算する。重み付け係数ｒは、差分信号Ｓ（ｉ）及び遅延信号Ｄ（ｉ）の混合率であるとも捉えることが可能である。本シグナルフローは、以下の数式で表現される。
Ｆ（ｉ）＝（１−ｒ）×Ｓ（ｉ）＋ｒ×Ｄ（ｉ）（数式２） FIG. 6 is a diagram illustrating an example of a signal flow of the blur processing unit 120 according to the present embodiment. As illustrated in FIG. 6, the blurring processing unit 120 obtains the blurring signal F (i) by adding the delay signal D (i) accumulated in the delay buffer DB121 to the differential signal S (i). The delayed signal D (i) is a signal obtained by delaying the blur signal F (i) by n samples. At the time of addition, the blurring processing unit 120 weights and adds the difference signal S (i) and the delayed signal D (i) using the weighting coefficient r related to the addition. The weighting coefficient r can also be regarded as a mixing ratio of the differential signal S (i) and the delayed signal D (i). This signal flow is expressed by the following mathematical formula.
F (i) = (1−r) × S (i) + r × D (i) (Formula 2)

ここで、重み付け係数ｒは以下の範囲の値をとる。
０＜ｒ＜１（数式３） Here, the weighting coefficient r takes a value in the following range.
0 <r <1 (Formula 3)
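Formulas 2 and 3 can be sketched in Python as follows. This is an illustrative sketch rather than the implementation described in the publication: the function name `blur_iir` is hypothetical, the delay buffer is assumed to start filled with zeros, and a `collections.deque` stands in for the delay buffer DB 121.

```python
from collections import deque


def blur_iir(s, r, n):
    """Formula 2: F(i) = (1 - r) * S(i) + r * D(i), where D(i) = F(i - n).

    s: difference signal samples, r: weighting coefficient (Formula 3),
    n: delay in samples (size of the delay buffer).
    """
    if not 0.0 < r < 1.0:
        raise ValueError("r must satisfy 0 < r < 1 (Formula 3)")
    if n < 1:
        raise ValueError("n must be at least 1")
    # Assumption: the buffer initially holds silence (zeros).
    delay = deque([0.0] * n, maxlen=n)
    out = []
    for x in s:
        d = delay[0]               # oldest entry: blur signal from n samples ago
        f = (1.0 - r) * x + r * d  # weighted addition of S(i) and D(i)
        delay.append(f)            # store F(i); the oldest entry is discarded
        out.append(f)
    return out


# Impulse response for r = 0.5 and a 2-sample delay.
print(blur_iir([1.0, 0.0, 0.0, 0.0, 0.0], r=0.5, n=2))
```

Because the delayed signal is the fed-back output F rather than the input S, a single input impulse keeps reappearing every n samples with geometrically decreasing amplitude, which is the infinite impulse response behaviour.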

ぼかし処理部１２０は、ＦＩＲ（Finite impulse response）フィルタを用いて遅延信号を生成してもよい。ここで、図７を参照して、ＦＩＲフィルタを用いて遅延信号を生成して、ぼかし信号Ｆ（ｉ）を得るためのシグナルフローを説明する。 The blur processing unit 120 may generate the delay signal using an FIR (finite impulse response) filter. Here, a signal flow for generating the delay signal using an FIR filter and obtaining the blur signal F (i) will be described with reference to FIG. 7.

図７は、本実施形態に係るぼかし処理部１２０のシグナルフローの一例を示す図である。図７に示すように、ぼかし処理部１２０は、入力された信号を１サンプル遅延させる遅延器１２２をｍ個有し、差分信号Ｓ（ｉ）に最大ｍサンプル遅延した遅延信号までを重み付け加算することで、ぼかし信号Ｆ（ｉ）を得る。ここでの遅延信号は、差分信号Ｓ（ｉ）が遅延した信号である。本シグナルフローは、以下の数式で表現される。
Ｆ（ｉ）＝ｒ_０×Ｓ（ｉ）＋ｒ_１×Ｓ（ｉ−１）＋・・・＋ｒ_ｍ×Ｓ（ｉ−ｍ）
（数式４） FIG. 7 is a diagram illustrating an example of a signal flow of the blur processing unit 120 according to the present embodiment. As shown in FIG. 7, the blurring processing unit 120 has m delay units 122 that delay the input signal by one sample, and weights and adds up to a delay signal delayed by a maximum of m samples to the differential signal S (i). Thus, the blur signal F (i) is obtained. The delayed signal here is a signal obtained by delaying the differential signal S (i). This signal flow is expressed by the following mathematical formula.
F (i) = r_0 × S (i) + r_1 × S (i−1) + ... + r_m × S (i−m) (Formula 4)

ここで、Ｓ（ｉ−ｍ）はｍサンプル過去の差分信号を表す。また、重み付け係数ｒ_０〜ｒ_ｍは、それぞれ上記数式３を満たす。 Here, S (i−m) represents the difference signal m samples in the past. Each of the weighting coefficients r_0 to r_m satisfies Formula 3 above.
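Formula 4 can be sketched in Python as a direct weighted sum over past samples. The function name `blur_fir` is hypothetical, and the sketch assumes that samples before the start of the signal are zero, which the text does not specify.

```python
def blur_fir(s, weights):
    """Formula 4: F(i) = r_0*S(i) + r_1*S(i-1) + ... + r_m*S(i-m).

    s: difference signal samples, weights: [r_0, ..., r_m].
    """
    if not weights:
        raise ValueError("at least one weighting coefficient is required")
    if any(not 0.0 < w < 1.0 for w in weights):
        raise ValueError("each weight must satisfy 0 < r < 1 (Formula 3)")
    out = []
    for i in range(len(s)):
        acc = 0.0
        for k, w in enumerate(weights):
            if i - k >= 0:          # assumption: S is zero before the first sample
                acc += w * s[i - k]
        out.append(acc)
    return out


# Impulse response with m = 2: the output simply reproduces the weights.
print(blur_fir([1.0, 0.0, 0.0, 0.0], [0.5, 0.25, 0.25]))
```

Unlike the feedback structure of Formula 2, the response to an impulse ends after m samples, since only delayed copies of the input S are mixed in.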

なお、ぼかし処理部１２０は、ＩＩＲフィルタ又はＦＩＲフィルタのいずれか一方を用いてもよいし、両方を組み合わせて用いてもよいし、他の任意の方法で遅延信号を生成してもよい。 Note that the blurring processing unit 120 may use either the IIR filter or the FIR filter, may use both in combination, or may generate a delay signal by any other method.

以上、第１の構成例について説明した。続いて、本実施形態に係る信号処理装置１００の動作処理を説明する。 The first configuration example has been described above. Subsequently, an operation process of the signal processing apparatus 100 according to the present embodiment will be described.

［２−２．動作処理例］
図８は、本実施形態に係る信号処理装置１００において実行される信号処理の流れの一例を示すフローチャートである。なお、本フローチャートでは、ぼかし処理部１２０がＩＩＲフィルタを用いて遅延信号を生成する例を説明する。 [2-2. Operation processing example]
FIG. 8 is a flowchart showing an example of the flow of signal processing executed in the signal processing apparatus 100 according to the present embodiment. In this flowchart, an example in which the blur processing unit 120 generates a delay signal using an IIR filter will be described.

図８に示すように、まず、ステップＳ１０２で、差分信号計算部１１０は、ｉ番目のＬｃｈの信号Ｌ（ｉ）及びＲｃｈの信号Ｒ（ｉ）の入力を受け付ける。 As shown in FIG. 8, first, in step S102, the difference signal calculation unit 110 accepts input of an i-th Lch signal L (i) and an Rch signal R (i).

次いで、ステップＳ１０４で、差分信号計算部１１０は、差分信号Ｓ（ｉ）を計算する。例えば、差分信号計算部１１０は、上記数式１を用いて差分信号Ｓ（ｉ）を計算する。 Next, in step S104, the difference signal calculation unit 110 calculates the difference signal S (i). For example, the difference signal calculation unit 110 calculates the difference signal S (i) using the above Equation 1.

次に、ステップＳ１０６で、ぼかし処理部１２０は、差分信号Ｓ（ｉ）と遅延信号Ｄ（ｉ）からぼかし信号Ｆ（ｉ）を計算する。例えば、ぼかし処理部１２０は、上記数式２を用いてぼかし信号Ｆ（ｉ）を計算する。 Next, in step S106, the blurring processing unit 120 calculates the blurring signal F (i) from the difference signal S (i) and the delay signal D (i). For example, the blur processing unit 120 calculates the blur signal F (i) using Equation 2 above.

次いで、ステップＳ１０８で、ぼかし処理部１２０は、遅延バッファＤＢ１２１を更新する。本処理は後に詳しく説明するため、ここでの説明は省略する。 Next, in step S108, the blurring processing unit 120 updates the delay buffer DB 121. Since this process will be described in detail later, a description thereof is omitted here.

そして、ステップＳ１１０で、ぼかし処理部１２０は、計算したぼかし信号Ｆ（ｉ）を出力する。 In step S110, the blur processing unit 120 outputs the calculated blur signal F (i).

以上、信号処理装置１００による信号処理例を説明した。続いて、図９を参照して、上記ステップＳ１０８における処理を説明する。 The signal processing example by the signal processing apparatus 100 has been described above. Next, the process in step S108 will be described with reference to FIG. 9.

図９は、本実施形態に係るぼかし処理部１２０において実行される遅延バッファＤＢ１２１の更新処理の流れの一例を示すフローチャートである。 FIG. 9 is a flowchart showing an example of a flow of update processing of the delay buffer DB 121 executed in the blur processing unit 120 according to the present embodiment.

図９に示すように、まず、ステップＳ２０２で、ぼかし処理部１２０は、ｊ＝０とおく。ｊは更新処理のために用いられる変数である。 As shown in FIG. 9, the blurring processing unit 120 first sets j = 0 in step S202. j is a variable used for the update process.

次いで、ステップＳ２０４で、ぼかし処理部１２０は、ｊ＜ｎ−１を満たすか否かを判定する。ここで、ｎは遅延バッファＤＢ１２１のサイズであり、遅延量を表す。 Next, in step S204, the blurring processing unit 120 determines whether j <n−1 is satisfied. Here, n is the size of the delay buffer DB 121 and represents the delay amount.

ｊ＜ｎ−１であると判定された場合（Ｓ２０４／ＹＥＳ）、ステップＳ２０６で、ぼかし処理部１２０は、遅延バッファＤＢ［ｊ］に遅延バッファＤＢ［ｊ＋１］をコピーする。ここで、遅延バッファＤＢ［ｊ］とは、遅延バッファＤＢ１２１に格納されるｊ番目のデータを表す。 When it is determined that j <n−1 (S204 / YES), in step S206, the blurring processing unit 120 copies the delay buffer DB [j + 1] to the delay buffer DB [j]. Here, the delay buffer DB [j] represents the jth data stored in the delay buffer DB121.

次に、ステップＳ２０８で、ぼかし処理部１２０は、ｊ＝ｊ＋１として変数ｊをインクリメントする。 Next, in step S208, the blurring processing unit 120 increments the variable j by setting j = j + 1.

その後、処理は再度ステップＳ２０４へ戻る。このようにして、ｊ＜ｎ−１が満たされなくなるまで、ステップＳ２０６及びＳ２０８における処理が繰り返される。 Thereafter, the process returns to step S204 again. In this way, the processes in steps S206 and S208 are repeated until j <n−1 is not satisfied.

ｊ＜ｎ−１でないと判定された場合（Ｓ２０４／ＮＯ）、ステップＳ２１０で、ぼかし処理部１２０は、遅延バッファＤＢ［ｎ−１］にぼかし信号Ｆ（ｉ）をコピーする。 When it is determined that j <n−1 is not satisfied (S204 / NO), in step S210, the blurring processing unit 120 copies the blurring signal F (i) to the delay buffer DB [n−1].

以上説明した処理により、遅延バッファＤＢ［０］には、ｎサンプル遅延した信号が格納されることとなる。ぼかし処理部１２０は、遅延バッファＤＢ［０］を遅延信号Ｄ（ｉ）として利用する。以上、ぼかし処理部１２０による遅延バッファＤＢ１２１の更新処理例を説明した。 Through the processing described above, a signal delayed by n samples is stored in the delay buffer DB [0]. The blurring processing unit 120 uses the delay buffer DB [0] as the delay signal D (i). The example of the update processing of the delay buffer DB 121 by the blur processing unit 120 has been described above.
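The update procedure of steps S202 to S210 can be transcribed into Python almost line by line. The function name `update_delay_buffer` is hypothetical, and the delay buffer DB is modelled here as a plain Python list of length n.

```python
def update_delay_buffer(db, f):
    """Shift the delay buffer by one entry and store the newest blur sample.

    db: list of length n (modified in place), f: the blur signal F(i).
    """
    n = len(db)
    if n < 1:
        raise ValueError("the delay buffer must hold at least one sample")
    j = 0                  # S202: initialise the loop variable
    while j < n - 1:       # S204: continue while j < n - 1
        db[j] = db[j + 1]  # S206: copy DB[j + 1] into DB[j]
        j += 1             # S208: increment j
    db[n - 1] = f          # S210: store F(i) at the end of the buffer


# With n = 3, a stored value reaches DB[0] after three updates.
db = [0.0, 0.0, 0.0]
for f in [1.0, 2.0, 3.0]:
    update_delay_buffer(db, f)
print(db)
```

This shifting scheme costs n − 1 copies per sample. A ring buffer, or `collections.deque` with `maxlen=n`, would give the same delay with constant work per sample; that is an alternative, not something the source describes.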

［２−３．効果］
以下では、比較例と比較して本実施形態に係る信号処理装置１００の効果を説明する。 [2-3. Effect]
The effects of the signal processing apparatus 100 according to the present embodiment are described below in comparison with comparative examples.

（前提知識）
圧縮符号化技術のひとつに、チャネル間の相関を利用して符号化するジョイントステレオ（ＪｏｉｎｔＳｔｅｒｅｏ）符号化方式がある。ジョイントステレオ符号化方式には、ミドルサイドステレオ（ＭｉｄｄｌｅＳｉｄｅＳｔｅｒｅｏ）符号化方式とインテンシティステレオ（ＩｎｔｅｎｓｉｔｙＳｔｅｒｅｏ）符号化方式がある。ミドルサイドステレオ符号化方式は、和信号（Ｌｃｈ＋Ｒｃｈ）と差信号（Ｌｃｈ−Ｒｃｈ）に分けて符号化する方式であり、和信号（Ｌｃｈ＋Ｒｃｈ）に重みを付けて符号化することで符号化効率を向上させることが可能な符号化方式である。インテンシティステレオ符号化方式は、和信号（Ｌｃｈ＋Ｒｃｈ）と左右のチャネルのパワー比を符号化することで符号化効率を向上させることが可能な符号化方式である。ジョイントステレオ符号化方式は、圧縮効率を向上させ、より少ないビットレートでの圧縮を可能にしたり、同じビットレートであればより高音質での圧縮を可能にしたりする。 (Prerequisite knowledge)
As one of the compression coding techniques, there is a joint stereo coding method in which the coding is performed using the correlation between channels. The joint stereo coding method includes a middle side stereo coding method and an intensity stereo coding method. The middle-side stereo encoding method is a method of encoding separately for a sum signal (Lch + Rch) and a difference signal (Lch-Rch), and encoding the sum signal (Lch + Rch) with a weight increases the encoding efficiency. This is an encoding method that can be improved. The intensity stereo coding method is a coding method capable of improving the coding efficiency by coding the power ratio between the sum signal (Lch + Rch) and the left and right channels. The joint stereo coding scheme improves compression efficiency and enables compression with a smaller bit rate, or enables compression with higher sound quality at the same bit rate.
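The sum and difference decomposition behind middle side stereo coding can be sketched in Python. This shows only the channel transform, not an actual codec: quantization and bit allocation are omitted, the function names are hypothetical, and the 0.5 scaling is an assumption made here so that the round trip is exact.

```python
def ms_encode(left, right):
    """Split a stereo pair into a sum (mid) and a difference (side) signal."""
    mid = [(l + r) * 0.5 for l, r in zip(left, right)]   # scaled Lch + Rch
    side = [(l - r) * 0.5 for l, r in zip(left, right)]  # scaled Lch - Rch
    return mid, side


def ms_decode(mid, side):
    """Reconstruct the left and right channels from mid and side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


# Hypothetical sample values, exactly representable in binary floating point.
left = [0.75, 0.25]
right = [0.5, -0.25]
mid, side = ms_encode(left, right)
print(mid, side)
print(ms_decode(mid, side))
```

When the two channels are strongly correlated, the side signal is small, so an encoder can spend most of its bits on the mid signal.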

（第１の比較例）
まず、第１の比較例として、上述した、ステレオ信号の両チャネルの信号で差分をとることで、両チャネルに同じように録音されているボーカルを抑制する信号処理装置について考える。以下では、図１０〜図１２を参照して、第１の比較例に係る信号処理装置が、ジョイントステレオ符号化方式を利用して圧縮された音源について処理する場合について説明する。 (First comparative example)
First, as a first comparative example, consider the signal processing apparatus described above, which suppresses vocals recorded identically on both channels by taking the difference between the two channels of a stereo signal. The following describes, with reference to FIGS. 10 to 12, the case where the signal processing apparatus according to the first comparative example processes a sound source compressed using the joint stereo coding method.

図１０〜図１２は、第１の比較例に係る信号処理を説明するための図である。詳しくは、図１０は、ジョイントステレオ符号化方式を利用して圧縮された音源が本比較例に係る信号処理装置により処理された場合のパワースペクトログラムの例である。図１０においては、横軸は時間であり、縦軸は周波数であり、有色の部分は信号レベル（パワー）が高いことを示し、無色の部分は信号レベルが低いことを示している。図１０を参照すると、信号レベルが高い部分と低い部分とが、時間方向をフレーム単位とし周波数方向をスケールファクターバンド単位とするブロック状に形成され、混在している。このような、信号レベルが高い部分と低い部分とがブロック状に形成されることに起因して、耳障りな聴覚ノイズが生じる。 10 to 12 are diagrams for explaining signal processing according to the first comparative example. Specifically, FIG. 10 is an example of a power spectrogram when a sound source compressed using the joint stereo coding method is processed by the signal processing apparatus according to this comparative example. In FIG. 10, the horizontal axis represents time, the vertical axis represents frequency, the colored part indicates that the signal level (power) is high, and the colorless part indicates that the signal level is low. Referring to FIG. 10, a portion with a high signal level and a portion with a low signal level are formed in a block shape in which the time direction is a frame unit and the frequency direction is a scale factor band unit, and are mixed. Such an unpleasant auditory noise is caused by the fact that the high signal level and the low signal level are formed in a block shape.

また、図１１は、図１０の区間ＡＢ間のパワースペクトログラムを抜き出したグラフであり、ある時刻における周波数方向の変化の様子を示している。図１１においては、横軸は周波数であり、縦軸はパワーである。横軸の目盛はスケールファクターバンド単位で振られている。なお、実際の圧縮符号化では、低域のスケールファクターバンドの幅は高域に比べて狭く設定されるが、図１１では模式的に同じ幅で描写している。図１１を参照すると、スケールファクターバンドごとにパワースペクトルが急峻に上がったり下がったりしている。このような急峻な変化は、音源がジョイントステレオ符号化方式を用いて圧縮されていることに起因する。 FIG. 11 is a graph extracted from the power spectrogram of FIG. 10 along the section A-B, and shows how the power changes in the frequency direction at a certain time. In FIG. 11, the horizontal axis is frequency and the vertical axis is power. The horizontal axis is graduated in units of scale factor bands. In actual compression coding, scale factor bands in the low-frequency range are set narrower than those in the high-frequency range, but FIG. 11 schematically depicts them with the same width. Referring to FIG. 11, the power spectrum rises and falls sharply from one scale factor band to the next. These steep changes arise because the sound source was compressed using the joint stereo coding method.

In more detail, the joint stereo coding method decides, for each scale factor band, whether to apply middle-side stereo coding, and also applies intensity stereo coding. When the bit rate allocated to the difference signal (Lch − Rch) of a scale factor band compressed by middle-side stereo coding is very small, that scale factor band in the compressed acoustic signal becomes substantially close to a monaural signal. Consequently, in the processing by the signal processing apparatus according to this comparative example, the level of such a band can fall to a value close to zero. Similarly, when the left-to-right channel power ratio of a scale factor band compressed by intensity stereo coding is close to 1, that band in the compressed acoustic signal also becomes substantially close to a monaural signal, and its level can likewise fall close to zero after processing. Thus, because the sound source is compressed with the joint stereo coding method, the steep level changes in the frequency direction shown in FIG. 11 can occur. Such steep level changes in the frequency direction are one cause of harsh auditory noise.
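The cancellation described above can be illustrated with a minimal sketch. The sample values are hypothetical and stand for a few samples of a single scale factor band; the helper names `difference_signal` and `mean_power` are illustrative and do not come from the source.

```python
def difference_signal(left, right):
    """Return the per-sample difference L - R (the vocal-suppressed signal)."""
    return [l - r for l, r in zip(left, right)]


def mean_power(samples):
    """Mean squared amplitude of a block of samples."""
    return sum(x * x for x in samples) / len(samples)


# Hypothetical samples of one band that kept its stereo content.
stereo_left = [0.5, -0.4, 0.3, -0.2]
stereo_right = [0.1, 0.2, -0.3, 0.4]

# Hypothetical samples of a band the encoder reduced to mono:
# both channels carry the same values.
mono_left = [0.5, -0.4, 0.3, -0.2]
mono_right = [0.5, -0.4, 0.3, -0.2]

stereo_power = mean_power(difference_signal(stereo_left, stereo_right))
mono_power = mean_power(difference_signal(mono_left, mono_right))
```

In this sketch the mono-like band vanishes entirely after the subtraction while the stereo band keeps its power, which is the kind of band-to-band contrast FIG. 11 depicts.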

FIG. 12 is a graph extracted from the power spectrogram over section CD in FIG. 10, and shows how the level changes in the time direction at a given frequency. In FIG. 12, the horizontal axis is time and the vertical axis is power. The horizontal axis is graduated in units of frames. As FIG. 12 shows, the power spectrum rises and falls steeply from frame to frame. Such steep level changes in the time direction can occur in each scale factor band because the sound source is compressed frame by frame with the joint stereo coding method. These steep level changes in the time direction are one of the major causes of harsh auditory noise.

(Second comparative example)
Next, as a second comparative example, consider a signal processing apparatus that uses the technique described in Patent Document 1, which is effective in preventing auditory noise even for compressed sound sources. As described above, the signal processing apparatus according to this comparative example first expresses the acoustic signal in the frequency domain and then performs the difference calculation for suppressing vocals, that is, Lch − Rch, in the frequency domain. The following describes, with reference to FIGS. 13 and 14, the case where the signal processing apparatus according to the second comparative example processes a sound source compressed with the joint stereo coding method.

FIGS. 13 and 14 are diagrams for explaining signal processing according to the second comparative example. Specifically, reference numeral 200 in FIG. 13 denotes the power (Pl) of the Lch for each scale factor band, reference numeral 210 denotes the power (Pr) of the Rch for each scale factor band, and reference numeral 220 denotes the power (Pd) of the difference signal Lch − Rch for each scale factor band. When the Lch power and the Rch power are at comparable levels in the same scale factor band, the power of the difference signal is close to zero. For example, reference numerals 201 and 211, 202 and 212, 203 and 213, and 204 and 214 are each at comparable levels. Therefore, in the difference signal indicated by reference numeral 220, the power of the corresponding scale factor bands is close to zero. This state is the same as in the example shown in FIG. 11.

Therefore, as shown in FIG. 14, the signal processing apparatus according to this comparative example mitigates such steep level changes by complementing the portions whose level has fallen close to zero with the original signal. For example, as a first step, the apparatus detects steep level drops such as those in section 11, section 12, and section 13. Then, as a second step, it complements sections 11, 12, and 13 using the original Lch signal, thereby preventing the steep level drops.

Specifically, as indicated by reference numeral 240 in FIG. 14, the signal processing apparatus according to this comparative example copies the powers 201, 202, 203, and 204 of the corresponding scale factor bands of the Lch, indicated by reference numeral 200, into sections 11, 12, and 13 of the difference signal indicated by reference numeral 220 in FIG. 13. The apparatus may multiply the copied values by an arbitrary coefficient. Comparing reference numeral 220 in FIG. 13 with reference numeral 240 in FIG. 14, the sections other than sections 11, 12, and 13 are identical. As reference numeral 240 in FIG. 14 shows, the signal processing apparatus according to this comparative example can prevent steep level changes in the frequency direction. As a result, it is also expected to prevent steep level changes in the time direction to some extent, and can therefore prevent auditory noise.
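The two steps of this comparative example can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how steep drops are detected, so a fixed power threshold stands in for the first step, and the function name `complement_bands`, the parameters `threshold` and `coeff`, and the band powers are all hypothetical.

```python
def complement_bands(diff_power, left_power, threshold, coeff=1.0):
    """Fill near-zero bands of the difference power with scaled Lch power.

    diff_power: power Pd of Lch - Rch per scale factor band.
    left_power: power Pl of the Lch per scale factor band.
    threshold:  assumed detection rule (step 1): a band counts as a
                steep drop when its difference power is below this value.
    coeff:      coefficient applied when copying (step 2).
    """
    filled = []
    for pd, pl in zip(diff_power, left_power):
        if pd < threshold:
            # Step 2: complement the dropped band from the original Lch.
            filled.append(coeff * pl)
        else:
            # Bands without a detected drop are left unchanged.
            filled.append(pd)
    return filled


# Hypothetical per-band powers; bands 1 and 3 have collapsed to near zero.
left_power = [0.8, 0.6, 0.7, 0.5, 0.9]
diff_power = [0.4, 0.01, 0.3, 0.02, 0.5]
filled = complement_bands(diff_power, left_power, threshold=0.05, coeff=0.5)
```

The sketch also makes the weakness of the approach visible: whatever is copied into the dropped bands comes from the Lch, which still contains the vocals.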

However, the signal processing apparatus according to this comparative example reduced noise at the cost of degraded vocal suppression performance. This is because, in order to prevent steep level changes, it complemented the sections in which a steep level drop was detected using the Lch signal, which contains the vocals.

In addition, if the first step described above fails, the signal processing apparatus according to this comparative example cannot complement the section where detection failed. Furthermore, because the first step is performed in the frequency domain, when the signal to be processed is a time domain signal, the apparatus had to convert between the time domain and the frequency domain before and after the vocal suppression processing. For example, it may convert the signal to the frequency domain with an FFT (Fast Fourier Transform) or the like before the vocal suppression processing, and convert it back to a time domain signal with an IFFT (Inverse FFT) or the like afterward. The amount of computation for such conversion is not small, nor is the amount of computation for the detection processing in the first step.

(Effects of this embodiment)
The effects of the signal processing apparatus 100 according to this embodiment are described below with reference to FIG. 15.

FIG. 15 is a diagram for explaining the effects of the signal processing apparatus 100 according to this embodiment. Specifically, reference numeral 300 in FIG. 15 shows how a power spectrogram changes in the time direction at a given frequency when a sound source compressed with the joint stereo coding method is processed by the signal processing apparatus 100 according to this embodiment. Reference numeral 310 in FIG. 15 shows the change of the power spectrogram shown in FIG. 12. Sections with the same label in FIG. 12 and FIG. 15 denote the same section.

As FIG. 15 shows, steep level changes are mitigated when the signal is processed by the signal processing apparatus 100 according to this embodiment. For example, in section CD2, reference numeral 310 shows a steep drop in level, whereas reference numeral 300 shows no steep drop and the level changes gradually. This is because a delayed signal in which no level drop has occurred is added to the section in which the steep level drop occurred. As shown in FIG. 15, the signal processing apparatus 100 according to this embodiment can mitigate steep level changes in the time direction and can therefore prevent harsh auditory noise.

In addition, the blurring processing unit 120 according to this embodiment generates the delayed signal from the difference signal, in which the vocals are already suppressed, and uses this delayed signal to mitigate steep level changes. Therefore, unlike the second comparative example, this embodiment does not sacrifice vocal suppression performance and can achieve high vocal suppression performance.

In addition, the blurring processing unit 120 according to this embodiment can process the time domain signal output from the difference signal calculation unit 110 without converting it to the frequency domain. Therefore, compared with the signal processing apparatus according to the second comparative example, the signal processing apparatus 100 according to this embodiment can reduce the amount of computation spent on conversion processing.

In addition, because the blurring processing unit 120 according to this embodiment generates the delayed signal using an IIR filter, an FIR filter, or the like, it can mitigate steep level changes with a small amount of computation. Furthermore, because the blurring processing unit 120 does not detect steep level drops, it avoids the complementing failures that failed detection causes in the second comparative example, and it eliminates the computation needed for detection.
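The effect described above can be sketched in a few lines. Formula 2 is not reproduced in this part of the description, so the sketch assumes a weighted sum of the form F(i) = (1 − r) × S(i) + r × D(i), where the delayed signal D(i) is either the earlier output F(i − n) (IIR-style feedback) or the earlier input S(i − n) (FIR-style). This form, the function name `blur`, and the sample values are assumptions made for illustration only.

```python
def blur(diff, n, r, feedback=True):
    """Mix each sample of the difference signal with a delayed copy.

    Assumed form (not taken from the source):
        F(i) = (1 - r) * S(i) + r * D(i)
        D(i) = F(i - n) if feedback else S(i - n)
    Samples before the start of the signal are treated as zero.
    """
    out = []
    for i, s in enumerate(diff):
        if i >= n:
            d = out[i - n] if feedback else diff[i - n]
        else:
            d = 0.0
        out.append((1.0 - r) * s + r * d)
    return out


# Three hypothetical "frames" of 4 samples; the middle frame dropped to zero.
frame = 4
diff = [1.0] * frame + [0.0] * frame + [1.0] * frame
blurred = blur(diff, n=frame, r=0.3)
```

With a delay of one frame, the silent middle frame receives part of the preceding frame instead of staying at zero, and no detection step or frequency domain conversion is involved.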

The effects of this embodiment have been described above. Other configuration examples according to this embodiment are described below. The effects described above are obtained in these other configuration examples as well.

[2-4. Second configuration example]
In this configuration example, the delay amount n and the weighting coefficient r used by the blurring processing unit 120 are set appropriately. This configuration example is described below with reference to FIG. 16.

FIG. 16 is a block diagram illustrating an example of the logical configuration of the signal processing apparatus 100 according to this embodiment. The configuration example shown in FIG. 16 is hereinafter also referred to as the second configuration example. As shown in FIG. 16, the signal processing apparatus 100 according to this configuration example includes a difference signal calculation unit 110, a blurring processing unit 120, a delay amount setting unit 123, and a coefficient setting unit 124.

The difference signal calculation unit 110 outputs the difference signal S(i). The blurring processing unit 120 obtains the blur signal F(i) by applying Formula 2 above with the delay amount n set by the delay amount setting unit 123 and the weighting coefficient r set by the coefficient setting unit 124. The internal processing of the difference signal calculation unit 110 and the blurring processing unit 120 is as described above, so a detailed description is omitted here.

(1) Delay amount setting unit 123
The delay amount setting unit 123 has a function of setting the delay amount n of the delayed signal. By setting an appropriate delay amount n, the delay amount setting unit 123 can mitigate steep level changes in the time direction.

The size of each block in the block-shaped spectrogram of FIG. 10, which occurred in the first comparative example, depends on the compression coding information (audio codec). Specifically, the size of a block in the time direction is approximately equal to the frame width of the audio codec, and its size in the frequency direction is approximately equal to the scale factor band width of the audio codec. As in the example of level fluctuation along the time axis shown in FIG. 12 for the first comparative example, the duration over which the level falls steeply to near zero or returns to a certain level approximately matches an integer multiple of the frame width of the audio codec. For example, section CD2 in FIG. 12 is one frame wide, and the interval between section CD2 and section CD3 is also one frame wide.

Because the steep level fluctuations in the time direction in the first comparative example occur in units of audio codec frames, the delay amount setting unit 123 sets the delay amount n using the compression coding information of the input acoustic signal. In this embodiment, to prevent the level of the current frame of the blur signal F(i) output from the signal processing apparatus 100 from dropping steeply relative to the immediately preceding frame, it is desirable that the delayed signal D(i) added to the difference signal S(i) have a certain level. That is, in Formula 2 above, if the delayed signal D(i) has a certain level when the level of the difference signal S(i) is close to zero, a steep level drop of the blur signal F(i) is prevented. Therefore, the delay amount setting unit 123 sets the delay amount n of the delayed signal D(i) to no more than the frame width indicated by the audio codec, as shown in the following formula.

$$0 < n \le \text{(frame width of the audio codec)} \qquad \text{(Formula 5)}$$

In this case, at the moment the level of the difference signal S(i) falls close to zero, the delayed signal D(i) contains the nonzero component of the difference signal S(i) from immediately before. Therefore, even when the level of the difference signal S(i) is close to zero, the delayed signal D(i) has a certain level, and a steep level drop of the blur signal F(i) is prevented.

Empirically, it is desirable to set the delay amount n within the range of the following formula.

$$0.7 \times \text{(frame width of the audio codec)} < n < \text{(frame width of the audio codec)} \qquad \text{(Formula 6)}$$
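As a minimal sketch of how a delay amount could be derived from the codec frame width, the following picks n as a fixed fraction of the frame width and checks it against the two ranges. The function names, the `ratio` parameter, and the frame width of 1024 samples are hypothetical choices for illustration; the source only states the ranges themselves.

```python
def choose_delay(frame_width, ratio=0.85):
    """Pick a delay n (in samples) from the codec frame width.

    frame_width: frame width in samples, taken from the codec information.
    ratio:       hypothetical tuning knob; values strictly between 0.7
                 and 1.0 aim at the empirical range of Formula 6.
    """
    if frame_width < 1:
        raise ValueError("frame_width must be a positive number of samples")
    n = int(round(frame_width * ratio))
    # Clamp into the range of Formula 5: 0 < n <= frame_width.
    return max(1, min(n, frame_width))


def satisfies_formula_5(n, frame_width):
    """True when 0 < n <= frame_width."""
    return 0 < n <= frame_width


def satisfies_formula_6(n, frame_width):
    """True when 0.7 * frame_width < n < frame_width."""
    return 0.7 * frame_width < n < frame_width


# Illustrative frame width; a real value would come from the codec.
n = choose_delay(1024)
```

Keeping the range checks as separate functions makes it easy to re-evaluate n whenever the frame width reported by the codec changes.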

(2) Coefficient setting unit 124
The coefficient setting unit 124 has a function of setting the weighting coefficient r used in the addition performed by the blurring processing unit 120. By setting an appropriate weighting coefficient r, the coefficient setting unit 124 can adjust the strength of the blurring processing. For example, the coefficient setting unit 124 sets the weighting coefficient r based on the audio codec of the input acoustic signal.

When the bit rate of the audio codec is low, the block-shaped spectrogram shown in FIG. 10 is more likely to occur in the first comparative example. This is because joint stereo coding is used more aggressively at low bit rates. Therefore, the coefficient setting unit 124 sets the weighting coefficient r based on the bit rate, which is part of the audio codec information. More specifically, the coefficient setting unit 124 sets the weighting coefficient r so that stronger blurring is applied when the bit rate is low. That is, in Formula 2, the coefficient setting unit 124 sets r closer to 1 when the bit rate of the audio codec is low and closer to zero when it is high. Alternatively, the coefficient setting unit 124 may set the weighting coefficient r according to how joint stereo coding is being used. With these settings, the signal processing apparatus 100 can blur strongly when auditory noise is likely and weaken the blurring to preserve the original sound when auditory noise is unlikely.

Empirically, it is desirable for the coefficient setting unit 124 to set the weighting coefficient r within the range of the following formula.

$$0.0 < r < 0.4 \qquad \text{(Formula 7)}$$
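One way to turn the bit rate into a weighting coefficient is a clamped linear mapping, sketched below. The source specifies only the direction (lower bit rate, larger r) and the empirical range of Formula 7; the anchor bit rates, the margin, the linear shape, and the function name `weight_from_bitrate` are assumptions. Returning zero when joint stereo coding is not in use follows an option the description also mentions.

```python
R_MIN, R_MAX = 0.0, 0.4  # open interval of Formula 7


def weight_from_bitrate(bitrate_kbps, low_kbps=64.0, high_kbps=320.0,
                        joint_stereo=True, margin=0.02):
    """Map a codec bit rate to a weighting coefficient r.

    low_kbps, high_kbps and margin are hypothetical anchor values;
    only the direction of the mapping comes from the description.
    """
    if not joint_stereo:
        # Blurring is switched off when joint stereo coding is not used.
        return 0.0
    # Position of the bit rate between the two anchors, clamped to [0, 1].
    x = (bitrate_kbps - low_kbps) / (high_kbps - low_kbps)
    x = min(1.0, max(0.0, x))
    # Low bit rate -> r near the upper end, high bit rate -> near the
    # lower end, staying strictly inside the open interval.
    return (R_MAX - margin) - x * (R_MAX - R_MIN - 2.0 * margin)


r_low_rate = weight_from_bitrate(64.0)
r_mid_rate = weight_from_bitrate(128.0)
r_high_rate = weight_from_bitrate(320.0)
```

Because the mapping is evaluated per call, it can be re-run whenever the bit rate changes, for example with a variable bit rate stream.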

(3) Others
The delay amount setting unit 123 and the coefficient setting unit 124 may vary the delay amount n and the weighting coefficient r over time. This allows them to handle automatic switching among multiple frame widths and variable bit rate audio codecs. When the audio codec information shows that joint stereo coding is not used, the coefficient setting unit 124 may set the weighting coefficient r to zero and turn the blurring processing off.

As described above, according to this configuration example, the signal processing apparatus 100 can reliably mitigate steep level changes in the time direction by setting the delay amount n. In addition, by setting the weighting coefficient r, the signal processing apparatus 100 can both reduce auditory noise and preserve the original sound.

[2-5. Third configuration example]
This configuration example introduces a parameter that the coefficient setting unit 124 uses to set the weighting coefficient r. This configuration example is described below with reference to FIGS. 17 and 18.

FIG. 17 is a block diagram illustrating an example of the logical configuration of the signal processing apparatus 100 according to this embodiment. The configuration example shown in FIG. 17 is hereinafter also referred to as the third configuration example. As shown in FIG. 17, the signal processing apparatus 100 according to this configuration example includes a difference signal calculation unit 110, a blurring processing unit 120, a coefficient setting unit 124, and a blur level calculation unit 125.

The difference signal calculation unit 110 outputs the difference signal S(i). The coefficient setting unit 124 according to this embodiment sets the weighting coefficient r according to the blur level f(i) calculated by the blur level calculation unit 125. The blurring processing unit 120 obtains the blur signal F(i) by applying Formula 2 above with the weighting coefficient r set by the coefficient setting unit 124. The internal processing of the difference signal calculation unit 110, the blurring processing unit 120, and the coefficient setting unit 124 is as described above, so a detailed description is omitted here.

The blur level calculation unit 125 sets the blur level f(i) according to how noticeable the auditory noise of the input acoustic signal is. The following describes an example in which the degree to which the input acoustic signal is close to monaural is used as a measure of how noticeable the auditory noise is.

The degree of auditory noise caused by the block-shaped spectrogram shown in FIG. 10, which occurred in the first comparative example, can vary over the course of a piece of music. It is therefore desirable to vary the strength of the blurring processing according to how noticeable the auditory noise is. How noticeable the noise is can be roughly gauged by, for example, how similar the Lch and Rch of the input acoustic signal are, in other words, how close the signal is to monaural. In parts of the input acoustic signal that are close to monaural, that is, parts in which most sounds are localized at the center, auditory noise is easily noticed. For example, vocal solo parts are often close to monaural, and auditory noise is easily noticed there. Conversely, in parts that are not close to monaural, that is, parts with few sounds localized at the center, auditory noise is hard to notice. This is because joint stereo coding itself is used mainly in parts that are close to monaural. It is therefore desirable to blur more strongly when the input acoustic signal is close to monaural.

Therefore, the coefficient setting unit 124 sets the weighting coefficient r based on the degree to which the input acoustic signal is close to monaural. To this end, the blur level calculation unit 125 calculates the blur level f(i) based on that degree. For example, the blur level calculation unit 125 sets the blur level f(i) large when the input acoustic signal is close to monaural and small when it is not. The coefficient setting unit 124 then sets the weighting coefficient r according to the blur level f(i). For example, the coefficient setting unit 124 sets r closer to 1 as the blur level f(i) increases and closer to zero as it decreases.

Whether the signal is close to monaural can be determined with the measure t(i), defined by the following formulas, which indicates how close the signal is to monaural.

$$\mathrm{Peak\_S}(i) = (1 - k) \times \mathrm{Peak\_S}(i - 1) + k \times \lvert L(i) - R(i) \rvert \qquad \text{(Formula 8)}$$

$$\mathrm{Peak\_M}(i) = (1 - k) \times \mathrm{Peak\_M}(i - 1) + k \times \lvert L(i) + R(i) \rvert \qquad \text{(Formula 9)}$$

$$t(i) = \mathrm{Peak\_S}(i) \,/\, \mathrm{Peak\_M}(i) \qquad \text{(Formula 10)}$$

Here, the coefficient k is a time constant, and Peak_M(i) is assumed to be nonzero. Peak_S(i) is the peak level of the signal obtained by subtracting the Rch from the Lch. Peak_M(i) is the peak level of the signal obtained by adding the Rch to the Lch. Formulas 8 and 9 use absolute values, but squared values may be used instead.

When the input acoustic signal is close to monaural, Peak_S(i) becomes small and Peak_M(i) becomes large. When it is not close to monaural, Peak_S(i) becomes large and Peak_M(i) becomes small. The measure t(i) is therefore small when the signal is close to monaural and large when it is not. This point is described in more detail with reference to FIG. 18.
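Formulas 8 to 10 translate directly into a running computation, sketched below. The smoothing value k = 0.1, the zero initial peak levels, and the small guard `eps` against division by zero (the description simply assumes Peak_M(i) is nonzero) are assumptions, as are the function name and the sample values.

```python
def monaural_measure(left, right, k=0.1, eps=1e-12):
    """Return t(i) for each sample, following Formulas 8 to 10.

    k:   time constant of the peak followers (illustrative value).
    eps: guard used only when Peak_M(i) would be zero.
    The peak levels start at zero, which is an assumption.
    """
    peak_s = 0.0  # running level of L - R (Formula 8)
    peak_m = 0.0  # running level of L + R (Formula 9)
    t = []
    for l, r in zip(left, right):
        peak_s = (1.0 - k) * peak_s + k * abs(l - r)
        peak_m = (1.0 - k) * peak_m + k * abs(l + r)
        t.append(peak_s / max(peak_m, eps))  # Formula 10
    return t


# Hypothetical samples: channels nearly identical (close to monaural).
t_mono_like = monaural_measure([0.5, -0.5, 0.5, -0.5],
                               [0.45, -0.45, 0.45, -0.45])
# Hypothetical samples: channels clearly different (not close to monaural).
t_wide = monaural_measure([0.5, -0.5, 0.5, -0.5],
                          [0.1, -0.1, 0.1, -0.1])
```

In this sketch the nearly identical channels give a small t(i) and the clearly different channels a larger one, matching the tendency described above.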

FIG. 18 is a diagram for explaining the degree to which the input acoustic signal is close to monaural. Specifically, FIG. 18 shows an example of the temporal change of the peak level Peak_M(i), indicated by reference numeral 401, and an example of the temporal change of the peak level Peak_S(i), indicated by reference numeral 402. Sections 21 and 22 are parts in which the input acoustic signal is close to monaural and in which auditory noise is therefore easily noticed. In these sections, the peak level Peak_S(i) indicated by reference numeral 402 becomes small and the peak level Peak_M(i) indicated by reference numeral 401 becomes large, so the measure t(i) becomes small. In the other sections, the measure t(i) is larger than in sections 21 and 22.

The blur level calculation unit 125 calculates the blur level f(i) according to the measure t(i). For example, the blur level calculation unit 125 sets the blur level f(i) large when the measure t(i) is small. The coefficient setting unit 124 therefore sets a large weighting coefficient r for the difference signal S(i) corresponding to sections 21 and 22 shown in FIG. 18, and the blurring processing unit 120 blurs strongly. Conversely, the blur level calculation unit 125 sets the blur level f(i) small when the measure t(i) is large. The coefficient setting unit 124 therefore sets a small weighting coefficient r for the difference signal S(i) corresponding to the sections other than sections 21 and 22 shown in FIG. 18, and the blurring processing unit 120 blurs weakly. In this way, by varying the blur level according to how noticeable the auditory noise is, the signal processing apparatus 100 according to this configuration example can concentrate strong blurring on the parts where auditory noise is easily noticed and prevent auditory noise more effectively.
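The chain from the measure t(i) to the blur level f(i) and then to the weighting coefficient r can be sketched as two small mappings. The source fixes only the directions (small t gives large f, large f gives r closer to 1); the linear shapes, the parameter `t_max`, the endpoints `r_low` and `r_high` (chosen inside the empirical range 0.0 < r < 0.4), and the function names are hypothetical.

```python
def blur_level(t, t_max=1.0):
    """Blur level f in [0, 1]: large when the measure t is small.

    The linear shape and t_max are hypothetical; the description only
    states that f is set large for small t and small for large t.
    """
    clamped = min(max(t, 0.0), t_max)
    return 1.0 - clamped / t_max


def weight_from_blur_level(f, r_low=0.05, r_high=0.35):
    """Weighting coefficient r: larger f moves r toward the 1 side.

    r_low and r_high are hypothetical endpoints inside 0.0 < r < 0.4.
    """
    return r_low + f * (r_high - r_low)


# A part close to monaural (small t) and a part that is not (large t).
r_strong = weight_from_blur_level(blur_level(0.05))
r_weak = weight_from_blur_level(blur_level(0.67))
```

Separating the two mappings mirrors the split between the blur level calculation unit 125 and the coefficient setting unit 124, so either curve can be tuned on its own.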

なお、Ｐｅａｋ_Ｓ（ｉ）の大小だけでは、入力された音響信号がモノラルに近いのか、音響信号のレベル自体が小さいのかを判定することは困難である。また、ぼかしレベル計算部１２５は、ＬｃｈとＲｃｈの相関を尺度ｔ（ｉ）として用いてもよい。ただし、その場合、尺度ｔ（ｉ）の大小関係は逆になる。 Note that the magnitude of Peak_S (i) alone makes it difficult to determine whether the input acoustic signal is close to monaural or whether the level of the acoustic signal itself is low. The blur level calculation unit 125 may also use the correlation between Lch and Rch as the scale t (i). In that case, however, the magnitude relationship of the scale t (i) is reversed.

以上説明したように、本構成例によれば、信号処理装置１００は、聴覚ノイズが目立ち易いパートに的を絞って強くぼかし処理を行なうことで、より効果的に聴覚ノイズを防ぐことができる。 As described above, according to the present configuration example, the signal processing apparatus 100 can more effectively prevent the auditory noise by performing the strong blurring process focusing on the part where the auditory noise is conspicuous.
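
The relationship described in this configuration example, in which a small scale t (i) leads to a large blur level f (i) and therefore to a large weighting coefficient r, can be sketched as follows. This is only a minimal illustration. Formulas 8 to 10 are not reproduced in this section, so the ratio Peak_S / Peak_M used for t (i), the linear mappings, and the names `scale_t`, `blur_level`, `weighting_coefficient`, `r_min`, and `r_max` are all assumptions rather than the method of this publication.

```python
def scale_t(peak_s: float, peak_m: float, eps: float = 1e-12) -> float:
    """Assumed scale: small when Peak_S is small and Peak_M is large."""
    return peak_s / max(peak_m, eps)


def blur_level(t: float, t_max: float = 1.0) -> float:
    """Assumed linear map from t(i) to f(i) in [0, 1]; small t gives large f."""
    t_clamped = min(max(t, 0.0), t_max)
    return 1.0 - t_clamped / t_max


def weighting_coefficient(f: float, r_min: float = 0.1, r_max: float = 0.9) -> float:
    """Assumed linear map from f(i) to r; large f gives large r."""
    return r_min + (r_max - r_min) * f


# Near-monaural frame (like sections 21 and 22) versus a wide stereo frame.
r_near_mono = weighting_coefficient(blur_level(scale_t(0.05, 0.9)))
r_wide = weighting_coefficient(blur_level(scale_t(0.6, 0.8)))
print(r_near_mono > r_wide)  # the near-monaural frame is blurred more strongly
```

Using a ratio rather than Peak_S (i) alone reflects the remark above that the magnitude of Peak_S (i) by itself cannot distinguish a near-monaural signal from a quiet one.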

［２−６．第４の構成例］ [2-6. Fourth configuration example]
本構成例は、差分信号のうち聴覚ノイズが生じる帯域を抽出して、ぼかし処理を行う構成例である。以下、図１９を参照して、本構成例について説明する。 This configuration example extracts, from the difference signal, a band in which auditory noise occurs and performs blur processing on it. Hereinafter, this configuration example will be described with reference to FIG. 19.

図１９は、本実施形態に係る信号処理装置１００の論理的な構成の一例を示すブロック図である。図１９に示す構成例を、以下では第４の構成例とも称する。図１９に示すように、本構成例に係る信号処理装置１００は、差分信号計算部１１０、ぼかし処理部１２０、帯域分割部１３０及び合成部１３１を有する。 FIG. 19 is a block diagram illustrating an example of a logical configuration of the signal processing apparatus 100 according to the present embodiment. Hereinafter, the configuration example illustrated in FIG. 19 is also referred to as a fourth configuration example. As illustrated in FIG. 19, the signal processing device 100 according to this configuration example includes a difference signal calculation unit 110, a blur processing unit 120, a band division unit 130, and a synthesis unit 131.

差分信号計算部１１０は、差分信号を出力する。次いで、帯域分割部１３０は、差分信号を複数の帯域に分割する。次に、ぼかし処理部１２０は、帯域分割部１３０により分割された複数の帯域のうち少なくともひとつの帯域においてぼかし処理を行う。そして、合成部１３１は、ぼかし処理部１２０によるぼかし処理がされた信号とされなかった信号とを合成して、ぼかし信号を得る。差分信号計算部１１０及びぼかし処理部１２０の内部処理については上記説明した通りであるので、ここでの詳細な説明は省略する。 The difference signal calculation unit 110 outputs a difference signal. Next, the band dividing unit 130 divides the difference signal into a plurality of bands. Next, the blurring processing unit 120 performs blurring processing in at least one of the plurality of bands divided by the band dividing unit 130. The synthesizing unit 131 then combines the signal that has been blurred by the blurring processing unit 120 with the signal that has not, to obtain a blur signal. Since the internal processing of the difference signal calculation unit 110 and the blur processing unit 120 is as described above, a detailed description is omitted here.

（１）帯域分割部１３０ (1) Band division unit 130
帯域分割部１３０は、差分信号計算部１１０から出力された差分信号を複数の帯域に分割する機能を有する。例えば、帯域分割部１３０は、ぼかし処理部１２０によるぼかし処理の対象となる帯域と、対象外となる帯域とに分割する。ぼかし処理の対象となる帯域は、ひとつの連続した帯域であってもよいし、複数の非連続な帯域の集合体であってもよい。ぼかし処理の対象外となる帯域についても同様である。 The band dividing unit 130 has a function of dividing the difference signal output from the difference signal calculating unit 110 into a plurality of bands. For example, the band dividing unit 130 divides the signal into a band that is subject to blurring by the blur processing unit 120 and a band that is not. The band subject to blurring may be one continuous band or a set of multiple non-contiguous bands. The same applies to the band that is not subject to blurring.

第１の比較例において生じていた、図１０に示したブロック状のスペクトログラムに起因する聴覚ノイズの目立ち易さは、どの周波数帯域でブロック状のスペクトログラムが生じるかによって違いがある。これは、ジョイントステレオ符号化が対象とする周波数帯域の偏りや、人間の聴覚の特性に依存すると考えられる。聴覚ノイズが目立ち易い周波数帯域は、経験的に、１ｋＨｚ〜１０ｋＨｚである。このため、聴覚ノイズが目立ち易い帯域で重点的にぼかし処理が行なわれることが望ましい。そこで、帯域分割部１３０は、聴覚ノイズが目立ち易い帯域についてはぼかし処理部１２０へ出力し、その他の帯域については合成部１３１へ出力する。 The conspicuousness of the auditory noise caused by the block-shaped spectrogram shown in FIG. 10 that occurs in the first comparative example differs depending on which frequency band the block-shaped spectrogram is generated. This is considered to depend on the frequency band bias targeted for joint stereo coding and the characteristics of human hearing. The frequency band in which auditory noise is conspicuous is empirically 1 kHz to 10 kHz. For this reason, it is desirable to perform the blurring process mainly in a band in which auditory noise is conspicuous. Therefore, the band dividing unit 130 outputs the band in which the auditory noise is conspicuous to the blurring processing unit 120, and outputs the other band to the synthesizing unit 131.

例えば、帯域分割部１３０は、下側のカットオフ周波数がＦｃ１であり、上側のカットオフ周波数がＦｃ２であるようなバンドパスフィルタを用いて、ぼかし処理部１２０へ出力する帯域の信号を抽出し得る。カットオフ周波数は、経験的にＦｃ１＝１ｋＨｚ程度、Ｆｃ２＝１０ｋＨｚ程度が効果的である。帯域分割部１３０は、バンドパスフィルタにより抽出した帯域の信号についてぼかし処理部１２０へ出力することで、当該帯域に関する重点的なぼかし処理を実現することができる。帯域分割部１３０は、カットオフ周波数がＦｃ１のハイパスフィルタをバンドパスフィルタの代わりに含んでいてもよく、その場合は演算量を抑制可能である。 For example, the band dividing unit 130 can extract the signal of the band to be output to the blurring processing unit 120 using a bandpass filter whose lower cutoff frequency is Fc1 and whose upper cutoff frequency is Fc2. Empirically, cutoff frequencies of about Fc1 = 1 kHz and Fc2 = 10 kHz are effective. By outputting the band signal extracted by the bandpass filter to the blurring processing unit 120, the band dividing unit 130 can realize blurring focused on that band. The band dividing unit 130 may include a high-pass filter with cutoff frequency Fc1 instead of the bandpass filter, in which case the amount of computation can be reduced.

なお、帯域分割部１３０は、差分信号計算部１１０よりも前段に設けられていてもよい。その場合、帯域分割部１３０は、差分信号を求める帯域を、主にボーカルの音が存在する周波数帯域に絞ることで、例えば中央に定位することが多いベース系の音が抑制されて低域が少ない軽い音になってしまうことを回避することができる。 Note that the band dividing unit 130 may be provided upstream of the difference signal calculating unit 110. In that case, by restricting the band in which the difference signal is computed to the frequency band where vocal sounds mainly exist, the band dividing unit 130 can avoid, for example, suppressing bass sounds, which are often localized in the center, and thereby producing a thin sound with little low-frequency content.

（２）合成部１３１ (2) Synthesis unit 131
合成部１３１は、帯域分割部１３０により分割された複数の差分信号を合成する機能を有する。詳しくは、合成部１３１は、ぼかし処理部１２０によりぼかし処理された帯域の差分信号と帯域分割部１３０により分割された複数の帯域のうちぼかし処理部１２０によるぼかし処理がなされなかった帯域の差分信号とを合成する。合成部１３１は、これらの信号を単純に加算することで合成し得る。 The synthesis unit 131 has a function of combining the plurality of difference signals divided by the band dividing unit 130. Specifically, the synthesis unit 131 combines the difference signal of the band blurred by the blurring processing unit 120 with the difference signal of the band that, among the plurality of bands divided by the band dividing unit 130, was not blurred by the blurring processing unit 120. The synthesis unit 131 can combine these signals by simply adding them.

以上説明したように、本構成例によれば、信号処理装置１００は、聴覚ノイズが目立ち易い帯域で重点的にぼかし処理を行うことで、より効果的に聴覚ノイズを防ぐことができる。 As described above, according to the present configuration example, the signal processing apparatus 100 can more effectively prevent the auditory noise by performing the blurring process mainly in the band where the auditory noise is conspicuous.
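
The band division and synthesis described in this configuration example can be sketched as follows. This is a minimal illustration under stated assumptions, not the filter design of this publication: the first-order filters stand in for the bandpass filter with cutoff frequencies Fc1 and Fc2, the band that bypasses blurring is assumed to be the plain complement of the extracted band, and the names `one_pole_lowpass`, `split_bands`, and `synthesize` are hypothetical.

```python
import math


def one_pole_lowpass(x, fc, fs):
    """First-order low-pass filter; a simple stand-in for a real filter design."""
    a = math.exp(-2.0 * math.pi * fc / fs)
    y, state = [], 0.0
    for v in x:
        state = (1.0 - a) * v + a * state
        y.append(state)
    return y


def split_bands(s, fs, fc1=1000.0, fc2=10000.0):
    """Split difference signal s into (target band near fc1..fc2, remainder)."""
    below_fc1 = one_pole_lowpass(s, fc1, fs)
    above_fc1 = [v - lo for v, lo in zip(s, below_fc1)]  # crude high-pass at fc1
    target = one_pole_lowpass(above_fc1, fc2, fs)        # crude low-pass at fc2
    rest = [v - t for v, t in zip(s, target)]            # complement of target
    return target, rest


def synthesize(processed_target, rest):
    """Combine the processed band and the untouched band by simple addition."""
    return [a + b for a, b in zip(processed_target, rest)]


fs = 48000.0
s = [math.sin(2.0 * math.pi * 3000.0 * i / fs) for i in range(480)]
target, rest = split_bands(s, fs)
out = synthesize(target, rest)  # no blurring applied here, so out matches s
```

Deriving the remainder as a complement is one way to make simple addition in the synthesis step reconstruct the input when the target band is left unprocessed. In an actual configuration the blurring would be applied to `target` before `synthesize` is called.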

＜３．第２の実施形態＞ <3. Second Embodiment>
本実施形態は、ゲイン制御により聴覚ノイズを低減する形態である。まず、図２０を参照して、本実施形態の基本構成を説明する。 In the present embodiment, auditory noise is reduced by gain control. First, the basic configuration of the present embodiment will be described with reference to FIG. 20.

［３−１．第１の構成例］ [3-1. First Configuration Example]
図２０は、本実施形態に係る信号処理装置１００の論理的な構成の一例を示すブロック図である。図２０に示す構成例を、以下では第１の構成例とも称する。図２０に示すように、本構成例に係る信号処理装置１００は、差分信号計算部１１０、ゲインレベル設定部１４０及びゲイン制御部１４１を有する。 FIG. 20 is a block diagram illustrating an example of a logical configuration of the signal processing apparatus 100 according to the present embodiment. Hereinafter, the configuration example illustrated in FIG. 20 is also referred to as a first configuration example. As illustrated in FIG. 20, the signal processing device 100 according to this configuration example includes a differential signal calculation unit 110, a gain level setting unit 140, and a gain control unit 141.

差分信号計算部１１０は、差分信号を出力する。次いで、ゲインレベル設定部１４０は、ゲインレベルを設定する。そして、ゲイン制御部１４１は、ゲインレベル設定部１４０により設定されたゲインレベルを用いて、差分信号のゲインを制御する。本実施形態に係る信号処理装置１００は、ボーカルが中央に定位する楽曲である時間領域の音響信号を入力され、ボーカルを抑制した時間領域の音響信号を出力する。差分信号計算部１１０の内部処理については上記説明した通りであるので、ここでの詳細な説明は省略する。 The difference signal calculation unit 110 outputs a difference signal. Next, the gain level setting unit 140 sets a gain level. Then, the gain control unit 141 controls the gain of the differential signal using the gain level set by the gain level setting unit 140. The signal processing apparatus 100 according to the present embodiment receives a time-domain acoustic signal that is a song in which the vocal is localized in the center, and outputs a time-domain acoustic signal in which the vocal is suppressed. Since the internal processing of the difference signal calculation unit 110 is as described above, a detailed description thereof is omitted here.

（１）ゲインレベル設定部１４０ (1) Gain level setting unit 140
ゲインレベル設定部１４０は、差分信号のゲインレベルを設定する機能を有する。例えば、ゲインレベル設定部１４０は、入力された音響信号の聴覚ノイズの目立ち易さに応じてゲインレベルを設定する。 The gain level setting unit 140 has a function of setting the gain level of the differential signal. For example, the gain level setting unit 140 sets the gain level according to the conspicuousness of the auditory noise of the input acoustic signal.

第１の比較例において生じていた、図１０に示したブロック状のスペクトログラムに起因する聴覚ノイズの程度は、楽曲中に変化し得る。このため、聴覚ノイズの目立ち易さに応じて、差分信号のゲインレベルを変化させることが望ましい。上述したように、入力された音響信号がモノラルに近い、例えばボーカルのソロのパート等の殆どの音が中央に定位するパートは聴覚ノイズが目立ち易く、他のパートでは目立ち難い。そのため、入力された音響信号がモノラルに近い場合に、差分信号のゲインレベルを変化させることが望ましい。 The degree of auditory noise caused by the block-shaped spectrogram shown in FIG. 10, which occurred in the first comparative example, can vary within a piece of music. It is therefore desirable to change the gain level of the differential signal according to how conspicuous the auditory noise is. As described above, auditory noise is conspicuous in parts where the input acoustic signal is close to monaural, that is, parts in which most sounds are localized in the center, such as a vocal solo, and is less conspicuous in other parts. It is therefore desirable to change the gain level of the differential signal when the input acoustic signal is close to monaural.

そこで、ゲインレベル設定部１４０は、聴覚ノイズの目立ち易さの尺度の一例として、上記数式８〜数式１０に示した尺度ｔ（ｉ）を利用して、入力された音響信号がモノラルに近い度合に基づいてゲインレベルを設定する。具体的には、ゲインレベル設定部１４０は、尺度ｔ（ｉ）が小さい場合にゲインレベルｇ（ｉ）を小さく設定し、尺度ｔ（ｉ）が大きい場合にゲインレベルｇ（ｉ）を大きく設定する。例えば、ゲインレベル設定部１４０は、下記の数式の範囲でゲインレベルｇ（ｉ）を設定する。 Therefore, the gain level setting unit 140 uses the scale t (i) shown in Formulas 8 to 10 above as an example of a measure of the conspicuousness of auditory noise, and sets the gain level based on the degree to which the input acoustic signal is close to monaural. Specifically, the gain level setting unit 140 sets the gain level g (i) small when the scale t (i) is small, and sets the gain level g (i) large when the scale t (i) is large. For example, the gain level setting unit 140 sets the gain level g (i) within the range given by the following formula.
０．０＜＝ｇ（ｉ）＜＝１．０（数式１１） 0.0 <= g (i) <= 1.0 (Formula 11)

なお、経験的には、下記の数式の範囲でゲインレベルｇ（ｉ）が設定されることが望ましい。 Empirically, it is desirable to set the gain level g (i) within the range given by the following formula.
０．２５＜ｇ（ｉ）＜＝１．０（数式１２） 0.25 < g (i) <= 1.0 (Formula 12)
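
One way to map the scale t (i) to a gain level that stays within the empirically preferred range can be sketched as follows. This is a minimal sketch under stated assumptions: the text only requires that g (i) be small when t (i) is small and large when t (i) is large, so the linear mapping, the normalization by `t_max`, the floor value `g_floor = 0.3`, and the function name `gain_level` are all hypothetical choices.

```python
def gain_level(t: float, t_max: float = 1.0, g_floor: float = 0.3) -> float:
    """Assumed linear map from t(i) to g(i); small t gives small g."""
    t_norm = min(max(t / t_max, 0.0), 1.0)
    g = g_floor + (1.0 - g_floor) * t_norm
    return min(g, 1.0)  # guard against rounding above the upper bound


for t in (0.0, 0.25, 0.5, 1.0):
    g = gain_level(t)
    assert 0.25 < g <= 1.0  # stays inside the range of Formula 12
```

Choosing a floor slightly above 0.25 keeps the lower bound of Formula 12 strict even when t (i) is zero.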

（２）ゲイン制御部１４１ (2) Gain control unit 141
ゲイン制御部１４１は、ゲインレベル設定部１４０により設定されたゲインレベルを用いて差分信号のゲインを制御する機能を有する。例えば、ゲイン制御部１４１は、ゲインレベル設定部１４０による設定に基づくゲインレベルの制御を行うことで、ボーカルが目立つ区間においてゲインを低下させ、ボーカルを抑制した時間領域の音響信号を出力することが可能である。ゲインレベル設定部１４０により設定されたゲインレベルをｇ（ｉ）とすると、ゲイン制御部１４１は、下記の数式によりゲインが制御された信号Ｇ（ｉ）を計算する。 The gain control unit 141 has a function of controlling the gain of the differential signal using the gain level set by the gain level setting unit 140. For example, by controlling the gain level based on the setting made by the gain level setting unit 140, the gain control unit 141 can reduce the gain in sections where the vocal is conspicuous and output a time-domain acoustic signal in which the vocal is suppressed. When the gain level set by the gain level setting unit 140 is g (i), the gain control unit 141 calculates the gain-controlled signal G (i) by the following formula.
Ｇ（ｉ）＝ｇ（ｉ）×Ｓ（ｉ）（数式１３） G (i) = g (i) × S (i) (Formula 13)

［３−２．動作処理例］ [3-2. Operation processing example]
図２１は、本実施形態に係る信号処理装置１００において実行される信号処理の流れの一例を示すフローチャートである。 FIG. 21 is a flowchart illustrating an example of the flow of signal processing executed in the signal processing apparatus 100 according to the present embodiment.

図２１に示すように、まず、ステップＳ３０２で、差分信号計算部１１０は、ｉ番目のＬｃｈの信号Ｌ（ｉ）及びＲｃｈの信号Ｒ（ｉ）の入力を受け付ける。 As shown in FIG. 21, first, in step S302, the difference signal calculation unit 110 accepts input of an i-th Lch signal L (i) and an Rch signal R (i).

次いで、ステップＳ３０４で、差分信号計算部１１０は、差分信号Ｓ（ｉ）を計算する。例えば、差分信号計算部１１０は、上記数式１を用いて差分信号Ｓ（ｉ）を計算する。 Next, in step S304, the difference signal calculation unit 110 calculates the difference signal S (i). For example, the difference signal calculation unit 110 calculates the difference signal S (i) using the above Equation 1.

次に、ステップＳ３０６で、ゲインレベル設定部１４０は、ゲインレベルｇ（ｉ）を計算する。例えば、ゲインレベル設定部１４０は、上記数式８〜数式１２を用いてゲインレベルｇ（ｉ）を計算する。 Next, in step S306, the gain level setting unit 140 calculates the gain level g (i). For example, the gain level setting unit 140 calculates the gain level g (i) using the above formulas 8 to 12.

次いで、ステップＳ３０８で、ゲイン制御部１４１は、ゲインが制御された信号Ｇ（ｉ）を計算する。例えば、ゲイン制御部１４１は、上記数式１３を用いてゲインが制御された信号Ｇ（ｉ）を計算する。 Next, in step S308, the gain control unit 141 calculates a signal G (i) whose gain is controlled. For example, the gain control unit 141 calculates the signal G (i) whose gain is controlled using the above Equation 13.

そして、ステップＳ３１０で、ゲイン制御部１４１は、計算したゲインが制御された信号Ｇ（ｉ）を出力する。 Then, in step S310, the gain control unit 141 outputs the calculated gain-controlled signal G (i).
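
The flow of steps S302 to S310 can be sketched as follows. This is an illustrative sketch, not the implementation of this publication: Formula 1 is not reproduced in this section, so the plain channel difference L (i) − R (i) is an assumption, the gain levels of step S306 are passed in as precomputed values, and the name `suppress_vocal` and the sample data are hypothetical.

```python
def suppress_vocal(left, right, gains):
    """Return the gain-controlled signal G(i) for each input sample."""
    out = []
    for l, r, g in zip(left, right, gains):  # S302: receive L(i) and R(i)
        s = l - r                            # S304: assumed form of Formula 1
        out.append(g * s)                    # S308: Formula 13, G(i) = g(i) * S(i)
    return out                               # S310: output G(i)


# A centered vocal appears identically in both channels and cancels in S(i).
vocal = [0.5, -0.2, 0.1]
guitar_left = [0.3, 0.0, -0.4]  # hypothetical instrument present only in Lch
left = [v + g for v, g in zip(vocal, guitar_left)]
right = list(vocal)
result = suppress_vocal(left, right, gains=[1.0, 0.5, 0.5])
```

In this example only the left-only instrument survives the subtraction, and the later samples are additionally attenuated by the gain level.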

［３−３．効果］ [3-3. Effect]
以下では、図２２を参照して、本実施形態に係る信号処理装置１００の効果を説明する。 The effect of the signal processing apparatus 100 according to the present embodiment is described below with reference to FIG. 22.

図２２は、本実施形態に係る信号処理装置１００の効果を説明するための図である。図２２の実線は、第１の比較例に係る信号処理装置により処理された音響信号のパワーの時間変化例である。例えば、区間３１及び区間３２は、ボーカルのソロのパートなど、入力された音響信号がモノラルに近い区間である。このような区間は、モノラルに近い信号が抑制されることで差分信号のパワーが小さくなった区間であると共に、聴覚ノイズが目立ち易い部分である。区間３１及び区間３２以外の区間は、様々な楽器が存在するパートなど、入力された音響信号がモノラルに近くない区間である。このような区間は、差分信号のパワーが区間３１及び区間３２と比較して大きい区間であると共に、聴覚ノイズが目立ち難い部分である。 FIG. 22 is a diagram for explaining the effect of the signal processing apparatus 100 according to the present embodiment. The solid line in FIG. 22 is an example of the time change of the power of the acoustic signal processed by the signal processing apparatus according to the first comparative example. For example, the section 31 and the section 32 are sections in which the input acoustic signal is close to monaural, such as a vocal solo part. Such a section is a section in which the power of the differential signal is reduced by suppressing a signal close to monaural and is a portion in which auditory noise is easily noticeable. The sections other than the section 31 and the section 32 are sections in which the input acoustic signal is not close to monaural, such as a part in which various musical instruments exist. Such a section is a section in which the power of the differential signal is larger than that of the sections 31 and 32 and is a portion in which auditory noise is not noticeable.

図２２の破線は、本実施形態に係る信号処理装置１００により処理された音響信号のパワーの時間変化例である。区間３１及び区間３２の破線に示すように、本実施形態に係る信号処理装置１００は、主に聴覚ノイズが目立ち易い部分についてゲイン制御を行ってレベルを下げることができる。信号処理装置１００は、聴覚ノイズが目立ちやすい部分について、聴覚ノイズごとレベルを下げることができるため、ユーザに与える聴覚ノイズの不快感を軽減することが可能である。また、本実施形態に係る信号処理装置１００は、第２の比較例のような周波数領域での処理を行わないので、小さな演算量で処理することが可能である。 The broken line in FIG. 22 is an example of the time change of the power of the acoustic signal processed by the signal processing apparatus 100 according to the present embodiment. As indicated by the broken lines in sections 31 and 32, the signal processing apparatus 100 according to the present embodiment can lower the level by performing gain control mainly on the portions where auditory noise is conspicuous. Because the signal processing apparatus 100 lowers the level of those portions together with the auditory noise they contain, it can reduce the discomfort that the auditory noise causes the user. Further, since the signal processing apparatus 100 according to the present embodiment does not perform processing in the frequency domain as in the second comparative example, it can operate with a small amount of computation.

［３−４．第２の構成例］ [3-4. Second configuration example]
本構成例は、差分信号のうち聴覚ノイズが生じる帯域を抽出して、ゲイン制御を行う構成例である。以下、図２３を参照して、本構成例について説明する。 This configuration example extracts, from the differential signal, a band in which auditory noise occurs and performs gain control on it. Hereinafter, this configuration example will be described with reference to FIG. 23.

図２３は、本実施形態に係る信号処理装置１００の論理的な構成の一例を示すブロック図である。図２３に示す構成例を、以下では第２の構成例とも称する。図２３に示すように、本構成例に係る信号処理装置１００は、差分信号計算部１１０、帯域分割部１３０、合成部１３１、ゲインレベル設定部１４０及びゲイン制御部１４１を有する。 FIG. 23 is a block diagram illustrating an example of a logical configuration of the signal processing apparatus 100 according to the present embodiment. Hereinafter, the configuration example illustrated in FIG. 23 is also referred to as a second configuration example. As shown in FIG. 23, the signal processing apparatus 100 according to this configuration example includes a differential signal calculation unit 110, a band division unit 130, a synthesis unit 131, a gain level setting unit 140, and a gain control unit 141.

差分信号計算部１１０は、差分信号を出力する。次いで、帯域分割部１３０は、差分信号を複数の帯域に分割する。詳しくは、帯域分割部１３０は、ゲイン制御部１４１によるゲイン制御の対象となる帯域と、対象外となる帯域とに分割する。ここで、第１の実施形態における第４の構成例と同様の理由で、聴覚ノイズが目立ち易い帯域で重点的にゲイン制御が行われることが望ましい。そこで、帯域分割部１３０は、聴覚ノイズが目立ち易い帯域についてはゲイン制御部１４１へ出力し、その他の帯域については合成部１３１へ出力する。 The difference signal calculation unit 110 outputs a difference signal. Next, the band dividing unit 130 divides the differential signal into a plurality of bands. Specifically, the band dividing unit 130 divides a band that is a target of gain control by the gain control unit 141 and a band that is not a target. Here, for the same reason as in the fourth configuration example in the first embodiment, it is desirable that gain control is performed mainly in a band in which auditory noise is conspicuous. Therefore, the band dividing unit 130 outputs to the gain control unit 141 the band where auditory noise is conspicuous, and outputs the other band to the combining unit 131.

次いで、ゲインレベル設定部１４０は、ゲインレベルを設定する。そして、ゲイン制御部１４１は、ゲインレベル設定部１４０により設定されたゲインレベルを用いて、差分信号のゲインを制御する。詳しくは、ゲイン制御部１４１は、帯域分割部１３０により分割された複数の帯域のうち少なくともひとつの帯域において、ゲインレベル設定部１４０により設定されたゲインレベルを用いて差分信号のゲインを制御する。 Next, the gain level setting unit 140 sets a gain level. Then, the gain control unit 141 controls the gain of the differential signal using the gain level set by the gain level setting unit 140. Specifically, the gain control unit 141 controls the gain of the differential signal using the gain level set by the gain level setting unit 140 in at least one of the plurality of bands divided by the band dividing unit 130.

そして、合成部１３１は、ゲイン制御部１４１から出力された信号と帯域分割部１３０から合成部１３１へ直接的に出力された信号とを合成することで、出力する音響信号を得る。詳しくは、合成部１３１は、ゲイン制御部１４１によりゲイン制御された帯域の差分信号と帯域分割部１３０により分割された複数の帯域のうちゲイン制御部１４１によるゲイン制御がなされなかった帯域の差分信号とを合成する。 Then, the synthesis unit 131 obtains an acoustic signal to be output by synthesizing the signal output from the gain control unit 141 and the signal output directly from the band dividing unit 130 to the synthesis unit 131. Specifically, the synthesizing unit 131 includes a difference signal in a band that is gain-controlled by the gain control unit 141 and a difference signal in a band that is not subjected to gain control by the gain control unit 141 among a plurality of bands divided by the band dividing unit 130. And synthesize.

以上説明したように、本構成例によれば、信号処理装置１００は、聴覚ノイズが目立ち易い帯域で重点的にゲイン制御を行うことで、ユーザに与える聴覚ノイズの不快感を効率的に軽減することが可能である。また、本構成例に係る信号処理装置１００は、一部の帯域でゲイン制御を行うため、出力される音響信号全体の音量が過度に低下することを防止することができる。 As described above, according to this configuration example, the signal processing apparatus 100 efficiently reduces the discomfort of the auditory noise given to the user by performing gain control mainly in a band in which the auditory noise is conspicuous. It is possible. Moreover, since the signal processing apparatus 100 according to the present configuration example performs gain control in a part of the band, it is possible to prevent the volume of the entire output acoustic signal from being excessively reduced.

＜４．第３の実施形態＞ <4. Third Embodiment>
本実施形態は、上述した第１の実施形態と第２の実施形態とを組み合わせた形態である。以下、図２４を参照して、本実施形態に係る信号処理装置１００の構成例について説明する。 This embodiment is a combination of the first embodiment and the second embodiment described above. Hereinafter, a configuration example of the signal processing apparatus 100 according to the present embodiment will be described with reference to FIG. 24.

［４−１．構成例］ [4-1. Configuration example]
図２４は、本実施形態に係る信号処理装置１００の論理的な構成の一例を示すブロック図である。図２４に示すように、本実施形態に係る信号処理装置１００は、差分信号計算部１１０、帯域分割部１３０、ぼかし処理部１２０、遅延量設定部１２３、係数設定部１２４、ぼかしレベル計算部１２５、ゲインレベル設定部１４０、ゲイン制御部１４１及び合成部１３１を有する。 FIG. 24 is a block diagram illustrating an example of a logical configuration of the signal processing apparatus 100 according to the present embodiment. As illustrated in FIG. 24, the signal processing device 100 according to the present embodiment includes a difference signal calculation unit 110, a band division unit 130, a blur processing unit 120, a delay amount setting unit 123, a coefficient setting unit 124, a blur level calculation unit 125, a gain level setting unit 140, a gain control unit 141, and a synthesis unit 131.

差分信号計算部１１０は、差分信号を出力する。次いで、帯域分割部１３０は、差分信号を複数の帯域に分割する。詳しくは、帯域分割部１３０は、ぼかし処理部１２０によるぼかし処理及びゲイン制御部１４１によるゲイン制御の対象となる帯域と、対象外となる帯域とに分割する。例えば、帯域分割部１３０は、聴覚ノイズが目立ち易い帯域についてはぼかし処理部１２０へ出力し、その他の帯域については合成部１３１へ出力する。 The difference signal calculation unit 110 outputs a difference signal. Next, the band dividing unit 130 divides the differential signal into a plurality of bands. Specifically, the band dividing unit 130 divides a band to be subjected to blurring processing by the blurring processing unit 120 and gain control by the gain control unit 141 and a band to be excluded. For example, the band dividing unit 130 outputs a band in which auditory noise is conspicuous to the blurring processing unit 120, and outputs the other band to the synthesizing unit 131.

次いで、ぼかし処理部１２０は、帯域分割部１３０から出力された帯域の差分信号についてぼかし処理を行う。詳しくは、ぼかし処理部１２０は、帯域分割部１３０により分割された複数の帯域のうち少なくともひとつの帯域においてぼかし処理を行う。その際、ぼかし処理部１２０は、遅延量設定部１２３により設定された遅延量ｎ及び係数設定部１２４により設定された重み係数ｒを用いて、上記数式２によりぼかし信号Ｆ（ｉ）を得る。 Next, the blurring processing unit 120 performs a blurring process on the band difference signal output from the band dividing unit 130. Specifically, the blurring processing unit 120 performs the blurring processing in at least one band among a plurality of bands divided by the band dividing unit 130. At this time, the blurring processing unit 120 uses the delay amount n set by the delay amount setting unit 123 and the weighting factor r set by the coefficient setting unit 124 to obtain the blur signal F (i) according to the above equation 2.

ここで、係数設定部１２４は、第１の実施形態の第２の構成例で説明した処理を行ってもよいし、第１の実施形態の第３の構成例で説明した処理を行ってもよい。即ち、係数設定部１２４は、入力された音響信号のオーディオコーデックに基づいて重み付け係数ｒを設定してもよいし、ぼかしレベル計算部１２５により計算されたぼかしレベルｆ（ｉ）に応じて重み付け係数ｒを設定してもよい。例えば、前者による重み付け係数をｒ１とし、後者による重み付け係数をｒ２とすると、係数設定部１２４は、下記の数式に示すように最大値を重み付け係数ｒとして採用してもよい。 Here, the coefficient setting unit 124 may perform the processing described in the second configuration example of the first embodiment, or the processing described in the third configuration example of the first embodiment. That is, the coefficient setting unit 124 may set the weighting coefficient r based on the audio codec of the input acoustic signal, or according to the blur level f (i) calculated by the blur level calculation unit 125. For example, if the weighting coefficient obtained by the former is r1 and that obtained by the latter is r2, the coefficient setting unit 124 may adopt the larger of the two as the weighting coefficient r, as shown in the following formula.
ｒ（ｉ）＝ＭＡＸ（ｒ１（ｉ），ｒ２（ｉ））（数式１４） r (i) = MAX (r1 (i), r2 (i)) (Formula 14)

また、係数設定部１２４は、ｒ１及びｒ２を組み合わせて重み付け係数ｒを設定してもよい。例えば、係数設定部１２４は、ｒ１及びｒ２の平均値により重み付け係数ｒを設定してもよい。つまり、ｒ１及びｒ２の大小関係が重み付け係数ｒに反映されればよい。 The coefficient setting unit 124 may set the weighting coefficient r by combining r1 and r2. For example, the coefficient setting unit 124 may set the weighting coefficient r by the average value of r1 and r2. That is, the magnitude relationship between r1 and r2 may be reflected in the weighting coefficient r.
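
The two ways of combining r1 and r2 mentioned above can be written as a short sketch. The function names `combine_max` and `combine_mean` and the sample values are hypothetical and serve only to illustrate Formula 14 and the averaging alternative.

```python
def combine_max(r1: float, r2: float) -> float:
    """Formula 14: adopt the larger of the two weighting coefficients."""
    return max(r1, r2)


def combine_mean(r1: float, r2: float) -> float:
    """Alternative mentioned in the text: the average of r1 and r2."""
    return 0.5 * (r1 + r2)


r1, r2 = 0.2, 0.6  # illustrative codec-based and blur-level-based coefficients
r_by_max = combine_max(r1, r2)
r_by_mean = combine_mean(r1, r2)
```

Taking the maximum applies the stronger of the two blurring requirements, whereas the average lies between them.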

ゲイン制御部１４１は、ぼかし処理部１２０から出力されたぼかし信号のゲイン制御を行う。詳しくは、ゲイン制御部１４１は、ゲインレベル設定部１４０により設定されたゲインレベルを用いて、ぼかし処理部１２０によりぼかし処理された信号のゲインを制御する。例えば、ゲイン制御部１４１は、下記の数式を用いてゲインが制御された信号Ｇ（ｉ）を得る。 The gain control unit 141 performs gain control of the blur signal output from the blur processing unit 120. Specifically, the gain control unit 141 uses the gain level set by the gain level setting unit 140 to control the gain of the signal subjected to the blur processing by the blur processing unit 120. For example, the gain control unit 141 obtains a signal G (i) whose gain is controlled using the following formula.
Ｇ（ｉ）＝ｇ（ｉ）×Ｆ（ｉ）（数式１５） G (i) = g (i) × F (i) (Formula 15)

そして、合成部１３１は、ゲイン制御部１４１から出力された信号と帯域分割部１３０から合成部１３１へ直接的に出力された信号とを合成することで、出力する音響信号を得る。詳しくは、合成部１３１は、ゲイン制御部１４１によりゲイン制御された信号と帯域分割部１３０により分割された複数の帯域のうちゲイン制御部１４１によるゲイン制御がなされなかった帯域の差分信号とを合成する。 Then, the synthesis unit 131 obtains the acoustic signal to be output by combining the signal output from the gain control unit 141 with the signal output directly from the band dividing unit 130 to the synthesis unit 131. Specifically, the synthesis unit 131 combines the signal gain-controlled by the gain control unit 141 with the difference signal of the band that, among the plurality of bands divided by the band division unit 130, was not gain-controlled by the gain control unit 141.

なお、ぼかしレベル計算部１２５及びゲインレベル設定部１４０は、聴覚の聴覚ノイズの目立ち易さの尺度として、上記数式８〜１０に示した尺度ｔ（ｉ）を共通して用いてもよいし、異なる尺度を採用してもよい。 Note that the blur level calculation unit 125 and the gain level setting unit 140 may share the scale t (i) shown in Formulas 8 to 10 above as the measure of the conspicuousness of auditory noise, or may adopt different measures.

また、ぼかし処理部１２０及びゲインレベル設定部１４０の処理の順番は逆でもよい。 The order of processing of the blur processing unit 120 and the gain level setting unit 140 may be reversed.

以上、本実施形態に係る信号処理装置１００の構成例について説明した。続いて、本実施形態に係る信号処理装置１００の動作処理を説明する。 The configuration example of the signal processing device 100 according to the present embodiment has been described above. Subsequently, an operation process of the signal processing apparatus 100 according to the present embodiment will be described.

［４−２．動作処理例］ [4-2. Operation processing example]
図２５は、本実施形態に係る信号処理装置１００において実行される信号処理の流れの一例を示すフローチャートである。 FIG. 25 is a flowchart illustrating an example of the flow of signal processing executed in the signal processing device 100 according to the present embodiment.

図２５に示すように、まず、ステップＳ４０２で、差分信号計算部１１０は、ｉ番目のＬｃｈの信号Ｌ（ｉ）及びＲｃｈの信号Ｒ（ｉ）の入力を受け付ける。 As shown in FIG. 25, first, in step S402, the difference signal calculation unit 110 accepts input of an i-th Lch signal L (i) and an Rch signal R (i).

次いで、ステップＳ４０４で、差分信号計算部１１０は、差分信号Ｓ（ｉ）を計算する。例えば、差分信号計算部１１０は、上記数式１を用いて差分信号Ｓ（ｉ）を計算する。 Next, in step S404, the difference signal calculation unit 110 calculates the difference signal S (i). For example, the difference signal calculation unit 110 calculates the difference signal S (i) using the above Equation 1.

次に、ステップＳ４０６で、ゲインレベル設定部１４０は、ゲインレベルｇ（ｉ）を計算する。例えば、ゲインレベル設定部１４０は、上記数式８〜数式１２を用いてゲインレベルｇ（ｉ）を計算する。 Next, in step S406, the gain level setting unit 140 calculates the gain level g (i). For example, the gain level setting unit 140 calculates the gain level g (i) using the above formulas 8 to 12.

次いで、ステップＳ４０８で、遅延量設定部１２３は遅延量ｎを計算し、係数設定部１２４は重み付け係数ｒを計算する。例えば、遅延量設定部１２３は、上記数式５及び数式６を用いて遅延量ｎを計算する。例えば、係数設定部１２４は、上記数式１４を用いて重み付け係数ｒを計算する。 Next, in step S408, the delay amount setting unit 123 calculates the delay amount n, and the coefficient setting unit 124 calculates the weighting coefficient r. For example, the delay amount setting unit 123 calculates the delay amount n using Equations 5 and 6 above. For example, the coefficient setting unit 124 calculates the weighting coefficient r using the above formula 14.

次に、ステップＳ４１０で、帯域分割部１３０は、差分信号Ｓ（ｉ）を処理対象の帯域と処理対象外の帯域とに分割する。ここでの処理対象とは、ぼかし処理部１２０によるぼかし処理及びゲイン制御部１４１によるゲイン制御の対象を指す。例えば、帯域分割部１３０は、差分信号Ｓ（ｉ）を聴覚ノイズが目立ち易い帯域とそうでない帯域とに分割し、目立ち易い帯域を処理対象の帯域とし、そうでない帯域を処理対象外の帯域とする。 Next, in step S410, the band dividing unit 130 divides the difference signal S (i) into a band to be processed and a band not to be processed. Here, "to be processed" means subject to the blurring by the blur processing unit 120 and the gain control by the gain control unit 141. For example, the band dividing unit 130 divides the difference signal S (i) into a band in which auditory noise is conspicuous and a band in which it is not, treating the former as the band to be processed and the latter as the band not to be processed.

次いで、ステップＳ４１２で、ぼかし処理部１２０は、処理対象の帯域においてぼかし信号Ｆ（ｉ）を計算する。例えば、ぼかし処理部１２０は、帯域分割部１３０により分割された複数の帯域のうち、聴覚ノイズが目立ち易い帯域の差分信号について、上記数式２を用いてぼかし信号Ｆ（ｉ）を計算する。 Next, in step S412, the blur processing unit 120 calculates the blur signal F (i) in the processing target band. For example, the blurring processing unit 120 calculates the blurring signal F (i) using the above Equation 2 for a difference signal in a band in which auditory noise is conspicuous among a plurality of bands divided by the band dividing unit 130.

次に、ステップＳ４１４で、ゲイン制御部１４１は、処理対象の帯域においてゲインが制御された信号Ｇ（ｉ）を計算する。例えば、ゲイン制御部１４１は、ぼかし処理部１２０により出力されたぼかし信号Ｆ（ｉ）について、上記数式１５を用いてゲインが制御された信号Ｇ（ｉ）を計算する。 Next, in step S414, the gain control unit 141 calculates a signal G (i) whose gain is controlled in the band to be processed. For example, the gain control unit 141 calculates a signal G (i) whose gain is controlled using the formula 15 for the blur signal F (i) output from the blur processing unit 120.

次いで、ステップＳ４１６で、合成部１３１は、上記ステップＳ４１２及びＳ４１４における処理後の信号と処理対象外の信号とを合成する。例えば、合成部１３１は、上記ステップＳ４１４においてゲイン制御された処理対象の帯域におけるゲインが制御された信号Ｇ（ｉ）と、上記ステップＳ４１０において分割された処理対象外の帯域における差分信号Ｓ（ｉ）とを合成する。 Next, in step S416, the synthesis unit 131 combines the signal processed in steps S412 and S414 with the signal that was not processed. For example, the synthesis unit 131 combines the gain-controlled signal G (i) in the band to be processed, obtained in step S414, with the difference signal S (i) in the band not to be processed, separated in step S410.

そして、ステップＳ４１８で、合成部１３１は、上記ステップＳ４１６において合成された信号を出力する。 In step S418, the synthesis unit 131 outputs the signal synthesized in step S416.

以上説明したように、本実施形態によれば、信号処理装置１００は、第１の実施形態及び第２の実施形態の効果を両立させることが可能であり、より効果的に聴覚ノイズを防ぐことができる。 As described above, according to the present embodiment, the signal processing apparatus 100 can achieve the effects of both the first embodiment and the second embodiment, and can prevent auditory noise more effectively.
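
Steps S404 to S416 of this operation example can be sketched end to end as follows. This is a sketch under stated assumptions rather than the implementation of this publication: Formulas 1 and 2 are not reproduced in this section, so the forms S (i) = L (i) − R (i) and F (i) = S (i) + r × S (i − n) are assumptions; the first-order filters stand in for the bandpass filter; the values computed in steps S406 and S408 (`gains`, `n`, `r`) are passed in as arguments; and all function names are hypothetical.

```python
import math


def one_pole_lowpass(x, fc, fs):
    """First-order low-pass filter used as a stand-in for a real filter design."""
    a = math.exp(-2.0 * math.pi * fc / fs)
    y, state = [], 0.0
    for v in x:
        state = (1.0 - a) * v + a * state
        y.append(state)
    return y


def split_bands(s, fs, fc1=1000.0, fc2=10000.0):
    """S410: split S(i) into the band to be processed and the remainder."""
    low = one_pole_lowpass(s, fc1, fs)
    target = one_pole_lowpass([v - lo for v, lo in zip(s, low)], fc2, fs)
    return target, [v - t for v, t in zip(s, target)]


def blur(s, n, r):
    """S412: assumed form of Formula 2, F(i) = S(i) + r * S(i - n)."""
    return [v + r * (s[i - n] if i >= n else 0.0) for i, v in enumerate(s)]


def process(left, right, gains, n, r, fs=48000.0):
    """Run steps S404 to S416 on whole signals and return the combined output."""
    s = [l - rr for l, rr in zip(left, right)]    # S404: assumed Formula 1
    target, rest = split_bands(s, fs)             # S410
    f = blur(target, n, r)                        # S412
    g_sig = [g * v for g, v in zip(gains, f)]     # S414: Formula 15
    return [a + b for a, b in zip(g_sig, rest)]   # S416: simple addition
```

With a gain of 1.0 and a weighting coefficient of 0.0 the sketch reduces to the plain difference signal, which makes it easy to check each stage in isolation.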

＜５．第４の実施形態＞ <5. Fourth Embodiment>
本実施形態は、信号処理装置１００が周波数領域の信号に対してボーカル抑制処理を行う形態である。以下では、図２６及び図２７を参照して、本実施形態について説明する。 In the present embodiment, the signal processing apparatus 100 performs vocal suppression processing on a frequency domain signal. Hereinafter, the present embodiment will be described with reference to FIGS. 26 and 27.

図２６は、本実施形態に係る信号処理装置１００の論理的な構成の一例を示すブロック図である。図２６に示すように、本実施形態に係る信号処理装置１００は、ＦＦＴ部１５０、差分信号計算部１１０、ぼかし処理部１２０及びＩＦＦＴ部１５１を有する。 FIG. 26 is a block diagram illustrating an example of a logical configuration of the signal processing apparatus 100 according to the present embodiment. As shown in FIG. 26, the signal processing apparatus 100 according to the present embodiment includes an FFT unit 150, a difference signal calculation unit 110, a blurring processing unit 120, and an IFFT unit 151.

（１）ＦＦＴ部１５０ (1) FFT unit 150
ＦＦＴ部１５０は、入力された時間領域の信号を周波数領域の信号へ変換する機能を有する。例えば、ＦＦＴ部１５０は、ＦＦＴにより時間領域の信号を周波数領域の信号へ変換する。本変換処理には、ＦＦＴ以外の任意の方式が採用されてもよい。また、入力された音響信号が周波数領域の信号である場合、ＦＦＴ部１５０は省略されてもよい。入力された音響信号が周波数領域の信号である場合、上記第１〜第３の実施形態では、周波数領域の信号を時間領域の信号に変換する工程が要される。これに対し、本実施形態に係る信号処理装置１００は、本工程を省略可能であるので、処理が効率化される。 The FFT unit 150 has a function of converting an input time domain signal into a frequency domain signal. For example, the FFT unit 150 converts a time domain signal into a frequency domain signal by FFT. Any method other than FFT may be employed for this conversion processing. Further, when the input acoustic signal is a frequency domain signal, the FFT unit 150 may be omitted. When the input acoustic signal is a frequency domain signal, the first to third embodiments require a step of converting the frequency domain signal into a time domain signal. In contrast, the signal processing apparatus 100 according to the present embodiment can omit this step, so processing becomes more efficient.

（２）差分信号計算部１１０ (2) Difference signal calculation unit 110
本実施形態に係る差分信号計算部１１０は、周波数領域で差分信号を計算する。例えば、差分信号計算部１１０は、Ｌｃｈ及びＲｃｈについて、対応するスケールファクターバンドのパワーを減算処理することで、差分信号を計算する。差分信号計算部１１０は、ＬｃｈからＲｃｈを減算してもよいし、ＲｃｈからＬｃｈを減算してもよい。 The difference signal calculation unit 110 according to the present embodiment calculates a difference signal in the frequency domain. For example, the difference signal calculation unit 110 calculates the difference signal by subtracting the power of corresponding scale factor bands of Lch and Rch from each other. The difference signal calculation unit 110 may subtract Rch from Lch or subtract Lch from Rch.
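As a rough illustration of the per-band subtraction described above, the following Python sketch subtracts the Rch band powers from the Lch band powers. The function name `band_difference` and the plain-list representation of scale factor band powers are assumptions made for this example and are not part of the disclosure. The text does not state how negative results are handled, so the sketch leaves them unclamped.

```python
def band_difference(lch_power, rch_power):
    """Subtract per-band power of Rch from Lch (swap the arguments for Rch - Lch)."""
    if len(lch_power) != len(rch_power):
        raise ValueError("both channels must have the same number of bands")
    # Bands where both channels carry equal power (for example a centered
    # vocal) cancel to zero.
    return [l - r for l, r in zip(lch_power, rch_power)]


if __name__ == "__main__":
    print(band_difference([4.0, 2.0, 1.0], [1.0, 2.0, 3.0]))  # [3.0, 0.0, -2.0]
```

The second band in the usage line has equal power in both channels and therefore cancels, which mirrors how a center-localized vocal is suppressed.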

（３）ぼかし処理部１２０ (3) Blur processing unit 120
本実施形態に係るぼかし処理部１２０は、差分信号計算部１１０により計算された周波数領域の差分信号に、当該差分信号を処理した周波数領域の信号を加算する。例えば、ぼかし処理部１２０は、差分信号を処理した信号として、差分信号を遅延させた遅延信号を生成する。そして、ぼかし処理部１２０は、周波数領域の差分信号に、周波数領域の遅延信号を加算する。以下、図２７を参照して、本実施形態に係るぼかし処理部１２０のシグナルフローの一例を説明する。 The blur processing unit 120 according to the present embodiment adds, to the frequency domain difference signal calculated by the difference signal calculation unit 110, a frequency domain signal obtained by processing that difference signal. For example, the blur processing unit 120 generates, as the signal obtained by processing the difference signal, a delayed signal obtained by delaying the difference signal. The blur processing unit 120 then adds the frequency domain delayed signal to the frequency domain difference signal. Hereinafter, an example of the signal flow of the blur processing unit 120 according to the present embodiment will be described with reference to FIG. 27.

図２７は、本実施形態に係るぼかし処理部１２０のシグナルフローの一例を示す図である。図２７では、時間的に連続する２つのフレームのスペクトルを用いてぼかし処理する例を図示している。図２７に示すように、ぼかし処理部１２０は、入力された信号を１フレーム遅延させる遅延器１２２を有し、差分信号Ｓ（ｉ）に１フレーム遅延した遅延信号を重み付け加算することで、ぼかし信号Ｆ（ｉ）を得る。符号４０１及び符号４０２は、差分信号Ｓ（ｉ）のスケールファクターバンドごとのパワーを示している。例えば、符号４０１は、差分信号の第ｈ番目のフレームのスケールファクターバンドごとのパワーであり、符号４０２は、差分信号の第ｈ−１番目のフレームのスケールファクターバンドごとのパワーである。符号４０３は、ぼかし信号Ｆ（ｉ）のスケールファクターバンドごとのパワーを示している。詳しくは、符号４０３は、符号４０１に示した信号と符号４０２に示した信号とを０．５ずつの重みで加重平均した信号の、スケールファクターバンドごとのパワーである。符号４０３に示すように、出力信号Ｆ（ｉ）のスケールファクターバンドごとのパワーの時間方向の変化の急峻さは抑制されており、その結果、聴覚ノイズが抑制される。 FIG. 27 is a diagram illustrating an example of a signal flow of the blur processing unit 120 according to the present embodiment. FIG. 27 illustrates an example in which blurring is performed using the spectra of two temporally consecutive frames. As shown in FIG. 27, the blur processing unit 120 includes a delay unit 122 that delays an input signal by one frame, and obtains the blur signal F (i) by weighted addition of the difference signal S (i) and the delayed signal delayed by one frame. Reference numerals 401 and 402 indicate the power for each scale factor band of the difference signal S (i). For example, reference numeral 401 is the power for each scale factor band of the h-th frame of the difference signal, and reference numeral 402 is the power for each scale factor band of the (h−1)-th frame of the difference signal. Reference numeral 403 indicates the power for each scale factor band of the blur signal F (i). Specifically, reference numeral 403 denotes the power for each scale factor band of a signal obtained by averaging the signal indicated by reference numeral 401 and the signal indicated by reference numeral 402 with weights of 0.5 each. As indicated by reference numeral 403, the steepness of the temporal change in the power for each scale factor band of the output signal F (i) is suppressed, and as a result, auditory noise is suppressed.
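The two-frame weighted average described for FIG. 27 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the blend is written as (1 − r) × current frame + r × previous frame, which reduces to the 0.5/0.5 average in the text when r = 0.5, but the general form for other r is an assumption here. The function name `blur_frames` and the pass-through of the first frame (which has no predecessor) are likewise hypothetical choices.

```python
def blur_frames(frames, r=0.5):
    """Blend each frame's per-band power with the previous frame's power.

    frames: list of frames, each a list of per-band power values.
    r: weight of the one-frame-delayed signal (0.5 in the FIG. 27 example).
    """
    blurred = []
    previous = None  # plays the role of the one-frame delay unit
    for frame in frames:
        if previous is None:
            # Assumption: with no history, the first frame passes through.
            blurred.append(list(frame))
        else:
            blurred.append([(1.0 - r) * cur + r * prev
                            for cur, prev in zip(frame, previous)])
        previous = frame  # FIR form: the delayed term is the previous input
    return blurred


if __name__ == "__main__":
    # A sudden jump from 0 to 8 in band 0 is spread over two frames.
    print(blur_frames([[0.0, 0.0], [8.0, 4.0], [8.0, 4.0]]))
    # [[0.0, 0.0], [4.0, 2.0], [8.0, 4.0]]
```

The abrupt step between the first and second input frames becomes a two-stage ramp in the output, which is the softening of steep temporal changes that the paragraph attributes to reference numeral 403.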

なお、図２７に示した例では、説明の簡略化のため、ぼかし処理部１２０はひとつの遅延器１２２を有するとし、重み付け係数ｒ＝０．５としているが、他の任意の設定であってもよい。また、図２７では、ＦＩＲフィルタを用いて遅延信号を生成する例を説明したが、ＩＩＲフィルタが用いられてもよい。 In the example shown in FIG. 27, for simplicity of explanation, the blur processing unit 120 has a single delay unit 122 and the weighting coefficient is r = 0.5, but any other settings may be used. Further, although FIG. 27 illustrates an example in which the delayed signal is generated using an FIR filter, an IIR filter may be used instead.
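The IIR alternative mentioned above could, for instance, take a first-order recursive form in which the delayed term is the previous blurred output rather than the previous input. The following Python sketch shows that variant. The specific recursion, the function name `blur_frames_iir`, and the pass-through of the first frame are assumptions for illustration; the text states only that an IIR filter may be used.

```python
def blur_frames_iir(frames, r=0.5):
    """First-order recursive smoothing of per-band power across frames.

    Each output frame is (1 - r) * current input + r * previous OUTPUT,
    so every past frame contributes with geometrically decaying weight.
    """
    blurred = []
    state = None  # previous output frame (the feedback path)
    for frame in frames:
        if state is None:
            # Assumption: with no history, the first frame passes through.
            out = list(frame)
        else:
            out = [(1.0 - r) * cur + r * prev
                   for cur, prev in zip(frame, state)]
        blurred.append(out)
        state = out
    return blurred


if __name__ == "__main__":
    print(blur_frames_iir([[0.0], [8.0], [8.0]]))  # [[0.0], [4.0], [6.0]]
```

Because the feedback path remembers earlier outputs, the step from 0 to 8 is still only at 6 in the third frame, so this form smooths over a longer time span than a single-delay FIR average with the same r.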

（４）ＩＦＦＴ部１５１ (4) IFFT unit 151
ＩＦＦＴ部１５１は、入力された周波数領域の信号を時間領域の信号へ変換する機能を有する。例えば、ＩＦＦＴ部１５１は、ＩＦＦＴにより周波数領域の信号を時間領域の信号へ変換する。本変換処理には、ＩＦＦＴ以外の任意の方式が採用されてもよい。また、出力する信号が周波数領域の信号である場合、ＩＦＦＴ部１５１は省略されてもよい。 The IFFT unit 151 has a function of converting an input frequency domain signal into a time domain signal. For example, the IFFT unit 151 converts a frequency domain signal into a time domain signal by IFFT. Any method other than IFFT may be employed for this conversion processing. If the signal to be output is a frequency domain signal, the IFFT unit 151 may be omitted.
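The time-to-frequency and frequency-to-time conversions performed by the FFT unit 150 and the IFFT unit 151 can be illustrated with a naive discrete Fourier transform pair. This is a sketch only: it uses the O(N^2) DFT definition rather than a fast algorithm (the text allows methods other than FFT and IFFT), and the function names `dft` and `idft` are assumptions for this example.

```python
import cmath


def dft(samples):
    """Time domain -> frequency domain (naive DFT, stand-in for the FFT unit)."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Frequency domain -> time domain (naive inverse DFT, stand-in for the IFFT unit)."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


if __name__ == "__main__":
    signal = [1.0, 0.0, -1.0, 0.0]
    restored = idft(dft(signal))
    # The round trip reproduces the input up to floating-point error.
    print(all(abs(a - b) < 1e-9 for a, b in zip(signal, restored)))  # True
```

In a pipeline like the one in FIG. 26, the difference calculation and blurring would sit between the forward and inverse transforms; a production implementation would normally use an optimized FFT library instead of this quadratic-time definition.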

以上説明したように、本実施形態によれば、周波数領域の音響信号についても、特定音を抑制しつつ、聴覚上のノイズの発生を防止することができる。 As described above, according to the present embodiment, it is possible to prevent the generation of auditory noise while suppressing the specific sound for the acoustic signal in the frequency domain.

＜６．ハードウェア構成＞ <6. Hardware configuration>
最後に、図２８を参照して、本実施形態に係る情報処理装置のハードウェア構成について説明する。図２８は、本実施形態に係る情報処理装置のハードウェア構成の一例を示すブロック図である。なお、図２８に示す情報処理装置９００は、例えば、図１、図１６、図１７、図１９、図２０、図２３、図２４及び図２６にそれぞれ示した各実施形態に係る信号処理装置１００を実現し得る。各実施形態に係る信号処理装置１００による情報処理は、ソフトウェアと、以下に説明するハードウェアとの協働により実現される。 Finally, the hardware configuration of the information processing apparatus according to the present embodiment will be described with reference to FIG. 28. FIG. 28 is a block diagram illustrating an example of a hardware configuration of the information processing apparatus according to the present embodiment. Note that the information processing apparatus 900 illustrated in FIG. 28 can realize, for example, the signal processing apparatus 100 according to each embodiment illustrated in FIGS. 1, 16, 17, 19, 20, 23, 24, and 26. Information processing by the signal processing apparatus 100 according to each embodiment is realized by cooperation between software and the hardware described below.

図２８に示すように、情報処理装置９００は、ＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）９０１、ＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）９０２、ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）９０３及びホストバス９０４ａを備える。また、情報処理装置９００は、ブリッジ９０４、外部バス９０４ｂ、インタフェース９０５、入力装置９０６、出力装置９０７、ストレージ装置９０８、ドライブ９０９、接続ポート９１１、通信装置９１３及びセンサ９１５を備える。情報処理装置９００は、ＣＰＵ９０１に代えて、又はこれとともに、ＤＳＰ若しくはＡＳＩＣ等の処理回路を有してもよい。 As shown in FIG. 28, the information processing apparatus 900 includes a CPU (Central Processing Unit) 901, a ROM (Read Only Memory) 902, a RAM (Random Access Memory) 903, and a host bus 904a. The information processing apparatus 900 includes a bridge 904, an external bus 904b, an interface 905, an input device 906, an output device 907, a storage device 908, a drive 909, a connection port 911, a communication device 913, and a sensor 915. The information processing apparatus 900 may include a processing circuit such as a DSP or an ASIC in place of or in addition to the CPU 901.

ＣＰＵ９０１は、演算処理装置および制御装置として機能し、各種プログラムに従って情報処理装置９００内の動作全般を制御する。また、ＣＰＵ９０１は、マイクロプロセッサであってもよい。ＲＯＭ９０２は、ＣＰＵ９０１が使用するプログラムや演算パラメータ等を記憶する。ＲＡＭ９０３は、ＣＰＵ９０１の実行において使用するプログラムや、その実行において適宜変化するパラメータ等を一時記憶する。ＣＰＵ９０１は、例えば、図１、図１６、図１７、図１９、図２０、図２３、図２４及び図２６にそれぞれ示した各実施形態に係る信号処理装置１００に含まれる各構成要素を形成し得る。 The CPU 901 functions as an arithmetic processing device and a control device, and controls the overall operation in the information processing apparatus 900 according to various programs. Further, the CPU 901 may be a microprocessor. The ROM 902 stores programs used by the CPU 901, calculation parameters, and the like. The RAM 903 temporarily stores programs used in the execution of the CPU 901, parameters that change as appropriate during the execution, and the like. The CPU 901 can form, for example, each component included in the signal processing apparatus 100 according to each embodiment illustrated in FIGS. 1, 16, 17, 19, 20, 23, 24, and 26.

ＣＰＵ９０１、ＲＯＭ９０２及びＲＡＭ９０３は、ＣＰＵバスなどを含むホストバス９０４ａにより相互に接続されている。ホストバス９０４ａは、ブリッジ９０４を介して、ＰＣＩ（ＰｅｒｉｐｈｅｒａｌＣｏｍｐｏｎｅｎｔＩｎｔｅｒｃｏｎｎｅｃｔ／Ｉｎｔｅｒｆａｃｅ）バスなどの外部バス９０４ｂに接続されている。なお、必ずしもホストバス９０４ａ、ブリッジ９０４および外部バス９０４ｂを分離構成する必要はなく、１つのバスにこれらの機能を実装してもよい。 The CPU 901, ROM 902 and RAM 903 are connected to each other by a host bus 904a including a CPU bus. The host bus 904a is connected to an external bus 904b such as a PCI (Peripheral Component Interconnect / Interface) bus via a bridge 904. Note that the host bus 904a, the bridge 904, and the external bus 904b do not necessarily have to be configured separately, and these functions may be mounted on one bus.

入力装置９０６は、例えば、マウス、キーボード、タッチパネル、ボタン、マイクロフォン、スイッチ及びレバー等、ユーザによって情報が入力される装置によって実現される。また、入力装置９０６は、例えば、赤外線やその他の電波を利用したリモートコントロール装置であってもよいし、情報処理装置９００の操作に対応した携帯電話やＰＤＡ等の外部接続機器であってもよい。さらに、入力装置９０６は、例えば、上記の入力手段を用いてユーザにより入力された情報に基づいて入力信号を生成し、ＣＰＵ９０１に出力する入力制御回路などを含んでいてもよい。情報処理装置９００のユーザは、この入力装置９０６を操作することにより、情報処理装置９００に対して各種のデータを入力したり処理動作を指示したりすることができる。 The input device 906 is realized by a device through which a user inputs information, such as a mouse, a keyboard, a touch panel, a button, a microphone, a switch, or a lever. The input device 906 may be, for example, a remote control device using infrared rays or other radio waves, or may be an externally connected device such as a mobile phone or a PDA that supports operation of the information processing apparatus 900. Furthermore, the input device 906 may include, for example, an input control circuit that generates an input signal based on information input by the user using the above-described input means and outputs the input signal to the CPU 901. By operating the input device 906, a user of the information processing apparatus 900 can input various data to the information processing apparatus 900 and instruct it to perform processing operations.

出力装置９０７は、取得した情報をユーザに対して視覚的又は聴覚的に通知することが可能な装置で形成される。このような装置として、ＣＲＴディスプレイ装置、液晶ディスプレイ装置、プラズマディスプレイ装置、ＥＬディスプレイ装置及びランプ等の表示装置や、スピーカ及びヘッドホン等の音声出力装置や、プリンタ装置等がある。出力装置９０７は、例えば、情報処理装置９００が行った各種処理により得られた結果を出力する。具体的には、表示装置は、情報処理装置９００が行った各種処理により得られた結果を、テキスト、イメージ、表、グラフ等、様々な形式で視覚的に表示する。他方、音声出力装置は、再生された音声データや音響データ等からなるオーディオ信号をアナログ信号に変換して聴覚的に出力する。 The output device 907 is formed of a device capable of visually or audibly notifying acquired information to the user. Examples of such devices include CRT display devices, liquid crystal display devices, plasma display devices, EL display devices, display devices such as lamps, audio output devices such as speakers and headphones, printer devices, and the like. For example, the output device 907 outputs results obtained by various processes performed by the information processing device 900. Specifically, the display device visually displays results obtained by various processes performed by the information processing device 900 in various formats such as text, images, tables, and graphs. On the other hand, the audio output device converts an audio signal composed of reproduced audio data, acoustic data, and the like into an analog signal and outputs it aurally.

ストレージ装置９０８は、情報処理装置９００の記憶部の一例として形成されたデータ格納用の装置である。ストレージ装置９０８は、例えば、ＨＤＤ等の磁気記憶部デバイス、半導体記憶デバイス、光記憶デバイス又は光磁気記憶デバイス等により実現される。ストレージ装置９０８は、記憶媒体、記憶媒体にデータを記録する記録装置、記憶媒体からデータを読み出す読出し装置および記憶媒体に記録されたデータを削除する削除装置などを含んでもよい。このストレージ装置９０８は、ＣＰＵ９０１が実行するプログラムや各種データ及び外部から取得した各種のデータ等を格納する。 The storage device 908 is a data storage device formed as an example of a storage unit of the information processing device 900. The storage apparatus 908 is realized by, for example, a magnetic storage device such as an HDD, a semiconductor storage device, an optical storage device, a magneto-optical storage device, or the like. The storage device 908 may include a storage medium, a recording device that records data on the storage medium, a reading device that reads data from the storage medium, a deletion device that deletes data recorded on the storage medium, and the like. The storage device 908 stores programs executed by the CPU 901, various data, various data acquired from the outside, and the like.

ドライブ９０９は、記憶媒体用リーダライタであり、情報処理装置９００に内蔵、あるいは外付けされる。ドライブ９０９は、装着されている磁気ディスク、光ディスク、光磁気ディスク、または半導体メモリ等のリムーバブル記憶媒体に記録されている情報を読み出して、ＲＡＭ９０３に出力する。また、ドライブ９０９は、リムーバブル記憶媒体に情報を書き込むこともできる。 The drive 909 is a storage medium reader / writer, and is built in or externally attached to the information processing apparatus 900. The drive 909 reads information recorded on a removable storage medium such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, and outputs the information to the RAM 903. The drive 909 can also write information to a removable storage medium.

接続ポート９１１は、外部機器と接続されるインタフェースであって、例えばＵＳＢ（ＵｎｉｖｅｒｓａｌＳｅｒｉａｌＢｕｓ）などによりデータ伝送可能な外部機器との接続口である。 The connection port 911 is an interface connected to an external device, and is a connection port with an external device capable of transmitting data by, for example, USB (Universal Serial Bus).

通信装置９１３は、例えば、ネットワーク９２０に接続するための通信デバイス等で形成された通信インタフェースである。通信装置９１３は、例えば、有線若しくは無線ＬＡＮ（ＬｏｃａｌＡｒｅａＮｅｔｗｏｒｋ）、ＬＴＥ（ＬｏｎｇＴｅｒｍＥｖｏｌｕｔｉｏｎ）、Ｂｌｕｅｔｏｏｔｈ（登録商標）又はＷＵＳＢ（ＷｉｒｅｌｅｓｓＵＳＢ）用の通信カード等である。また、通信装置９１３は、光通信用のルータ、ＡＤＳＬ（ＡｓｙｍｍｅｔｒｉｃＤｉｇｉｔａｌＳｕｂｓｃｒｉｂｅｒＬｉｎｅ）用のルータ又は各種通信用のモデム等であってもよい。この通信装置９１３は、例えば、インターネットや他の通信機器との間で、例えばＴＣＰ／ＩＰ等の所定のプロトコルに則して信号等を送受信することができる。 The communication device 913 is a communication interface formed by a communication device for connecting to the network 920, for example. The communication device 913 is, for example, a communication card for wired or wireless LAN (Local Area Network), LTE (Long Term Evolution), Bluetooth (registered trademark), or WUSB (Wireless USB). The communication device 913 may be a router for optical communication, a router for ADSL (Asymmetric Digital Subscriber Line), a modem for various communication, or the like. The communication device 913 can transmit and receive signals and the like according to a predetermined protocol such as TCP / IP, for example, with the Internet and other communication devices.

なお、ネットワーク９２０は、ネットワーク９２０に接続されている装置から送信される情報の有線、または無線の伝送路である。例えば、ネットワーク９２０は、インターネット、電話回線網、衛星通信網などの公衆回線網や、Ｅｔｈｅｒｎｅｔ（登録商標）を含む各種のＬＡＮ（ＬｏｃａｌＡｒｅａＮｅｔｗｏｒｋ）、ＷＡＮ（ＷｉｄｅＡｒｅａＮｅｔｗｏｒｋ）などを含んでもよい。また、ネットワーク９２０は、ＩＰ−ＶＰＮ（ＩｎｔｅｒｎｅｔＰｒｏｔｏｃｏｌ−ＶｉｒｔｕａｌＰｒｉｖａｔｅＮｅｔｗｏｒｋ）などの専用回線網を含んでもよい。 The network 920 is a wired or wireless transmission path for information transmitted from a device connected to the network 920. For example, the network 920 may include a public line network such as the Internet, a telephone line network, a satellite communication network, various LANs (Local Area Network) including Ethernet (registered trademark), a WAN (Wide Area Network), and the like. The network 920 may also include a dedicated line network such as an IP-VPN (Internet Protocol-Virtual Private Network).

以上、本実施形態に係る情報処理装置９００の機能を実現可能なハードウェア構成の一例を示した。上記の各構成要素は、汎用的な部材を用いて実現されていてもよいし、各構成要素の機能に特化したハードウェアにより実現されていてもよい。従って、本実施形態を実施する時々の技術レベルに応じて、適宜、利用するハードウェア構成を変更することが可能である。 Heretofore, an example of the hardware configuration capable of realizing the functions of the information processing apparatus 900 according to the present embodiment has been shown. Each of the above components may be realized using a general-purpose member, or may be realized by hardware specialized for the function of each component. Therefore, it is possible to change the hardware configuration to be used as appropriate according to the technical level at the time of carrying out this embodiment.

なお、上述のような本実施形態に係る情報処理装置９００の各機能を実現するためのコンピュータプログラムを作製し、ＰＣ等に実装することが可能である。また、このようなコンピュータプログラムが格納された、コンピュータで読み取り可能な記録媒体も提供することができる。記録媒体は、例えば、磁気ディスク、光ディスク、光磁気ディスク、フラッシュメモリ等である。また、上記のコンピュータプログラムは、記録媒体を用いずに、例えばネットワークを介して配信されてもよい。 Note that a computer program for realizing each function of the information processing apparatus 900 according to the present embodiment as described above can be produced and mounted on a PC or the like. In addition, a computer-readable recording medium storing such a computer program can be provided. The recording medium is, for example, a magnetic disk, an optical disk, a magneto-optical disk, a flash memory, or the like. Further, the above computer program may be distributed via a network, for example, without using a recording medium.

＜７．まとめ＞ <7. Summary>
以上、図１〜図２８を参照して、本開示の一実施形態について詳細に説明した。上記説明したように、本実施形態に係る信号処理装置１００は、入力された音響信号を形成する第１のチャネルの音響信号及び第２のチャネルの音響信号の差分信号を計算して、差分信号に差分信号を処理した信号を加算する。信号処理装置１００は、差分信号に差分信号を処理した信号を加算することにより、時間方向の急峻なレベルの変化を緩和して、耳障りな聴覚ノイズの発生を防ぐことができる。この効果は、入力された音響信号がジョイントステレオ符号化方式等により圧縮されていた場合により顕著に得られる。本実施形態によれば、聴覚ノイズが発生する大きな原因のひとつである時間方向の急峻なレベルの変化を、直接的に緩和することが可能である。このため、本実施形態に係る信号処理装置１００は、時間方向の急峻なレベルの変化を間接的に緩和し得る第２の比較例と比較して、聴覚ノイズの発生の防止効果が高く且つ効率的であると考えられる。また、信号処理装置１００は、特定音が抑制された差分信号を処理した信号を加算するので、特定音の抑制性能を代償とすることがなく、高い抑制性能を実現することが可能である。 An embodiment of the present disclosure has been described in detail above with reference to FIGS. 1 to 28. As described above, the signal processing apparatus 100 according to the present embodiment calculates the difference signal between the acoustic signal of the first channel and the acoustic signal of the second channel that form the input acoustic signal, and adds a signal obtained by processing the difference signal to the difference signal. By doing so, the signal processing apparatus 100 can alleviate steep level changes in the time direction and prevent the generation of harsh auditory noise. This effect is more pronounced when the input acoustic signal has been compressed by a joint stereo encoding method or the like. According to the present embodiment, it is possible to directly mitigate steep level changes in the time direction, which are one of the major causes of auditory noise. For this reason, the signal processing apparatus 100 according to the present embodiment is considered to be more effective and more efficient at preventing auditory noise than the second comparative example, which can mitigate steep level changes in the time direction indirectly. Moreover, since the signal processing apparatus 100 adds a signal obtained by processing the difference signal in which the specific sound has been suppressed, it can achieve high suppression performance without sacrificing suppression of the specific sound.

以上、添付図面を参照しながら本開示の好適な実施形態について詳細に説明したが、本開示の技術的範囲はかかる例に限定されない。本開示の技術分野における通常の知識を有する者であれば、特許請求の範囲に記載された技術的思想の範疇内において、各種の変更例または修正例に想到し得ることは明らかであり、これらについても、当然に本開示の技術的範囲に属するものと了解される。 The preferred embodiments of the present disclosure have been described in detail above with reference to the accompanying drawings, but the technical scope of the present disclosure is not limited to such examples. It is obvious that a person having ordinary knowledge in the technical field of the present disclosure can come up with various changes or modifications within the scope of the technical idea described in the claims. Of course, it is understood that it belongs to the technical scope of the present disclosure.

例えば、本実施形態に係る信号処理装置１００は多様な機器に搭載されることができる。例えば、ステレオコンポーネントシステム等の音源を再生する装置に本実施形態に係る信号処理装置１００が搭載される場合、ユーザは、楽曲のボーカルを抑制して再生することで、手軽にカラオケを楽しむことができる。また、カーナビゲーションシステム等の音声ガイドを再生する装置に本実施形態に係る信号処理装置１００が搭載される場合、信号処理装置１００は、音声ガイドが再生されるときに再生中の楽曲のボーカルを抑制してもよい。その場合、音声ガイドが楽曲のボーカルに掻き消されることが防止されるので、ユーザは、楽曲の再生を楽しみつつ音声ガイドを鮮明に聞き取ることが可能となる。 For example, the signal processing apparatus 100 according to the present embodiment can be mounted on various devices. For example, when the signal processing apparatus 100 according to the present embodiment is installed in an apparatus that reproduces a sound source, such as a stereo component system, the user can easily enjoy karaoke by playing music with the vocals suppressed. In addition, when the signal processing apparatus 100 according to the present embodiment is installed in an apparatus that reproduces voice guidance, such as a car navigation system, the signal processing apparatus 100 may suppress the vocals of the music being played while the voice guidance is reproduced. In that case, the voice guidance is prevented from being drowned out by the vocals of the music, so the user can hear the voice guidance clearly while still enjoying the music.

なお、本明細書において説明した各装置は、単独の装置として実現されてもよく、一部又は全部が別々の装置として実現されても良い。例えば、信号処理装置１００の一部又は全部の構成要素がネットワーク等で接続されたサーバ等の装置に備えられていても良く、サーバ単体で又はサーバと信号処理装置１００との協働で上述した処理が行われてもよい。 Each device described in this specification may be realized as a single device, or a part or all of it may be realized as separate devices. For example, some or all of the components of the signal processing apparatus 100 may be provided in a device such as a server connected via a network or the like, and the processing described above may be performed by the server alone or by the server in cooperation with the signal processing apparatus 100.

また、本明細書においてフローチャート及びシーケンス図を用いて説明した処理は、必ずしも図示された順序で実行されなくてもよい。いくつかの処理ステップは、並列的に実行されてもよい。また、追加的な処理ステップが採用されてもよく、一部の処理ステップが省略されてもよい。 Further, the processing described with reference to the flowcharts and sequence diagrams in this specification may not necessarily be executed in the order shown. Some processing steps may be performed in parallel. Further, additional processing steps may be employed, and some processing steps may be omitted.

また、本明細書に記載された効果は、あくまで説明的または例示的なものであって限定的ではない。つまり、本開示に係る技術は、上記の効果とともに、または上記の効果に代えて、本明細書の記載から当業者には明らかな他の効果を奏しうる。 Further, the effects described in the present specification are merely illustrative or exemplary and are not limited. That is, the technology according to the present disclosure can exhibit other effects that are apparent to those skilled in the art from the description of the present specification in addition to or instead of the above effects.

なお、以下のような構成も本開示の技術的範囲に属する。
（１）
入力された音響信号を形成する第１のチャネルの音響信号及び第２のチャネルの音響信号の差分信号を計算する差分信号計算部と、
前記差分信号計算部により計算された前記差分信号に前記差分信号を処理した信号を加算する処理部と、
を備える信号処理装置。
（２）
前記処理部は、前記差分信号を処理した信号として、前記差分信号を遅延させた遅延信号を生成する、前記（１）に記載の信号処理装置。
（３）
前記信号処理装置は、前記遅延信号の遅延量を設定する遅延量設定部をさらに備える、前記（２）に記載の信号処理装置。
（４）
前記遅延量設定部は、前記入力された音響信号の圧縮符号化情報を用いて前記遅延量を設定する、前記（３）に記載の信号処理装置。
（５）
前記遅延量設定部は、前記遅延量を前記圧縮符号化情報が示すフレーム幅以下に設定する、前記（４）に記載の信号処理装置。
（６）
前記処理部は、ＩＩＲ（Infinite impulse response）フィルタを用いて前記遅延信号を生成する、前記（２）〜（５）のいずれか一項に記載の信号処理装置。
（７）
前記処理部は、ＦＩＲ（Finite impulse response）フィルタを用いて前記遅延信号を生成する、前記（２）〜（５）のいずれか一項に記載の信号処理装置。
（８）
前記信号処理装置は、前記処理部による前記加算に係る重み付け係数を設定する係数設定部をさらに備える、前記（２）〜（６）のいずれか一項に記載の信号処理装置。
（９）
前記係数設定部は、前記入力された音響信号の圧縮符号化情報に基づいて前記重み付け係数を設定する、前記（８）に記載の信号処理装置。
（１０）
前記係数設定部は、前記入力された音響信号がモノラルに近い度合に基づいて、前記重み付け係数を設定する、前記（８）又は（９）に記載の信号処理装置。
（１１）
前記信号処理装置は、
前記差分信号を複数の帯域に分割する帯域分割部と、
前記帯域分割部により分割された複数の前記差分信号を合成する合成部と、
をさらに備え、
前記処理部は、前記帯域分割部により分割された複数の帯域のうち少なくともひとつの帯域において前記差分信号に前記差分信号を処理した信号を加算し、
前記合成部は、前記処理部により処理された帯域の前記差分信号と前記帯域分割部により分割された複数の帯域のうち前記処理部による処理がなされなかった帯域の前記差分信号とを合成する、前記（１）〜（１０）のいずれか一項に記載の信号処理装置。
（１２）
前記信号処理装置は、
前記差分信号のゲインレベルを設定するゲインレベル設定部と、
前記ゲインレベル設定部により設定されたゲインレベルを用いて前記差分信号のゲインを制御するゲイン制御部と、
をさらに備える、前記（１）〜（１１）のいずれか一項に記載の信号処理装置。
（１３）
前記ゲインレベル設定部は、前記入力された音響信号がモノラルに近い度合に基づいて前記ゲインレベルを設定する、前記（１２）に記載の信号処理装置。
（１４）
前記信号処理装置は、
前記差分信号を複数の帯域に分割する帯域分割部と、
前記帯域分割部により分割された複数の前記差分信号を合成する合成部と、
をさらに備え、
前記ゲイン制御部は、前記帯域分割部により分割された複数の帯域のうち少なくともひとつの帯域において、前記ゲインレベル設定部により設定された前記ゲインレベルを用いて前記差分信号のゲインを制御し、
前記合成部は、前記ゲイン制御部により制御された帯域の前記差分信号と前記帯域分割部により分割された複数の帯域のうち前記ゲイン制御部による制御がなされなかった帯域の前記差分信号とを合成する、前記（１２）又は（１３）に記載の信号処理装置。
（１５）
前記処理部は、前記帯域分割部により分割された複数の帯域のうち少なくともひとつの帯域において前記差分信号に前記差分信号を処理した信号を加算し、
前記ゲイン制御部は、前記ゲインレベル設定部により設定された前記ゲインレベルを用いて前記処理部により処理された信号のゲインを制御し、
前記合成部は、前記ゲイン制御部により制御された信号と前記帯域分割部により分割された複数の帯域のうち前記ゲイン制御部による制御がなされなかった帯域の前記差分信号とを合成する、前記（１４）に記載の信号処理装置。
（１６）
前記差分信号計算部は、時間領域で前記差分信号を計算する、前記（１）〜（１５）のいずれか一項に記載の信号処理装置。
（１７）
前記差分信号計算部は、周波数領域で前記差分信号を計算する、前記（１）〜（１５）のいずれか一項に記載の信号処理装置。
（１８）
入力された音響信号を形成する第１のチャネルの音響信号及び第２のチャネルの音響信号の差分信号を計算することと、
計算された前記差分信号に前記差分信号を処理した信号をプロセッサにより加算することと、
を含む信号処理方法。
（１９）
コンピュータを、
入力された音響信号を形成する第１のチャネルの音響信号及び第２のチャネルの音響信号の差分信号を計算する差分信号計算部と、
前記差分信号計算部により計算された前記差分信号に前記差分信号を処理した信号を加算する処理部と、
として機能させるためのプログラム。
The following configurations also belong to the technical scope of the present disclosure.
(1)
A signal processing apparatus including:
a difference signal calculation unit that calculates a difference signal between an acoustic signal of a first channel and an acoustic signal of a second channel that form an input acoustic signal; and
a processing unit that adds, to the difference signal calculated by the difference signal calculation unit, a signal obtained by processing the difference signal.
(2)
The signal processing apparatus according to (1), wherein the processing unit generates a delayed signal obtained by delaying the differential signal as a signal obtained by processing the differential signal.
(3)
The signal processing device according to (2), further including a delay amount setting unit that sets a delay amount of the delay signal.
(4)
The signal processing device according to (3), wherein the delay amount setting unit sets the delay amount using compression encoding information of the input acoustic signal.
(5)
The signal processing device according to (4), wherein the delay amount setting unit sets the delay amount to be equal to or less than a frame width indicated by the compression encoding information.
(6)
The signal processing apparatus according to any one of (2) to (5), wherein the processing unit generates the delayed signal using an IIR (Infinite impulse response) filter.
(7)
The signal processing apparatus according to any one of (2) to (5), wherein the processing unit generates the delayed signal using a FIR (Finite impulse response) filter.
(8)
The signal processing device according to any one of (2) to (6), further including a coefficient setting unit that sets a weighting coefficient related to the addition performed by the processing unit.
(9)
The signal processing apparatus according to (8), wherein the coefficient setting unit sets the weighting coefficient based on compression coding information of the input acoustic signal.
(10)
The signal processing device according to (8) or (9), wherein the coefficient setting unit sets the weighting coefficient based on a degree to which the input acoustic signal is close to monaural.
(11)
The signal processing apparatus according to any one of (1) to (10), further including:
a band dividing unit that divides the difference signal into a plurality of bands; and
a synthesis unit that synthesizes the plurality of difference signals divided by the band dividing unit,
in which the processing unit adds the signal obtained by processing the difference signal to the difference signal in at least one of the plurality of bands divided by the band dividing unit, and
the synthesis unit synthesizes the difference signal of the band processed by the processing unit with the difference signal of a band, among the plurality of bands divided by the band dividing unit, that was not processed by the processing unit.
(12)
The signal processing apparatus according to any one of (1) to (11), further including:
a gain level setting unit that sets a gain level of the difference signal; and
a gain control unit that controls the gain of the difference signal using the gain level set by the gain level setting unit.
(13)
The signal processing apparatus according to (12), wherein the gain level setting unit sets the gain level based on a degree to which the input acoustic signal is close to monaural.
(14)
The signal processing apparatus according to (12) or (13), further including:
a band dividing unit that divides the difference signal into a plurality of bands; and
a synthesis unit that synthesizes the plurality of difference signals divided by the band dividing unit,
in which the gain control unit controls the gain of the difference signal, using the gain level set by the gain level setting unit, in at least one of the plurality of bands divided by the band dividing unit, and
the synthesis unit synthesizes the difference signal of the band controlled by the gain control unit with the difference signal of a band, among the plurality of bands divided by the band dividing unit, that was not controlled by the gain control unit.
(15)
The signal processing apparatus according to (14), in which
the processing unit adds the signal obtained by processing the difference signal to the difference signal in at least one of the plurality of bands divided by the band dividing unit,
the gain control unit controls the gain of the signal processed by the processing unit using the gain level set by the gain level setting unit, and
the synthesis unit synthesizes the signal controlled by the gain control unit with the difference signal of a band, among the plurality of bands divided by the band dividing unit, that was not controlled by the gain control unit.
(16)
The signal processing device according to any one of (1) to (15), wherein the difference signal calculation unit calculates the difference signal in a time domain.
(17)
The signal processing device according to any one of (1) to (15), wherein the difference signal calculation unit calculates the difference signal in a frequency domain.
(18)
A signal processing method including:
calculating a difference signal between an acoustic signal of a first channel and an acoustic signal of a second channel that form an input acoustic signal; and
adding, by a processor, a signal obtained by processing the difference signal to the calculated difference signal.
(19)
A program for causing a computer to function as:
a difference signal calculation unit that calculates a difference signal between an acoustic signal of a first channel and an acoustic signal of a second channel that form an input acoustic signal; and
a processing unit that adds, to the difference signal calculated by the difference signal calculation unit, a signal obtained by processing the difference signal.

１００信号処理装置
１１０差分信号計算部
１２０ぼかし処理部
１２１遅延バッファＤＢ
１２２遅延器
１２３遅延量設定部
１２４係数設定部
１２５ぼかしレベル計算部
１３０帯域分割部
１３１合成部
１４０ゲインレベル設定部
１４１ゲイン制御部
１５０ＦＦＴ部
１５１ＩＦＦＴ部
DESCRIPTION OF SYMBOLS
100 Signal processing apparatus
110 Difference signal calculation unit
120 Blur processing unit
121 Delay buffer DB
122 Delay unit
123 Delay amount setting unit
124 Coefficient setting unit
125 Blur level calculation unit
130 Band dividing unit
131 Synthesis unit
140 Gain level setting unit
141 Gain control unit
150 FFT unit
151 IFFT unit

Claims

A signal processing apparatus comprising:
a difference signal calculation unit that calculates a difference signal between an acoustic signal of a first channel and an acoustic signal of a second channel that form an input acoustic signal;
a processing unit that adds, to the difference signal calculated by the difference signal calculation unit, a delayed signal obtained by delaying the difference signal; and
a delay amount setting unit that sets a delay amount of the delayed signal to be equal to or less than a frame width indicated by compression encoding information of the input acoustic signal.

The signal processing apparatus according to claim 1, wherein the processing unit generates the delayed signal using an IIR (infinite impulse response) filter.

The signal processing apparatus according to claim 1, wherein the processing unit generates the delayed signal using an FIR (finite impulse response) filter.

The signal processing apparatus according to any one of claims 1 to 3, further comprising a coefficient setting unit that sets a weighting coefficient for the addition performed by the processing unit.

The signal processing apparatus according to claim 4, wherein the coefficient setting unit sets the weighting coefficient based on a bit rate indicated by the compression encoding information of the input acoustic signal and/or a usage status of joint stereo encoding.

The signal processing apparatus according to claim 4, wherein the coefficient setting unit sets the weighting coefficient based on a degree to which the input acoustic signal is close to monaural.

The signal processing apparatus according to any one of claims 1 to 6, further comprising:
a band dividing unit that divides the differential signal into a plurality of bands; and
a synthesis unit that combines the plurality of differential signals divided by the band dividing unit,
wherein the processing unit adds the delayed signal to the differential signal in at least one of the plurality of bands divided by the band dividing unit, and
the synthesis unit combines, among the plurality of bands divided by the band dividing unit, the differential signal of the band processed by the processing unit with the differential signal of the band not processed by the processing unit.

The signal processing apparatus according to any one of claims 1 to 6, further comprising:
a gain level setting unit that sets a gain level of the differential signal; and
a gain control unit that controls, using the gain level set by the gain level setting unit, the gain of the signal obtained by adding the delayed signal to the differential signal.

The signal processing apparatus according to claim 8, wherein the gain level setting unit sets the gain level based on the degree to which the input acoustic signal is close to monaural.
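
One way to picture a gain level that depends on how close the input is to monaural is sketched below. The claims in this section do not define the closeness measure or the direction of the mapping, so both are assumptions: closeness is taken here as one minus the ratio of difference energy to total channel energy, and the gain is linearly interpolated between two caller-supplied endpoint values (`gain_stereo`, `gain_mono`), which are hypothetical parameters.

```python
def mono_closeness(left, right):
    """Assumed measure in [0, 1]: 1.0 when both channels are identical."""
    e_diff = sum((l - r) ** 2 for l, r in zip(left, right))
    e_total = sum(l * l for l in left) + sum(r * r for r in right)
    if e_total == 0.0:
        return 1.0  # silence is treated as monaural in this sketch
    return max(0.0, 1.0 - e_diff / e_total)


def gain_level(closeness, gain_stereo, gain_mono):
    """Interpolate between the gain for wide stereo and the gain for mono."""
    return gain_stereo + (gain_mono - gain_stereo) * closeness


def apply_gain(signal, gain):
    """Scale the processed differential signal by the chosen gain level."""
    return [gain * x for x in signal]
```

The same closeness value could equally drive the weighting coefficient of the delayed addition; whether the gain should rise or fall as the input approaches monaural is left to the caller here.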

The signal processing apparatus according to claim 8 or 9, further comprising:
a band dividing unit that divides the differential signal into a plurality of bands; and
a synthesis unit that combines the plurality of differential signals divided by the band dividing unit,
wherein the processing unit adds the delayed signal to the differential signal in at least one of the plurality of bands divided by the band dividing unit,
the gain control unit controls the gain of the signal processed by the processing unit using the gain level set by the gain level setting unit, and
the synthesis unit combines the signal whose gain has been controlled by the gain control unit with the differential signal of any band, among the plurality of bands divided by the band dividing unit, whose gain has not been controlled by the gain control unit.

The signal processing apparatus according to any one of claims 1 to 10, wherein the differential signal calculation unit calculates the differential signal in the time domain.

The signal processing apparatus according to any one of claims 1 to 10, wherein the differential signal calculation unit calculates the differential signal in the frequency domain.

A signal processing method including:
calculating a differential signal between the acoustic signal of a first channel and the acoustic signal of a second channel that form an input acoustic signal;
adding, by a processor, a delayed signal obtained by delaying the differential signal to the calculated differential signal; and
setting the delay amount of the delayed signal to be equal to or less than the frame width indicated by compression encoding information of the input acoustic signal.

A program for causing a computer to function as:
a differential signal calculation unit that calculates a differential signal between the acoustic signal of a first channel and the acoustic signal of a second channel that form an input acoustic signal;
a processing unit that adds, to the differential signal calculated by the differential signal calculation unit, a delayed signal obtained by delaying the differential signal; and
a delay amount setting unit that sets the delay amount of the delayed signal to be equal to or less than the frame width indicated by compression encoding information of the input acoustic signal.