JP2016148818A

JP2016148818A - Signal processor

Info

Publication number: JP2016148818A
Application number: JP2015026824A
Authority: JP
Inventors: 佳孝浦谷; Yoshitaka Uratani
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2015-02-13
Filing date: 2015-02-13
Publication date: 2016-08-18
Anticipated expiration: 2035-02-13
Also published as: JP6531418B2

Abstract

PROBLEM TO BE SOLVED: When a plurality of acoustic signals are input to a signal processor, to apply signal processing that gives an acoustic effect to an acoustic signal while keeping the balance of the overall sound emitted from the plurality of acoustic signals.

SOLUTION: The signal subject to the signal processing that gives the acoustic effect is set as a main acoustic signal, and the other signals are set as sub acoustic signals. The intensity of the acoustic effect is determined in accordance with a correlation value between a plurality of signals among the main acoustic signal and the one or more sub acoustic signals, and the signal processing is applied to the main acoustic signal. Consequently, even when dynamic range compression is subsequently applied to the main acoustic signal and the one or more sub acoustic signals and they are added together, the balance between the main acoustic signal and the one or more sub acoustic signals in the emitted sound is not lost.

SELECTED DRAWING: Figure 1

Description

The present invention relates to a signal processing device.

In recent years, various types of audio systems, such as car stereos and home karaoke devices, have come into widespread use. When the sound reproduced by such an audio system is too quiet to hear easily, simply raising the volume can cause the sound to break up, that is, to become distorted. Distortion is especially noticeable when the difference between the highest and lowest signal levels of the sound (that is, the dynamic range) is wide. Signal processing techniques that compress the dynamic range before volume adjustment have therefore become common, so that raising the volume does not cause distortion. Such techniques are disclosed in, for example, Patent Document 1, Patent Document 2, and Non-Patent Document 1.

In these techniques, dynamic range compression is performed for each frame obtained by dividing the acoustic signal into segments of a fixed time length. When such compression is performed before volume adjustment and the dynamic range changes gradually from frame to frame over time, the result sounds natural. However, if the dynamic range changes abruptly between frames, the result sounds unnatural. For this reason, in the signal processing technique disclosed in Patent Document 3, for example, when the dynamic range of the signal level changes abruptly, the time interval used to divide the acoustic signal into frames is shortened so that the sound remains natural to the ear.

Furthermore, Patent Document 4 discloses a signal processing technique in which, while sound is being emitted from a car stereo inside a vehicle, the volume of the car stereo is lowered when a siren sound, whose acoustic signal has been stored in advance, is detected around the vehicle.

Patent Document 1: Japanese Patent No. 3123052
Patent Document 2: Japanese Patent No. 2966846
Patent Document 3: Japanese Unexamined Patent Application Publication No. 2002-232247
Patent Document 4: Japanese Unexamined Patent Application Publication No. H11-136059

Non-Patent Document 1: G. W. McNally, "Dynamic Range Control of Digital Audio Signals", Internet <URL: http://www.cs.tut.fi/~sgn14006/PDF/L05-dynamics.pdf>

Applying the signal processing techniques disclosed in Patent Document 1, Patent Document 2, and Non-Patent Document 1 to a karaoke device can cause problems, for the following reason. A karaoke device receives an acoustic signal representing a vocal sound and an acoustic signal that shares a time axis with it (that is, an acoustic signal representing a sound reproduced in synchronization with the vocal sound, for example an accompaniment sound), and it adds these acoustic signals and emits the result. If the signal processing technique disclosed in Patent Document 1 or the like is applied to the karaoke device and dynamic range compression is performed separately and independently on the vocal and accompaniment acoustic signals, the balance of the whole song, which consists of the vocal sound and the accompaniment sound, may be lost. For example, the signal level of the accompaniment signal may not be lowered sufficiently even though the signal level of the vocal signal is at a peak.

Applying the signal processing disclosed in Patent Document 3 to the vocal and accompaniment acoustic signals of a karaoke device makes it possible to maintain a natural listening impression even when the dynamic range of the signal level of the vocal or accompaniment signal changes abruptly. However, even with this technique applied, the balance of the whole song, which consists of the vocal sound and the accompaniment sound, cannot be maintained.

Likewise, applying the signal processing technique of Patent Document 4 to a karaoke device does not maintain the balance of the whole song consisting of the vocal sound and the accompaniment sound. In the first place, the technique of Patent Document 4 was not designed with the listening quality of music in mind. Note that the balance between the vocal sound and the accompaniment sound emitted from a karaoke device is lost not only when signal processing that compresses the dynamic range is applied to the vocal and accompaniment acoustic signals, but also when signal processing that adds a reverb effect is applied.

The present invention has been made in view of the problems described above. Its object is to provide a technique that, when a plurality of acoustic signals are input to a signal processing device, makes it possible to compress the dynamic range of an acoustic signal or to add a reverb effect to an acoustic signal while maintaining the balance of the overall sound emitted from the plurality of acoustic signals.

To solve the above problems, the present invention provides a signal processing device comprising: a cross-correlation detection unit that calculates a correlation value between a plurality of signals among a main acoustic signal, which is the target of signal processing that gives an acoustic effect, and one or more sub acoustic signals; and a signal processing unit that determines the intensity of the acoustic effect according to the correlation value calculated by the cross-correlation detection unit.

According to the present invention, the intensity of the acoustic effect is determined according to the correlation value between a plurality of signals among the main acoustic signal and the one or more sub acoustic signals, and the signal processing is applied to the main acoustic signal. Therefore, even if the main acoustic signal and the one or more sub acoustic signals are subsequently subjected to dynamic range compression and added together, the balance between the main acoustic signal and the one or more sub acoustic signals in the emitted sound is not lost.

In a more preferred aspect, the signal processing device has a weighting unit that weights the audible range of at least one of the plurality of signals and supplies the weighted signal to the cross-correlation detection unit. Such weighting increases the contribution of the audible-range components of at least one of the main acoustic signal and the one or more sub acoustic signals to the correlation value between them. Consequently, even when the correlation between the main acoustic signal and the one or more sub acoustic signals is high in the audible range but the overall correlation is low because of low correlation in other bands, dynamic range compression and the like can be performed without upsetting the perceived balance of the signals.

In a more preferred aspect, the cross-correlation detection unit calculates a plurality of correlation values between one of the plurality of signals and one or more of the plurality of signals shifted in time relative to that signal, and obtains a single correlation value from them. Even when one of the signals is offset in time from the other signal or signals, the effect of the offset on the correlation value is mitigated, so the balance between the main acoustic signal and the one or more sub acoustic signals in the emitted sound is not lost.

In a more preferred aspect, the signal processing device has a band division unit that divides at least one of the main acoustic signal and the one or more sub acoustic signals into bands, and the cross-correlation detection unit calculates the correlation value using at least one of the band signals produced by the band division unit. This aspect makes it possible to apply signal processing to the main acoustic signal finely, band by band, according to its correlation with the sub acoustic signal.

In a more preferred aspect, the main acoustic signal is a composite signal that contains at least one of the one or more sub acoustic signals. This aspect provides an effect similar to that obtained by first applying sound source separation, which separates the sub acoustic signal from the main acoustic signal, and then applying the above signal processing.

A block diagram of the signal processing device 101 according to the first embodiment of the present invention.
A graph showing the relationship between the normalized value x and the multiplier y in the signal processing device 101.
A block diagram of the signal processing device 102 according to the second embodiment of the present invention.
A graph showing the loudness curve referred to by the weighting unit 50 of the signal processing device 102.
A block diagram of the signal processing device 103 according to the third embodiment of the present invention.
A block diagram of the signal processing device 104 according to the fourth embodiment of the present invention.
A graph showing the relationship between the normalized value x and the reverb mix value g_mix in the fourth embodiment.
A block diagram of the signal processing device 106 according to the sixth embodiment of the present invention.
A block diagram of the signal processing device 108 according to the eighth embodiment of the present invention.
A block diagram of the signal processing device 110 according to the tenth embodiment of the present invention.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.
<First Embodiment>
FIG. 1 is a block diagram of the signal processing device 101 according to the first embodiment of the present invention. The signal processing device 101 is a DSP (Digital Signal Processor) and receives two kinds of acoustic signals as input. Each of these is a digital signal representing the time waveform of a sound. In this embodiment, of the digital signals input to the signal processing device 101, the signal that the signal processing device 101 processes is called the main acoustic signal, and a signal that shares a time axis with the main acoustic signal but is not processed by the signal processing device 101 is called a sub acoustic signal (in the other embodiments as well, the signal to be processed is called the main acoustic signal). The signal processing device 101 calculates the correlation between the main acoustic signal and the sub acoustic signal, amplifies the signal level of the main acoustic signal based on the result, and outputs the result as an output signal. Signal processing such as dynamic range compression and volume amplification is subsequently applied to the output signal, but well-known techniques can be used for this. Illustration and description of that processing are therefore omitted in this embodiment.

The signal processing device 101 has a time-frequency conversion unit 10, a cross-correlation detection unit 20, a parameter calculation unit 30, and a multiplication unit 40. The time-frequency conversion unit 10 and the other units are part of the functions realized by a microprogram executed by the signal processing device 101. Alternatively, instead of using a DSP as the signal processing device 101, the time-frequency conversion unit 10 and the other units may each be built as electronic circuits, and the signal processing device 101 may be realized by combining these circuits.

The time-frequency conversion unit 10 receives two digital signals, the main acoustic signal and the sub acoustic signal, divides each signal into frames of a fixed time length (for example, into T frames), and performs a time-frequency transform, that is, a Fourier transform, on each frame. The time-frequency conversion unit 10 outputs the resulting Fourier transform data to the cross-correlation detection unit 20.
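The framing and per-frame Fourier transform described above can be sketched as follows. This is a minimal illustration, not the device's actual microprogram: the function names, the non-overlapping framing, the handling of a partial last frame, and the use of a naive DFT are all assumptions made for the example.

```python
import cmath


def split_into_frames(samples, frame_len):
    """Cut a sample sequence into consecutive, non-overlapping frames.

    Trailing samples that do not fill a whole frame are dropped. This is an
    assumption; the source does not say how a partial last frame is handled.
    """
    count = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(count)]


def dft(frame):
    """Naive O(N^2) discrete Fourier transform of one frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def time_frequency_transform(samples, frame_len):
    """Return one list of complex DFT bins per frame."""
    return [dft(frame) for frame in split_into_frames(samples, frame_len)]
```

A real implementation would normally use an FFT and possibly windowed, overlapping frames; the naive transform is used here only to keep the sketch dependency-free.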

The cross-correlation detection unit 20 detects the correlation between the main acoustic signal and the sub acoustic signal input from the time-frequency conversion unit 10. Specifically, using the cross-correlation function shown in equation (1) below, the cross-correlation detection unit 20 calculates the correlation value R_ms between the two signals from the Fourier transform data of the main acoustic signal and the sub acoustic signal.

[Equation (1)]

C_ms is calculated using equation (2) below.

[Equation (2)]

C_mm and C_ss denote the autocorrelations of the main acoustic signal and the sub acoustic signal, respectively, and are calculated using equations (3) and (4) below.

[Equations (3) and (4)]

n is a frame number that uniquely identifies each frame, with 1 ≤ n ≤ T. f_m(n) and f_s(n) are the Fourier transform data of the n-th frame of the main acoustic signal and the sub acoustic signal, respectively. The correlation value R_ms satisfies −1 ≤ R_ms ≤ 1. When R_ms is 1, the main acoustic signal and the sub acoustic signal are perfectly correlated, and the smaller R_ms becomes, the lower the correlation between them. The cross-correlation detection unit 20 outputs the correlation value R_ms to the parameter calculation unit 30.
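Equations (1) to (4) are not reproduced in this text. One form consistent with the surrounding definitions is the standard normalized cross-correlation, R_ms = C_ms / sqrt(C_mm · C_ss), with C_ms = Σ f_m(n) f_s(n), C_mm = Σ f_m(n)^2 and C_ss = Σ f_s(n)^2, each sum running over n = 1 to T. The sketch below implements that assumed form. It further assumes that each frame is represented by a single real number, since the text does not say how the Fourier transform data of a frame is reduced to the value entering the sum.

```python
import math


def normalized_correlation(f_m, f_s):
    """Assumed form of R_ms: C_ms / sqrt(C_mm * C_ss), summed over frames.

    f_m and f_s are equal-length sequences holding one real value per frame
    (a hypothetical per-frame feature, not confirmed by the source).
    """
    if len(f_m) != len(f_s):
        raise ValueError("both signals must have the same number of frames")
    c_ms = sum(a * b for a, b in zip(f_m, f_s))  # cross term, assumed eq. (2)
    c_mm = sum(a * a for a in f_m)               # assumed eq. (3)
    c_ss = sum(b * b for b in f_s)               # assumed eq. (4)
    denom = math.sqrt(c_mm * c_ss)
    if denom == 0.0:
        # Assumption: an all-zero (silent) input is treated as uncorrelated.
        return 0.0
    return c_ms / denom
```

By the Cauchy-Schwarz inequality this quantity always lies between −1 and 1, which matches the range stated for R_ms.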

The parameter calculation unit 30 calculates an amplification factor y from the correlation value R_ms input from the cross-correlation detection unit 20 and outputs it to the multiplication unit 40. The amplification factor y indicates the strength of the amplification applied to the signal level of the main acoustic signal. In more detail, the parameter calculation unit 30 first calculates a normalized value x using equation (5) below.

[Equation (5)]

Since the correlation value R_ms satisfies −1 ≤ R_ms ≤ 1, the normalized value x satisfies 0 ≤ x ≤ 1. FIG. 2 is a graph showing the relationship between the normalized value x and the amplification factor y, where th is a threshold. If the normalized value x is smaller than the threshold th, the parameter calculation unit 30 sets the amplification factor y to 0 dB. If the normalized value x is larger than the threshold th, the parameter calculation unit 30 decreases the amplification factor y linearly as x approaches 1. In FIG. 2 the amplification factor y is −10 dB when the normalized value x is 1, but this value is not limited to −10 dB, and a suitable value may be determined by experiment. Furthermore, although FIG. 2 illustrates the case where y decreases linearly once x exceeds the threshold th, the functional relationship between x and y may be nonlinear, and a suitable form may be chosen by experiment.
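The mapping from correlation value to amplification factor can be sketched as below. Equation (5) is not reproduced in this text; the linear map x = (R_ms + 1) / 2 is an assumption chosen because it sends −1 to 0 and 1 to 1, as the text requires. The threshold value 0.5 and the function names are likewise illustrative; only the −10 dB end point comes from the description of FIG. 2.

```python
def normalize_correlation(r_ms):
    """Assumed form of equation (5): map R_ms in [-1, 1] to x in [0, 1]."""
    return (r_ms + 1.0) / 2.0


def gain_db(x, th=0.5, min_gain_db=-10.0):
    """Piecewise-linear curve in the style of FIG. 2.

    Returns 0 dB up to the threshold th, then falls linearly to min_gain_db
    at x = 1. th = 0.5 is a hypothetical value; th must be below 1.
    """
    if x <= th:
        return 0.0
    return min_gain_db * (x - th) / (1.0 - th)
```

With these assumptions, perfectly correlated signals (R_ms = 1) give x = 1 and y = −10 dB, while R_ms = −1 gives x = 0 and y = 0 dB, which agrees with the two cases discussed for the first embodiment.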

The multiplication unit 40 amplifies the signal level of the main acoustic signal according to the amplification factor y input from the parameter calculation unit 30 and outputs the amplified main acoustic signal as the output signal. In other words, the multiplication unit 40 functions as a signal processing unit that applies amplification processing to the main acoustic signal.
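As an illustration of the multiplication step, the sketch below converts an amplification factor given in dB to a linear multiplier and applies it to time-domain samples. It assumes that the dB figure denotes an amplitude ratio (20·log10); the text does not state whether amplitude or power is meant, and the function names are hypothetical.

```python
def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor (assumes 20*log10)."""
    return 10.0 ** (gain_db / 20.0)


def apply_gain(samples, gain_db):
    """Multiply every sample of the main acoustic signal by the linear gain."""
    factor = db_to_linear(gain_db)
    return [s * factor for s in samples]
```

Under this assumption, the −10 dB end point of FIG. 2 corresponds to multiplying the samples by roughly 0.316, and 0 dB leaves them unchanged.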

When the main acoustic signal and the sub acoustic signal are perfectly correlated, that is, when the correlation value R_ms is 1, equation (5) gives a normalized value x of 1. In this case, FIG. 2 gives an amplification factor y of −10 dB, and the signal processing device 101 outputs, as the output signal, the main acoustic signal at a lower signal level than when it was input. Because the two signals are perfectly correlated, when the main acoustic signal output from the signal processing device 101 undergoes signal processing such as dynamic range compression and is added to the sub acoustic signal, the reduction in the signal level of the main acoustic signal is compensated by the sub acoustic signal. Therefore, even after the main acoustic signal is output from the signal processing device 101, added to the sub acoustic signal, and finally emitted as sound, the balance between the main acoustic signal and the sub acoustic signal is not lost.

When the correlation between the main acoustic signal and the sub acoustic signal becomes low, that is, when the correlation value R_ms approaches −1, equation (5) gives a normalized value x close to 0. In this case, FIG. 2 gives an amplification factor y of 0 dB, and the signal processing device 101 outputs the input main acoustic signal as the output signal without changing its signal level. Because the correlation between the two signals is low, the peaks (or dips) of the main acoustic signal do not collide with the peaks (or dips) of the sub acoustic signal. Therefore, even when the main acoustic signal output from the signal processing device 101 undergoes signal processing such as dynamic range compression, is added to the sub acoustic signal, and is finally emitted as sound, the balance between the main acoustic signal and the sub acoustic signal is not lost.

Using the signal processing device 101 in this way prevents the loss of balance between the main acoustic signal and the sub acoustic signal that would result from simply adding them.

For example, consider applying the signal processing device 101 to a karaoke device. The karaoke device receives an acoustic signal representing a vocal sound and an acoustic signal representing an accompaniment sound. Below, a drum sound is taken as the accompaniment: the acoustic signal of the vocal sound, whose frequency content is concentrated in the mid range (200 to 1000 Hz), is the main acoustic signal, and the acoustic signal of the drum sound, whose frequency content is concentrated in the high range (1000 Hz and above), is the sub acoustic signal. The prior art does not take the correlation between the vocal sound and the drum sound into account, so when processing such as dynamic range compression is applied to each of the vocal and accompaniment signals and the results are added, the peaks (or dips) of the vocal and drum signals can collide, and the balance between the vocal sound and the drum sound in the finally emitted sound may be lost. With the signal processing device 101, by contrast, when the correlation between the vocal sound and the drum sound is high, the signal level of the vocal sound is lowered before its dynamic range is compressed, and the result is then added to the drum sound, whose dynamic range has likewise been compressed. The balance between the vocal sound and the drum sound in the finally emitted sound is therefore not lost.

One conceivable way to keep the main acoustic signal and the sub acoustic signal balanced in the finally emitted sound is to subtract from the main acoustic signal the signal in the frequency band where its peaks (or dips) collide with those of the sub acoustic signal, then apply signal processing such as dynamic range compression to the main acoustic signal, add it to the sub acoustic signal, and emit the result. With this method, however, the spectral subtraction that removes the colliding frequency band from the main acoustic signal can introduce artificial noise, which then appears in the finally emitted sound. Because the signal processing device 101 does not use spectral subtraction, no such artificial noise arises, and no noise is added to the finally emitted sound.

<Second Embodiment>
FIG. 3 is a block diagram of the signal processing device 102 according to the second embodiment of the present invention. In FIG. 3, components identical to those in FIG. 1 are given the same reference numerals. As a comparison of FIG. 1 and FIG. 3 makes clear, the signal processing device 102 differs from the signal processing device 101 in that a weighting unit 50 is provided between the time-frequency conversion unit 10 and the cross-correlation detection unit 20. The following description focuses on the weighting unit 50, which is the difference from the first embodiment.

The weighting unit 50 weights the Fourier transform data of the main acoustic signal and the sub acoustic signal input from the time-frequency conversion unit 10 on the basis of a loudness curve. Specifically, data representing the 40 phon curve of the loudness curves shown in FIG. 4 is stored in the weighting unit 50 in advance. From this data, the weighting unit 50 refers to the portion covering 20 Hz to 10 kHz, the frequency band that is easy for humans to hear (hereinafter, the audible range), normalizes it so that the maximum sound pressure of the referenced waveform is 1 and the minimum sound pressure is 0, and multiplies the Fourier transform data of the main acoustic signal and the sub acoustic signal by the resulting weights. This weighted multiplication makes the Fourier transform data of sounds in the audible range larger and that of sounds in hard-to-hear frequency bands smaller. The weighting unit 50 outputs the weighted Fourier transform data of the main acoustic signal and the sub acoustic signal to the cross-correlation detection unit 20.
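The normalization and weighted multiplication described above can be sketched as follows. The helper names are hypothetical, and the sketch assumes the stored curve has already been sampled at the same frequency bins as the Fourier transform data. It implements only the min-max normalization as literally described (maximum mapped to 1, minimum to 0); the text does not say whether the curve is inverted or otherwise preprocessed first, so that choice is left to the caller.

```python
def minmax_weights(curve_values):
    """Normalize curve samples so the maximum maps to 1 and the minimum to 0."""
    lo, hi = min(curve_values), max(curve_values)
    if hi == lo:
        # Assumption: a flat curve means no frequency is emphasized.
        return [1.0] * len(curve_values)
    return [(v - lo) / (hi - lo) for v in curve_values]


def apply_weights(spectrum, weights):
    """Multiply each Fourier bin by the weight for its frequency."""
    if len(spectrum) != len(weights):
        raise ValueError("one weight per frequency bin is required")
    return [c * w for c, w in zip(spectrum, weights)]
```

The weighted spectra would then be passed to the correlation calculation in place of the unweighted Fourier transform data.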

From the Fourier transform data of the main acoustic signal and the sub acoustic signal input from the weighting unit 50, the cross-correlation detection unit 20 calculates the correlation value R_ms based on equations (1) to (4) and outputs it to the parameter calculation unit 30. The subsequent processing is the same as in the first embodiment, so its description is omitted.

In the signal processing device 102, the weighting unit 50 performs weighting based on the loudness curve. As a result, not only is the balance between the sounds emitted from the main acoustic signal and the sub acoustic signal maintained in the finally emitted sound, but the contribution of audible-range sound to the correlation value R_ms is also increased. For example, suppose the signal processing device 102 is applied to a karaoke device, with the acoustic signal of the vocal sound as the main acoustic signal and the acoustic signal of the drum sound as the sub acoustic signal. In this case, even when the correlation between the vocal sound and the drum sound over the full frequency range is low but the correlation between the two signals in the audible range is high, the balance between the vocal sound and the drum sound in the finally emitted sound is not lost. Of course, the weighting unit 50 may also be applied to only one of the main acoustic signal and the sub acoustic signal.

＜第３実施形態＞
図５は、この発明の第３実施形態である信号処理装置１０３のブロック図である。図５では、図１におけるものと同一の構成要素には同一の符号が付されている。図５と図１を比較すれば明らかなように、信号処理装置１０３と信号処理装置１０１の違いは、相互相関検出部２０に換えて、相互相関検出部２１と平均算出部９０を設けた点である。以下では、第１実施形態との相違点である相互相関検出部２１と平均算出部９０を中心に説明する。 <Third Embodiment>
FIG. 5 is a block diagram of a signal processing device 103 according to the third embodiment of the present invention. In FIG. 5, the same components as those in FIG. 1 are denoted by the same reference numerals. As is apparent from a comparison of FIG. 5 with FIG. 1, the signal processing device 103 differs from the signal processing device 101 in that a cross-correlation detection unit 21 and an average calculation unit 90 are provided in place of the cross-correlation detection unit 20. The following description focuses on the cross-correlation detection unit 21 and the average calculation unit 90, which are the differences from the first embodiment.

相互相関検出部２１は、次式（６）に基づいて、時間周波数変換部１０から入力されたメイン音響信号とメイン音響信号以外の他の信号であるサブ音響信号とのフーリエ変換データから相関値Ｒ_ｍｓ（τ）を算出する。なお相関値Ｒ_ｍｓ（τ）は、メイン音響信号とメイン音響信号からτだけ遅れたサブ音響信号との相関を表す値である。

τは整数値であり、Ｃ_ｍｓ（τ）は次式（７）を用いて算出される。

ｎはフレーム番号であり、１≦ｎ≦Ｔである。すなわち、ｆ_ｍ（ｎ）はメイン音響信号のｎ番目のフレームのフーリエ変換データであり、ｆ_ｓ（ｎ＋τ）はサブ音響信号のｎ＋τ番目のフレームのフーリエ変換データである。相関値Ｒ_ｍｓ（τ）は−１≦Ｒ_ｍｓ（τ）≦１を満たし、相関値Ｒ_ｍｓ（τ）が１の時にはメイン音響信号とメイン音響信号からτだけ遅れたサブ音響信号とは完全相関であり、相関値Ｒ_ｍｓ（τ）が小さくなるほどメイン音響信号とサブ音響信号の相関は低下する。相互相関検出部２１は、τ＝０、１、２の場合の相関値Ｒ_ｍｓ（τ）を平均算出部９０に出力する。なお、時間周波数変換部１０と相互相関検出部２１の間に遅延器を設け、遅延器は、時間周波数変換部１０からメイン音響信号とサブ音響信号のフーリエ変換データを入力され、相互相関検出部にメイン音響信号とメイン音響信号からτだけ遅れたサブ音響信号とのフーリエ変換データを出力する態様であってもよい。この態様では、相互相関検出部２１は、遅延器から入力されたメイン音響信号とサブ音響信号から相関値Ｒ_ｍｓ（τ）を算出し平均算出部９０に出力するだけで、サブ音響信号をメイン音響信号からτだけ遅れたようにする処理は行わない。 Based on the following equation (6), the cross-correlation detection unit 21 calculates a correlation value R_ms(τ) from the Fourier transform data of the main acoustic signal input from the time-frequency conversion unit 10 and of a sub acoustic signal, which is a signal other than the main acoustic signal. The correlation value R_ms(τ) represents the correlation between the main acoustic signal and the sub acoustic signal delayed by τ relative to the main acoustic signal.

τ is an integer value, and C_ms(τ) is calculated using the following equation (7).

n is a frame number satisfying 1 ≦ n ≦ T. That is, f_m(n) is the Fourier transform data of the n-th frame of the main acoustic signal, and f_s(n + τ) is the Fourier transform data of the (n + τ)-th frame of the sub acoustic signal. The correlation value R_ms(τ) satisfies −1 ≦ R_ms(τ) ≦ 1. When R_ms(τ) is 1, the main acoustic signal and the sub acoustic signal delayed by τ relative to it are perfectly correlated, and the smaller R_ms(τ) becomes, the lower the correlation between the main acoustic signal and the sub acoustic signal. The cross-correlation detection unit 21 outputs the correlation values R_ms(τ) for τ = 0, 1, and 2 to the average calculation unit 90. Alternatively, a delay unit may be provided between the time-frequency conversion unit 10 and the cross-correlation detection unit 21; the delay unit receives the Fourier transform data of the main acoustic signal and the sub acoustic signal from the time-frequency conversion unit 10 and outputs to the cross-correlation detection unit 21 the Fourier transform data of the main acoustic signal and of the sub acoustic signal delayed by τ relative to the main acoustic signal. In this configuration, the cross-correlation detection unit 21 only calculates the correlation value R_ms(τ) from the main acoustic signal and the sub acoustic signal input from the delay unit and outputs it to the average calculation unit 90; it does not itself perform the processing of delaying the sub acoustic signal by τ relative to the main acoustic signal.

平均算出部９０は、相互相関検出部２１から入力された相関値Ｒ_ｍｓ（０）、Ｒ_ｍｓ（１）、Ｒ_ｍｓ（２）の相加平均を算出し、平均相関値Ｒ_ｍｓとしてパラメータ算出部３０に出力する。 The average calculation unit 90 calculates the arithmetic mean of the correlation values R_ms(0), R_ms(1), and R_ms(2) input from the cross-correlation detection unit 21, and outputs it to the parameter calculation unit 30 as the average correlation value R_ms.
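The lagged correlation and its average can be illustrated with the sketch below. Equations (6) and (7) are not reproduced in this text, so the sketch substitutes a standard Pearson-style normalized correlation, which also lies between −1 and 1 but is not claimed to be the patent's formula. It further assumes, for simplicity, that each frame is summarized by a single real number; the function names are hypothetical.

```python
import math


def lagged_correlation(main, sub, tau):
    """Pearson-style correlation between main[n] and sub[n + tau]."""
    pairs = list(zip(main, sub[tau:]))  # keeps only frames where n + tau exists
    if len(pairs) < 2:
        raise ValueError("not enough overlapping frames")
    xs, ys = zip(*pairs)
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant signal is treated as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def average_correlation(main, sub, lags=(0, 1, 2)):
    """Arithmetic mean of the lagged correlations, as done for tau = 0, 1, 2."""
    values = [lagged_correlation(main, sub, tau) for tau in lags]
    return sum(values) / len(values)


# The sub signal is the main signal delayed by one frame.
main = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
sub = [-1.0, 0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0]
r_avg = average_correlation(main, sub)
```

In this example the correlation at τ = 0 is zero although the two signals differ only by a one-frame delay, while the average over τ = 0, 1, 2 is clearly positive, which is the situation the averaging is meant to address.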

パラメータ算出部３０は、平均算出部９０から入力された平均相関値Ｒ_ｍｓから第１実施形態と同様に増幅率ｙを算出する。その後は第１実施形態と同様であるので説明を省略する。 The parameter calculation unit 30 calculates the amplification factor y from the average correlation value R _ms input from the average calculation unit 90 as in the first embodiment. The subsequent steps are the same as those in the first embodiment, and the description thereof is omitted.

本実施形態では、ｆ_ｍ（ｎ）とｆ_ｓ（ｎ＋τ）を用いて算出したＲ_ｍｓ（τ）の相加平均に応じた増幅率でメイン音響信号が増幅されるので、メイン音響信号とサブ音響信号の時間のずれを救うことができる。より詳細に説明すると、例えばカラオケ装置に第１実施形態の信号処理装置１０１を適用した場合、本来はボーカル音と伴奏音の相関は高いにも関わらず、伴奏音とボーカル音に時間のずれがあると、両者の相関は低いと判断されてしまう。そのため、最終的に放音される音のボーカル音と伴奏音のバランスが崩れてしまう恐れがある。一方、カラオケ装置に本実施形態の信号処理装置１０３を適用すると、伴奏音とボーカル音の時間のずれの影響が緩和され、最終的に放音される音のボーカル音と伴奏音のバランスが崩れることがない。つまり本実施形態によれば、メイン音響信号とサブ音響信号に時間のずれがあっても、そのずれによる影響を緩和しつつ、メイン音響信号に対する信号処理を行うことが可能となる。 In the present embodiment, the main acoustic signal is amplified with an amplification factor corresponding to the arithmetic mean of the values R_ms(τ) calculated using f_m(n) and f_s(n + τ), so a time lag between the main acoustic signal and the sub acoustic signal can be tolerated. More specifically, when the signal processing device 101 of the first embodiment is applied to, for example, a karaoke device, a time lag between the accompaniment sound and the vocal sound causes the correlation between the two to be judged low even though the vocal sound and the accompaniment sound are inherently highly correlated. As a result, the balance between the vocal sound and the accompaniment sound in the finally emitted sound may be lost. When the signal processing device 103 of the present embodiment is applied to a karaoke device, on the other hand, the influence of the time lag between the accompaniment sound and the vocal sound is reduced, and the balance between the vocal sound and the accompaniment sound in the finally emitted sound is not lost. In other words, according to the present embodiment, even if there is a time lag between the main acoustic signal and the sub acoustic signal, signal processing can be applied to the main acoustic signal while reducing the influence of that lag.

本実施形態では、τを０、１、２とした相関値Ｒ_ｍｓ（０）、Ｒ_ｍｓ（１）、Ｒ_ｍｓ（２）の３つの値の相加平均から平均相関値Ｒ_ｍｓを算出していたが、τは０、１、２に限られず、例えばτを０、３、６のように他の整数値にして平均相関値Ｒ_ｍｓを算出してもよい。さらに、平均相関値Ｒ_ｍｓの算出に用いる相関値Ｒ_ｍｓ（τ）は３つの値に限られることはなく、相関値Ｒ_ｍｓ（τ）の数を増やしてもよい。例えば、τを０、１、２、３とした相関値Ｒ_ｍｓ（０）、Ｒ_ｍｓ（１）、Ｒ_ｍｓ（２）、Ｒ_ｍｓ（３）の４つの値の相加平均から平均相関値Ｒ_ｍｓを算出してもよい。その上、本実施形態では、平均算出部９０が平均相関値Ｒ_ｍｓの算出に相関値Ｒ_ｍｓ（τ）の相加平均を用いたが、一番大きい値を平均相関値Ｒ_ｍｓとしてもよい。例えば、τを０、１、２とした相関値Ｒ_ｍｓ（０）、Ｒ_ｍｓ（１）、Ｒ_ｍｓ（２）の中でＲ_ｍｓ（１）の値が一番大きいと、Ｒ_ｍｓ（１）を平均相関値Ｒ_ｍｓとして算出してもよい。もちろん平均相関値Ｒ_ｍｓの算出方法はこれら以外でもよく、例えば相乗平均や重み付け平均でもよい。信号処理装置１０３に求められる精度に応じて平均相関値Ｒ_ｍｓの好適な算出方法を決定すればよい。 In the present embodiment, the average correlation value R_ms is calculated from the arithmetic mean of the three correlation values R_ms(0), R_ms(1), and R_ms(2) obtained with τ set to 0, 1, and 2. However, τ is not limited to 0, 1, and 2; the average correlation value R_ms may be calculated with other integer values of τ, such as 0, 3, and 6. Furthermore, the number of correlation values R_ms(τ) used to calculate the average correlation value R_ms is not limited to three and may be increased. For example, the average correlation value R_ms may be calculated from the arithmetic mean of the four correlation values R_ms(0), R_ms(1), R_ms(2), and R_ms(3) obtained with τ set to 0, 1, 2, and 3. Moreover, although the average calculation unit 90 in the present embodiment uses the arithmetic mean of the correlation values R_ms(τ), the largest value may instead be used as the average correlation value R_ms. For example, if R_ms(1) is the largest of R_ms(0), R_ms(1), and R_ms(2), then R_ms(1) may be used as the average correlation value R_ms. Other calculation methods, such as a geometric mean or a weighted mean, may of course also be used. A suitable method for calculating the average correlation value R_ms may be chosen according to the accuracy required of the signal processing device 103.
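The alternative ways of reducing several correlation values to a single value can be sketched as a selectable aggregation step. The function name, the method labels, and the weights are hypothetical. The geometric mean is left out of the sketch because it is not defined for negative correlation values without further assumptions.

```python
def aggregate_correlations(values, method="mean", weights=None):
    """Reduce several correlation values R_ms(tau) to one representative value."""
    if not values:
        raise ValueError("at least one correlation value is required")
    if method == "mean":
        return sum(values) / len(values)
    if method == "max":
        return max(values)
    if method == "weighted":
        if weights is None or len(weights) != len(values):
            raise ValueError("weights must match the number of values")
        total = sum(weights)
        if total == 0:
            raise ValueError("weights must not sum to zero")
        return sum(w * v for w, v in zip(weights, values)) / total
    raise ValueError("unknown method: " + method)


# Hypothetical correlation values for tau = 0, 1, 2.
values = [0.2, 0.8, 0.5]
largest = aggregate_correlations(values, "max")
```

Which method suits a given device is left, as the text says, to the accuracy required.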

＜第４実施形態＞
図６は、この発明の第４実施形態である信号処理装置１０４のブロック図である。図６では、図１におけるものと同一の構成要素には同一の符号が付されている。図６と図１を比較すれば明らかなように、信号処理装置１０４と信号処理装置１０１との違いは、パラメータ算出部３０と乗算部４０に換えて、パラメータ算出部３１と、乗算部４１と、残響信号生成部６０と、加算部７０とを設けた点である。以下では、第１実施形態との相違点であるパラメータ算出部３１と、乗算部４１と、残響信号生成部６０と、加算部７０とを中心に説明する。乗算部４１、残響信号生成部６０および加算部７０は、メイン音響信号にリバーブ効果を付与する信号処理部として機能する。 <Fourth embodiment>
FIG. 6 is a block diagram of a signal processing device 104 according to the fourth embodiment of the present invention. In FIG. 6, the same components as those in FIG. 1 are denoted by the same reference numerals. As is apparent from a comparison of FIG. 6 with FIG. 1, the signal processing device 104 differs from the signal processing device 101 in that a parameter calculation unit 31, a multiplication unit 41, a reverberation signal generation unit 60, and an addition unit 70 are provided in place of the parameter calculation unit 30 and the multiplication unit 40. The following description focuses on these units, which are the differences from the first embodiment. The multiplication unit 41, the reverberation signal generation unit 60, and the addition unit 70 function as a signal processing unit that applies a reverb effect to the main acoustic signal.

第１実施形態では、信号処理装置１０１に入力されたメイン音響信号は時間周波数変換部１０と乗算部４０に入力されたが、本実施形態の信号処理装置１０４では、メイン音響信号は時間周波数変換部１０、残響信号生成部６０および加算部７０に入力される。本実施形態では、残響信号生成部６０に入力されるメイン音響信号をＷｅｔ信号と呼び、加算部７０に入力されるメイン音響信号をＤｒｙ信号と呼ぶ。 In the first embodiment, the main acoustic signal input to the signal processing device 101 is supplied to the time-frequency conversion unit 10 and the multiplication unit 40. In the signal processing device 104 of the present embodiment, the main acoustic signal is supplied to the time-frequency conversion unit 10, the reverberation signal generation unit 60, and the addition unit 70. In the present embodiment, the main acoustic signal input to the reverberation signal generation unit 60 is referred to as the Wet signal, and the main acoustic signal input to the addition unit 70 is referred to as the Dry signal.

残響信号生成部６０は、Ｗｅｔ信号に基づいて残響信号を生成して乗算部４１に出力する。残響信号生成部６０の残響信号生成アルゴリズムについては既存のものを用いればよい。Ｄｒｙ信号に残響信号を加算する、すなわちリバーブ効果を付与することにより、加算後の信号は聴感上奥行き感のある音を示す音響信号となる。 The reverberation signal generation unit 60 generates a reverberation signal based on the Wet signal and outputs the reverberation signal to the multiplication unit 41. The existing reverberation signal generation algorithm of the reverberation signal generation unit 60 may be used. By adding a reverberation signal to the Dry signal, that is, by adding a reverb effect, the signal after the addition becomes an acoustic signal indicating a sound with a sense of depth in terms of hearing.

パラメータ算出部３１は、相互相関検出部２０から入力された相関値Ｒ_ｍｓから式（３）に基づいてノーマライズ値ｘを算出して乗算部４１に出力する。０≦ｘ≦１となるのは上記で説明した通りである。パラメータ算出部３１は次式（８）に基づいて、ミックス値ｇ_ｍｉｘを算出する。なおミックス値ｇ_ｍｉｘとは、残響信号の信号レベルに対して施す増幅処理の強さを示す値であり、Ｄｒｙ信号に対する残響信号の混合比に対応する。

ｘ_０は閾値であり、図７は、式（８）をグラフ化した図である。ノーマライズ値ｘがｘ_０よりも小さいとミックス値ｇ_ｍｉｘはａｘ＋ｂとなり、ノーマライズ値が増加するに連れてミックス値ｇ_ｍｉｘは線形に大きくなる。ノーマライズ値ｘがｘ_０以上であるとミックス値ｇ_ｍｉｘは一定値ｃ（ｃ＝ａｘ_０＋ｂ）となる。なお、ミックス値ｇ_ｍｉｘがノーマライズ値が増加するにつれて非線形に大きくなってもよく、実験により好適な態様を取ればよい。 The parameter calculation unit 31 calculates a normalized value x from the correlation value R _ms input from the cross correlation detection unit 20 based on Expression (3), and outputs it to the multiplication unit 41. 0 ≦ x ≦ 1 is as described above. The parameter calculation unit 31 calculates the mix value g _mix based on the following equation (8). The mix value g _mix is a value indicating the strength of amplification processing applied to the signal level of the reverberation signal, and corresponds to the mixing ratio of the reverberation signal to the Dry signal.

x_0 is a threshold, and FIG. 7 is a graph of equation (8). When the normalized value x is smaller than x_0, the mix value g_mix is ax + b, and g_mix increases linearly as the normalized value increases. When the normalized value x is greater than or equal to x_0, the mix value g_mix takes the constant value c (c = ax_0 + b). The mix value g_mix may instead increase non-linearly as the normalized value increases; a suitable form may be determined by experiment.
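Equation (8) itself is not reproduced in this text. From the description of its two branches, it can be written in the following form; the typesetting is a reconstruction and may differ from the original.

```latex
g_{mix} =
\begin{cases}
a x + b, & x < x_0 \\
c = a x_0 + b, & x \ge x_0
\end{cases}
\tag{8}
```

The two branches meet at x = x_0, so g_mix is continuous there.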

乗算部４１は、残響信号生成部６０から残響信号の入力を受け、さらにパラメータ算出部３１からミックス値ｇ_ｍｉｘの入力を受ける。乗算部４１は、ミックス値ｇ_ｍｉｘに基づいて残響信号の信号レベルを増幅し、加算部７０に出力する。 The multiplication unit 41 receives an input of a reverberation signal from the reverberation signal generation unit 60 and further receives an input of a mix value g _mix from the parameter calculation unit 31. The multiplier 41 amplifies the signal level of the reverberation signal based on the mix value g _mix and outputs the amplified signal level to the adder 70.

加算部７０は、Ｄｒｙ信号と乗算部４１による増幅を受けた残響信号とを加算して出力信号として出力する。 The adder 70 adds the Dry signal and the reverberation signal amplified by the multiplier 41 and outputs the result as an output signal.
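The mix value and the final addition can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the constants a, b, and x_0 in the example are arbitrary, and the reverberation signal is taken as a given input because the text leaves its generation to an existing algorithm.

```python
def mix_value(x, a, b, x0):
    """Piecewise mix value: a*x + b below the threshold x0, constant a*x0 + b otherwise."""
    return a * x + b if x < x0 else a * x0 + b


def add_reverb(dry, reverb, g_mix):
    """Scale the reverberation signal by g_mix and add it to the Dry signal."""
    return [d + g_mix * r for d, r in zip(dry, reverb)]


# Hypothetical constants and a two-sample example.
g = mix_value(0.9, a=0.5, b=0.1, x0=0.6)  # above the threshold, so constant
output = add_reverb([1.0, 2.0], [0.5, 0.5], g)
```

Capping g_mix at x_0 is what keeps the reverberation level bounded when the correlation is high.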

メイン音響信号とサブ音響信号の相関が高い場合、メイン音響信号に付与するリバーブ効果を強くし過ぎると残響音が目立ってしまい、メイン音響信号とサブ音響信号により最終的に放音される音のバランスが崩れてしまう。信号処理装置１０４であれば、閾値ｘ_０よりもノーマライズ値ｘが大きい場合はミックス値ｇ_ｍｉｘが一定値であるｃとなる。このｃの値を残響音が目立たない程度の値に実験等により定めておけば、メイン音響信号とサブ音響信号の相関が高くても、残響音が目立つほどのリバーブ効果の付与がメイン音響信号に施されることなく、最終的に放音される音におけるメイン音響信号とサブ音響信号により放音される各音のバランスが崩れることはない。 When the correlation between the main acoustic signal and the sub acoustic signal is high, applying too strong a reverb effect to the main acoustic signal makes the reverberant sound conspicuous, and the balance of the sound finally emitted from the main acoustic signal and the sub acoustic signal is lost. In the signal processing device 104, when the normalized value x is larger than the threshold x_0, the mix value g_mix takes the constant value c. If c is set, by experiment or the like, to a value at which the reverberant sound is not conspicuous, then even when the correlation between the main acoustic signal and the sub acoustic signal is high, the main acoustic signal is not given a reverb effect strong enough to make the reverberant sound conspicuous, and the balance between the sounds emitted from the main acoustic signal and the sub acoustic signal in the finally emitted sound is not lost.

＜第５実施形態＞
本実施形態の信号処理装置１０５は、信号処理装置１０３と信号処理装置１０４を組み合わせたものである。詳しく説明すると、信号処理装置１０５は、信号処理装置１０４に対して、信号処理装置１０４の相互相関検出部２０の換わりに信号処理装置１０３の相互相関検出部２１と平均算出部９０を設けたものに相当する。 <Fifth Embodiment>
The signal processing device 105 of the present embodiment combines the signal processing device 103 and the signal processing device 104. More specifically, the signal processing device 105 corresponds to the signal processing device 104 with its cross-correlation detection unit 20 replaced by the cross-correlation detection unit 21 and the average calculation unit 90 of the signal processing device 103.

信号処理装置１０５では、相互相関検出部２１と平均算出部９０が、τ＝０、１、２の場合の相関値Ｒ_ｍｓ（τ）の相加平均から平均相関値Ｒ_ｍｓを算出して、リバーブ効果の付与の強弱を決定している。そのため、信号処理装置１０５を用いると、メイン音響信号とメイン音響信号以外の他の信号であるサブ音響信号とに時間のずれがあっても、最終的に放音される音にはメイン音響信号とサブ音響信号のまじりの良いリバーブ効果が付与され、リバーブ効果によってメイン音響信号とサブ音響信号により放音される各音のバランスが崩れることはない。 In the signal processing device 105, the cross-correlation detection unit 21 and the average calculation unit 90 calculate the average correlation value R_ms from the arithmetic mean of the correlation values R_ms(τ) for τ = 0, 1, and 2, and this value determines the strength of the reverb effect. Therefore, with the signal processing device 105, even if there is a time lag between the main acoustic signal and the sub acoustic signal, which is a signal other than the main acoustic signal, the finally emitted sound is given a reverb effect in which the main acoustic signal and the sub acoustic signal blend well, and the reverb effect does not upset the balance between the sounds emitted from the main acoustic signal and the sub acoustic signal.

本実施形態においても第３実施形態と同様に、τは０、１、２に限られるものではなく、また平均相関値Ｒ_ｍｓを算出するのに用いるτの数は３つに限られない。さらに、複数の相関値Ｒ_ｍｓ（τ）から平均相関値Ｒ_ｍｓを算出する方法として相加平均を用いるだけでなく、他の算出方法を用いてもよい。要は、最終的に放音される音にメイン音響信号とサブ音響信号のまじりの良いリバーブ効果を付与できれば平均相関値Ｒ_ｍｓの算出方法は何でもよく、実験により好適な算出方法を決定すればよい。 As in the third embodiment, τ is not limited to 0, 1, and 2 in the present embodiment, and the number of values of τ used to calculate the average correlation value R_ms is not limited to three. Moreover, the average correlation value R_ms need not be calculated from the correlation values R_ms(τ) by an arithmetic mean; other calculation methods may be used. In short, any method of calculating the average correlation value R_ms may be used as long as it gives the finally emitted sound a reverb effect in which the main acoustic signal and the sub acoustic signal blend well, and a suitable method may be determined by experiment.

＜第６実施形態＞
図８は、この発明の第６実施形態である信号処理装置１０６のブロック図である。図８では、図１におけるものと同一の構成要素には同一の符号が付されている。図８と図１を比較すれば明らかなように、信号処理装置１０６と信号処理装置１０１との違いは、時間周波数変換部１０の前段にハイパスフィルタ（以下、ＨＰＦ）８１、バンドパスフィルタ（以下、ＢＰＦ）８２、ローパスフィルタ（以下、ＬＰＦ）８３を設けた点と、乗算部４０の後段に加算部７１を設けた点である。以下では第１実施形態との相違点である、ＨＰＦ８１、ＢＰＦ８２、ＬＰＦ８３と加算部７１を中心に説明する。 <Sixth Embodiment>
FIG. 8 is a block diagram of a signal processing device 106 according to the sixth embodiment of the present invention. In FIG. 8, the same components as those in FIG. 1 are denoted by the same reference numerals. As is apparent from a comparison of FIG. 8 with FIG. 1, the signal processing device 106 differs from the signal processing device 101 in that a high-pass filter (hereinafter, HPF) 81, a band-pass filter (hereinafter, BPF) 82, and a low-pass filter (hereinafter, LPF) 83 are provided upstream of the time-frequency conversion unit 10, and in that an addition unit 71 is provided downstream of the multiplication unit 40. The following description focuses on the HPF 81, the BPF 82, the LPF 83, and the addition unit 71, which are the differences from the first embodiment.

ＨＰＦ８１は入力された信号の周波数ｆ_ＨＬ以上の高周波数周波数帯域のみを出力するフィルタであり、ＢＰＦ８２は周波数ｆ_ＢＬから周波数ｆ_ＢＨの中間周波数帯域のみを出力するフィルタであり（周波数ｆ_ＢＬ＜周波数ｆ_ＢＨ）、ＬＰＦ８３は周波数ｆ_ＬＨ以下の低周波数帯域のみを出力するフィルタである。本実施形態では、周波数ｆ_ＨＬ＝周波数ｆ_ＢＨであり、周波数ｆ_ＢＬ＝周波数ｆ_ＬＨであるが、各周波数は等しくなくてもよい。各周波数の値と各周波数間の関係は、実験により適宜決定すればよい。ＨＰＦ８１、ＢＰＦ８２、ＬＰＦ８３は、メイン音響信号やサブ音響信号に帯域分割処理を施す帯域分割部として機能する。 The HPF 81 is a filter that outputs only the high frequency band of the input signal at or above a frequency f_HL, the BPF 82 is a filter that outputs only the intermediate frequency band from a frequency f_BL to a frequency f_BH (f_BL < f_BH), and the LPF 83 is a filter that outputs only the low frequency band at or below a frequency f_LH. In the present embodiment, f_HL = f_BH and f_BL = f_LH, but these frequencies need not be equal. The value of each frequency and the relationships among the frequencies may be determined as appropriate by experiment. The HPF 81, the BPF 82, and the LPF 83 function as a band division unit that applies band division processing to the main acoustic signal and the sub acoustic signal.

メイン音響信号とサブ音響信号は、信号処理装置１０６に入力されると、まずＨＰＦ８１、ＢＰＦ８２、ＬＰＦ８３に入力される。メイン音響信号については、ＨＰＦ８１、ＢＰＦ８２、ＬＰＦ８３が、各フィルタに対応した周波数帯域のみの信号を時間周波数変換部１０と乗算部４０に出力する。サブ音響信号については、ＨＰＦ８１、ＢＰＦ８２、ＬＰＦ８３が、各フィルタに対応した周波数帯域のみの信号を時間周波数変換部１０に出力する。 When the main sound signal and the sub sound signal are input to the signal processing device 106, first, the main sound signal and the sub sound signal are input to the HPF 81, the BPF 82, and the LPF 83. For the main acoustic signal, the HPF 81, the BPF 82, and the LPF 83 output a signal of only a frequency band corresponding to each filter to the time frequency conversion unit 10 and the multiplication unit 40. For the sub-acoustic signal, the HPF 81, the BPF 82, and the LPF 83 output a signal of only a frequency band corresponding to each filter to the time frequency conversion unit 10.

ＨＰＦ８１、ＢＰＦ８２、ＬＰＦ８３がメイン音響信号とサブ音響信号を時間周波数変換部１０に出力し、メイン音響信号を乗算部４０に出力してからは、高周波帯域、中間周波数帯域、低周波数帯域ごとに第１実施形態と同様の処理が行わるため説明を省略する。乗算部４０は、高周波数帯域、中間周波数帯域、低周波数帯域の周波数帯域毎に信号レベルを増幅したメイン音響信号を加算部７１に出力する。 After the HPF 81, the BPF 82, and the LPF 83 output the main acoustic signal and the sub acoustic signal to the time-frequency conversion unit 10 and output the main acoustic signal to the multiplication unit 40, the same processing as in the first embodiment is performed for each of the high, intermediate, and low frequency bands, so its description is omitted. The multiplication unit 40 outputs to the addition unit 71 the main acoustic signal whose signal level has been amplified for each of the high, intermediate, and low frequency bands.

加算部７１は、周波数帯域毎に信号レベルを増幅したメイン音響信号を加算して出力信号として出力する。 The adder 71 adds the main acoustic signal whose signal level is amplified for each frequency band and outputs the result as an output signal.

このように信号処理装置１０６は、ＨＰＦ８１、ＢＰＦ８２、ＬＰＦ８３を用いて、メイン音響信号およびサブ音響信号に帯域分割処理を施し、周波数帯域毎にメイン音響信号とサブ音響信号の相関値Ｒ_ｍｓを算出し、相関値Ｒ_ｍｓに基づいて周波数帯域毎にメイン音響信号の信号レベルを増幅して最後に加算する。信号処理装置１０６によれば、周波数帯域毎のサブ音響信号との相関に応じてメイン音響信号の増幅制御を周波数帯域毎に細やかに行うことができる。 As described above, the signal processing device 106 performs band division processing on the main acoustic signal and the sub acoustic signal using the HPF 81, the BPF 82, and the LPF 83, and calculates a correlation value R _ms between the main acoustic signal and the sub acoustic signal for each frequency band. Then, the signal level of the main acoustic signal is amplified for each frequency band based on the correlation value R _ms and finally added. According to the signal processing device 106, the amplification control of the main acoustic signal can be finely performed for each frequency band in accordance with the correlation with the sub acoustic signal for each frequency band.
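The split, per-band gain, and sum structure can be sketched as below. This is not the filter design of the HPF 81, BPF 82, and LPF 83: the sketch substitutes simple one-pole low-pass filters and forms the three bands by subtraction, so that the bands add back to the input exactly. The function names and smoothing coefficients are hypothetical, and the per-band gains are passed in directly rather than derived from per-band correlation values.

```python
def one_pole_lowpass(signal, alpha):
    """y[n] = y[n-1] + alpha * (x[n] - y[n-1]), with 0 < alpha <= 1."""
    out, y = [], 0.0
    for x in signal:
        y += alpha * (x - y)
        out.append(y)
    return out


def split_three_bands(signal, alpha_low, alpha_high):
    """Return (low, mid, high) bands whose sample-wise sum equals the input."""
    lp_low = one_pole_lowpass(signal, alpha_low)
    lp_high = one_pole_lowpass(signal, alpha_high)
    mid = [h - l for h, l in zip(lp_high, lp_low)]
    high = [x - h for x, h in zip(signal, lp_high)]
    return lp_low, mid, high


def process_per_band(signal, gains, alpha_low=0.05, alpha_high=0.5):
    """Apply one gain per band (low, mid, high) and sum the bands again."""
    bands = split_three_bands(signal, alpha_low, alpha_high)
    return [sum(g * band[i] for g, band in zip(gains, bands))
            for i in range(len(signal))]


# Hypothetical gains: boost the mid band only.
boosted = process_per_band([0.0, 1.0, 0.5, -0.5], gains=(1.0, 1.5, 1.0))
```

With all three gains equal to 1 the output reproduces the input, which makes the effect of each per-band gain easy to check in isolation.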

＜第７実施形態＞
本実施形態の信号処理装置１０７と信号処理装置１０１との違いは、時間周波数変換部１０の前段にＨＰＦ８１とＢＰＦ８２を設けた点である。信号処理装置１０７では、メイン音響信号はＢＰＦ８２を介して時間周波数変換部１０に入力されるとともに乗算部４０に入力され、サブ音響信号はＨＰＦ８１を介して時間周波数変換部１０に入力される。 <Seventh embodiment>
The signal processing device 107 of the present embodiment differs from the signal processing device 101 in that an HPF 81 and a BPF 82 are provided upstream of the time-frequency conversion unit 10. In the signal processing device 107, the main acoustic signal is input to the time-frequency conversion unit 10 via the BPF 82 and is also input to the multiplication unit 40, and the sub acoustic signal is input to the time-frequency conversion unit 10 via the HPF 81.

本実施形態では、ＨＰＦ８１やＢＰＦ８２により帯域分割された音響信号に対してフーリエ変換や相関値Ｒ_ｍｓの算出が行われるため、最終的に放音される音におけるメイン音響信号とサブ音響信号により放音される各音のバランスが崩れることはなく、さらに信号処理装置１０６に比べて装置の処理負荷を減らすことができる。 In the present embodiment, since the Fourier transform and the correlation value R _ms are calculated for the acoustic signal band-divided by the HPF 81 and the BPF 82, the main acoustic signal and the sub acoustic signal in the sound finally emitted are emitted. The balance of each sound to be sounded is not lost, and the processing load of the device can be reduced as compared with the signal processing device 106.

信号処理装置１０７では、メイン音響信号に対してＢＰＦ８２を用い、サブ音響信号に対してＨＰＦ８１を用いた。しかし、信号処理装置１０７にＨＰＦ８１、ＢＰＦ８２、ＬＰＦ８３を設けておき、信号処理装置１０７に入力されるメイン音響信号とサブ音響信号の集中している周波数帯域に応じて、ＨＰＦ８１、ＢＰＦ８２、ＬＰＦ８３の中から異なるフィルタを選択してもよい。もちろん、メイン音響信号或いはサブ音響信号のいずれか一方のみに対して、ＨＰＦ８１、ＢＰＦ８２、ＬＰＦ８３のいずれかを用いる態様であってもよい。 In the signal processing device 107, the BPF 82 is used for the main acoustic signal and the HPF 81 is used for the sub acoustic signal. However, the signal processing device 107 may instead be provided with the HPF 81, the BPF 82, and the LPF 83, and a different filter may be selected from among them according to the frequency bands in which the main acoustic signal and the sub acoustic signal input to the signal processing device 107 are concentrated. Of course, one of the HPF 81, the BPF 82, and the LPF 83 may be applied to only one of the main acoustic signal and the sub acoustic signal.

＜第８実施形態＞
図９は、この発明の第８実施形態である信号処理装置１０８のブロック図である。信号処理装置１０８は例えばカラオケ装置に適用される。図９では、図１におけるものと同一の構成要素には同一の符号が付されている。図９と図１を比較すれば明らかなように、信号処理装置１０８と信号処理装置１０１との違いは、時間周波数変換部１０にサブ音響信号としてボーカル音の音響信号と伴奏音の音響信号が入力され、乗算部４０にメイン音響信号としてボーカル音と伴奏音の音響信号の合成信号（以下、合成信号）が入力される点である。 <Eighth Embodiment>
FIG. 9 is a block diagram of a signal processing device 108 according to the eighth embodiment of the present invention. The signal processing device 108 is applied to, for example, a karaoke device. In FIG. 9, the same components as those in FIG. 1 are denoted by the same reference numerals. As is apparent from a comparison of FIG. 9 with FIG. 1, the signal processing device 108 differs from the signal processing device 101 in that the acoustic signal of a vocal sound and the acoustic signal of an accompaniment sound are input to the time-frequency conversion unit 10 as sub acoustic signals, and a signal obtained by combining the acoustic signals of the vocal sound and the accompaniment sound (hereinafter, the composite signal) is input to the multiplication unit 40 as the main acoustic signal.

信号処理装置１０８では、ボーカル音と伴奏音の音響信号の相関値Ｒ_ｍｓに基づいて、パラメータ算出部３０が増幅率ｙを算出する。乗算部４０は、この増幅率ｙに基づいて、合成信号に増幅処理を施す。 In the signal processing device 108, the parameter calculation unit 30 calculates the amplification factor y on the basis of the correlation value R_ms between the acoustic signals of the vocal sound and the accompaniment sound. The multiplication unit 40 applies amplification processing to the composite signal on the basis of this amplification factor y.
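The point of this arrangement is that the correlation is measured on one pair of signals while the gain is applied to a different signal. A minimal sketch, under stated assumptions: the correlation is a Pearson-style stand-in for equations (1) and (2), the signals are plain sample lists, and the mapping from R_ms to the amplification factor y is passed in as a hypothetical callable because the equations that define it are not reproduced here.

```python
import math


def correlation(a, b):
    """Pearson-style correlation of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: a constant signal is treated as uncorrelated
    return cov / math.sqrt(var_a * var_b)


def amplify_composite(vocal, accompaniment, composite, gain_from_correlation):
    """Measure correlation on vocal/accompaniment, apply the gain to the composite."""
    r = correlation(vocal, accompaniment)
    y = gain_from_correlation(r)
    return [y * v for v in composite]


vocal = [0.0, 1.0, 0.0, -1.0]
accompaniment = [0.0, 2.0, 0.0, -2.0]
composite = [v + a for v, a in zip(vocal, accompaniment)]
# Hypothetical gain mapping, for illustration only.
output = amplify_composite(vocal, accompaniment, composite, lambda r: 1.0 + 0.5 * r)
```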

本実施形態のように、ボーカル音の音響信号に対してではなく、合成信号に対して直接増幅処理を施すことで、合成信号からボーカル音の音響信号を音源分離して増幅する場合と類似の効果を期待できる。すなわち、本実施形態の出力信号は、合成信号をボーカル音の音響信号と伴奏音の音響信号に音源分離して、分離したボーカル音の音響信号と伴奏音の音響信号の相関値Ｒ_ｍｓを算出して、ボーカル音の音響信号に増幅を施して出力する出力信号に類似する。音源分離を用いた方法では、音源分離の際に人工的なノイズが発生することがあり、音源分離後の音質が問題となることが多い。本実施形態では、信号処理装置１０８に入力されるボーカル音の音響信号と伴奏音の音響信号が合成信号に音源分離を施すことにより生成されたものであったとしても、増幅処理の対象となる合成信号は音源分離により生成されたものではないため、ボーカル音と伴奏音のバランスが崩れることがなく、音源分離の性能に出力信号の音質が依存しづらくなるという利点が生じる。 By applying amplification processing directly to the composite signal rather than to the acoustic signal of the vocal sound, as in the present embodiment, an effect similar to that of separating the acoustic signal of the vocal sound from the composite signal by sound source separation and amplifying it can be expected. That is, the output signal of the present embodiment resembles the output signal that would be obtained by separating the composite signal into the acoustic signal of the vocal sound and the acoustic signal of the accompaniment sound, calculating the correlation value R_ms between the separated signals, and amplifying and outputting the acoustic signal of the vocal sound. In methods that use sound source separation, artificial noise may be generated during separation, and the sound quality after separation is often a problem. In the present embodiment, even if the acoustic signals of the vocal sound and the accompaniment sound input to the signal processing device 108 were generated by applying sound source separation to the composite signal, the composite signal that is subjected to amplification processing was not itself generated by sound source separation. This has the advantage that the balance between the vocal sound and the accompaniment sound is not lost and that the sound quality of the output signal is less dependent on the performance of the sound source separation.

信号処理装置１０８では、合成信号はサブ音響信号として入力された音響信号のみから構成されていたが、この態様に限られるわけではなく、合成信号がサブ音響信号として入力された音響信号の中の少なくとも１つの音響信号と他の音響信号とから構成された態様であってもよい。この態様では、例えば、サブ音響信号がボーカル音の音響信号とドラム音の音響信号であり、メイン音響信号がドラム音とギター音の音響信号の合成信号である。さらに、サブ音響信号として３種類以上の音響信号が入力される場合には、相互相関検出部２０が音響信号の中から相関値Ｒ_ｍｓを算出する音響信号を選択する態様であってもよい。この態様では、例えばサブ音響信号としてボーカル音、ドラム音およびギター音の音響信号が各々入力されると、相互相関検出部２０は、入力された音響信号からボーカル音の音響信号とドラム音の音響信号を選択し、両信号の相関値Ｒ_ｍｓを算出する。さらに、この選択をユーザの指示に応じて切り換えるようにしてもよい。この場合のメイン音響信号は、ボーカル音とドラム音の音響信号の合成信号であってもよいし、ドラム音とギター音の音響信号の合成信号であってもよい。もちろん、メイン音響信号がボーカル音、ドラム音およびギター音の音響信号の合成信号であってもよい。 In the signal processing device 108, the composite signal is composed only of the acoustic signals input as sub acoustic signals, but the configuration is not limited to this; the composite signal may instead be composed of at least one of the acoustic signals input as sub acoustic signals together with another acoustic signal. In this configuration, for example, the sub acoustic signals are the acoustic signals of a vocal sound and a drum sound, and the main acoustic signal is a composite of the acoustic signals of the drum sound and a guitar sound. Furthermore, when three or more types of acoustic signals are input as sub acoustic signals, the cross-correlation detection unit 20 may select, from among them, the acoustic signals used to calculate the correlation value R_ms. In this configuration, for example, when the acoustic signals of a vocal sound, a drum sound, and a guitar sound are each input as sub acoustic signals, the cross-correlation detection unit 20 selects the acoustic signals of the vocal sound and the drum sound from the input signals and calculates the correlation value R_ms between them. This selection may also be switched in response to a user instruction. The main acoustic signal in this case may be a composite of the acoustic signals of the vocal sound and the drum sound, or a composite of the acoustic signals of the drum sound and the guitar sound. Of course, the main acoustic signal may also be a composite of the acoustic signals of the vocal sound, the drum sound, and the guitar sound.

＜第９実施形態＞
本実施形態の信号処理装置１０９も例えばカラオケ装置に適用される。信号処理装置１０９と信号処理装置１０１との違いは、メイン音響信号としてボーカル音と伴奏音の合成音の時間波形を表す合成信号が時間周波数変換部１０と乗算部４０に入力され、サブ音響信号として伴奏音の音響信号が時間周波数変換部１０に入力されることである。 <Ninth Embodiment>
The signal processing device 109 of the present embodiment is also applied to, for example, a karaoke device. The signal processing device 109 differs from the signal processing device 101 in that a composite signal representing the time waveform of a sound combining a vocal sound and an accompaniment sound is input to the time-frequency conversion unit 10 and the multiplication unit 40 as the main acoustic signal, and the acoustic signal of the accompaniment sound is input to the time-frequency conversion unit 10 as the sub acoustic signal.

信号制御装置１０９では、伴奏音の音響信号と合成信号との相関値Ｒ_ｍｓに基づいて増幅率ｙが算出され、その増幅率ｙに基づいて、合成信号に増幅処理が施されている。 In the signal processing device 109, the amplification factor y is calculated on the basis of the correlation value R_ms between the acoustic signal of the accompaniment sound and the composite signal, and amplification processing is applied to the composite signal on the basis of that amplification factor y.

本実施形態では、ボーカル音の音響信号の入力がなくても伴奏音の音響信号と合成信号との入力があれば、ボーカル音と伴奏音のバランスを崩すことなく、合成信号に増幅処理を施すことができる。 In the present embodiment, even without the acoustic signal of the vocal sound as an input, amplification processing can be applied to the composite signal without upsetting the balance between the vocal sound and the accompaniment sound, provided that the acoustic signal of the accompaniment sound and the composite signal are input.

＜第１０実施形態＞
図１０は、この第１０実施形態である信号処理装置１１０のブロック図である。信号処理装置１１０も例えばカラオケ装置に適用される。図１０では、図１におけるものと同一の構成要素には同一の符号が付されている。図１０と図１を比較すれば明らかなように、信号処理装置１１０と信号処理装置１０１との違いは、時間周波数変換部１０に複数種類のサブ音響信号が入力される点と平均算出部９０が設けられた点である。本実施形態では、時間周波数変換部１０に入力されたサブ音響信号の種類はＮ種類（Ｎ≧２）である。Ｎ種類のサブ音響信号の具体例としては、ドラム音、ギター音、キーボード音といった伴奏楽器毎の伴奏音の音響信号が挙げられる。 <Tenth Embodiment>
FIG. 10 is a block diagram of a signal processing device 110 according to the tenth embodiment. The signal processing device 110 is also applied to, for example, a karaoke device. In FIG. 10, the same components as those in FIG. 1 are denoted by the same reference numerals. As is apparent from a comparison of FIG. 10 with FIG. 1, the signal processing device 110 differs from the signal processing device 101 in that a plurality of types of sub acoustic signals are input to the time-frequency conversion unit 10 and in that an average calculation unit 90 is provided. In the present embodiment, N types (N ≧ 2) of sub acoustic signals are input to the time-frequency conversion unit 10. Specific examples of the N types of sub acoustic signals are the acoustic signals of the accompaniment sounds of individual accompaniment instruments, such as a drum sound, a guitar sound, and a keyboard sound.

The time-frequency conversion unit 10 outputs the Fourier transform data of the main acoustic signal and of the N types of sub-acoustic signals to the cross-correlation detection unit 20. The cross-correlation detection unit 20 detects the correlation between the main acoustic signal and each of the N types of sub-acoustic signals input from the time-frequency conversion unit 10. More specifically, based on equations (1) and (2), the cross-correlation detection unit 20 calculates the correlation value R_ms1 between the main acoustic signal and sub-acoustic signal 1, then the correlation value R_ms2 between the main acoustic signal and sub-acoustic signal 2, and so on up to the correlation value R_msN between the main acoustic signal and sub-acoustic signal N. The cross-correlation detection unit 20 outputs the calculated correlation values R_ms1 to R_msN to the average calculation unit 90. The average calculation unit 90 calculates the arithmetic mean of the input correlation values, R_ms = (R_ms1 + ... + R_msN) / N, as the average correlation value R_ms and outputs it to the parameter calculation unit 30. The multiplication unit 40 applies amplification processing to the main acoustic signal based on the amplification factor y calculated by the parameter calculation unit 30 and outputs the result as the output signal.
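The averaging step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: equations (1) and (2) and the rule used by the parameter calculation unit 30 are defined outside this section, so `correlation` is an assumed normalized correlation on time-domain frames and `gain_from_correlation` is a hypothetical placeholder mapping. Only the arithmetic mean over R_ms1 to R_msN follows the text directly.

```python
import math
from statistics import fmean


def correlation(main, sub):
    """Assumed stand-in for equations (1)/(2): normalized correlation in [-1, 1]."""
    num = sum(m * s for m, s in zip(main, sub))
    den = math.sqrt(sum(m * m for m in main) * sum(s * s for s in sub))
    return num / den if den > 0.0 else 0.0


def average_correlation(main, subs):
    """Average calculation unit 90: arithmetic mean of R_ms1 .. R_msN."""
    if not subs:
        raise ValueError("at least one sub-acoustic signal is required")
    return fmean(correlation(main, sub) for sub in subs)


def gain_from_correlation(r_ms, max_gain=2.0):
    """Hypothetical placeholder for parameter calculation unit 30.

    The actual mapping from R_ms to the amplification factor y is not
    given in this section; this linear rule is for illustration only.
    """
    return 1.0 + (max_gain - 1.0) * min(abs(r_ms), 1.0)


def process_frame(main, subs):
    """Multiplication unit 40: scale the main acoustic signal by y."""
    y = gain_from_correlation(average_correlation(main, subs))
    return [y * m for m in main]


if __name__ == "__main__":
    main = [1.0, -1.0, 1.0, -1.0]
    subs = [[1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0]]
    print(average_correlation(main, subs))  # mean of 1.0 and 0.0
    print(process_frame(main, subs))
```

Because a single averaged value drives the gain, the parameter calculation and multiplication stages run once per frame regardless of N; only the correlation stage scales with the number of sub-acoustic signals.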

In the signal processing device 110, the average correlation value R_ms between the main acoustic signal and all of the sub-acoustic signals is calculated, and the main acoustic signal is amplified with an amplification factor corresponding to this average correlation value R_ms. As a result, the balance between the main acoustic signal and the plurality of sub-acoustic signals in the finally emitted sound is not lost.

<Modifications>
(1) The above embodiments may be combined with one another. For example, the seventh embodiment and the tenth embodiment may be combined. In this case, even when a plurality of sub-acoustic signals are input, amplification processing can be applied to the main acoustic signal without upsetting the balance, in the finally emitted sound, between the sound emitted from the main acoustic signal and each of the sounds emitted from the plurality of sub-acoustic signals. Furthermore, in this case the processing load can be kept below the load obtained by multiplying the processing load of a device that handles only one sub-acoustic signal, that is, the device of the first embodiment, by the number of sub-acoustic signals.

(2) In each of the above embodiments, the main acoustic signal is subjected to amplification processing or to processing that applies a reverb effect. However, the signal processing applied to the main acoustic signal is not limited to these, and processing that gives another acoustic effect may be applied to the main acoustic signal.

(3) In each of the above embodiments, the time-frequency conversion unit 10 divides the two types of input digital signals, namely the main acoustic signal and the sub-acoustic signal, into frames of a fixed time length, performs a Fourier transform on them, and outputs the resulting Fourier transform data to the cross-correlation detection unit 20. However, the time-frequency conversion unit 10 may be omitted. In this aspect, the input main acoustic signal and sub-acoustic signal are input directly to the cross-correlation detection unit 20, which uses equation (1) or equation (6) to calculate the correlation value R_ms or the correlation value R_ms(τ) from the main acoustic signal and the sub-acoustic signal. In this case, τ is an index indicating each sample constituting the acoustic signal. This aspect also makes it possible to compress the dynamic range of an acoustic signal or to apply a reverb effect to it while maintaining the balance of the overall sound emitted from the main acoustic signal and the sub-acoustic signal.
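The time-domain variant in (3) can be sketched as below. This is a minimal illustration only: equation (6) is not reproduced in this section, so the normalized lagged form, the function name `lagged_correlation`, and the handling of samples that fall outside the frame are all assumptions.

```python
import math


def lagged_correlation(main, sub, tau):
    """Assumed time-domain form of R_ms(tau), with tau a sample offset.

    Pairs main[n] with sub[n + tau]; pairs whose index falls outside
    the frame are skipped (equivalent to zero padding).
    """
    num = 0.0
    for n, m in enumerate(main):
        k = n + tau
        if 0 <= k < len(sub):
            num += m * sub[k]
    den = math.sqrt(sum(m * m for m in main) * sum(s * s for s in sub))
    return num / den if den > 0.0 else 0.0


if __name__ == "__main__":
    main = [0.0, 1.0, 0.0, 0.0]
    sub = [0.0, 0.0, 1.0, 0.0]  # same pulse, delayed by one sample
    for tau in (-1, 0, 1):
        print(tau, lagged_correlation(main, sub, tau))
```

Working directly on samples avoids the Fourier transform, at the cost of one multiply-accumulate pass per lag that is evaluated.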

(4) In each of the above embodiments, both the main acoustic signal and the sub-acoustic signal are digital signals, but one or both of them may be analog signals. When an analog signal is to be input to the signal processing device of an embodiment, an A/D converter may be provided in the stage preceding the signal processing device.

(5) In each of the above embodiments, the functions specific to the signal processing device of the present invention are realized by a program (a DSP microprogram), and this program may be provided on its own. For example, by adding the program of any of the above embodiments to an existing program of a karaoke device, dynamic range compression or a reverb effect can be applied to the acoustic signal of the vocal sound without upsetting the balance between the vocal sound and the accompaniment sound in the finally emitted sound.

(6) The amplification processing or the application of the reverb effect according to the above embodiments may be provided as an ASP (application service provider) type communication service. For example, when the amplification processing of the first embodiment is provided in ASP form, the signal processing device 101 is connected to a communication line. The main acoustic signal and the sub-acoustic signal are then input to the signal processing device 101 via the communication line. The signal processing device 101 applies amplification processing to the main acoustic signal based on the correlation between the main acoustic signal and the sub-acoustic signal, outputs the result as the output signal, and returns that output signal via the communication line.

101, 102, 103, 104, 105, 106, 107, 108, 109, 110 ... signal processing device; 10 ... time-frequency conversion unit; 20, 21 ... cross-correlation detection unit; 30, 31 ... parameter calculation unit; 40, 41 ... multiplication unit; 50 ... weighting unit; 60 ... reverberation signal generation unit; 70, 71 ... addition unit; 81 ... HPF; 82 ... BPF; 83 ... LPF; 90 ... average calculation unit.

Claims

1. A signal processing apparatus comprising:
a cross-correlation detection unit that calculates a correlation value between a plurality of signals among a main acoustic signal, which is the target of signal processing that gives an acoustic effect, and one or a plurality of sub-acoustic signals; and
a signal processing unit that determines the intensity of the acoustic effect according to the correlation value calculated by the cross-correlation detection unit.

2. The signal processing apparatus according to claim 1, further comprising a weighting unit that applies audible-range weighting to at least one of the plurality of signals and supplies the weighted signal to the cross-correlation detection unit.

3. The signal processing apparatus according to claim 1, wherein the cross-correlation detection unit calculates a plurality of correlation values between one signal of the plurality of signals and one or more signals of the plurality of signals that differ in time from that signal, and obtains one correlation value from the plurality of correlation values.

4. The signal processing apparatus according to any one of claims 1 to 3, further comprising a band dividing unit that performs band division on at least one of the main acoustic signal and the one or more sub-acoustic signals,
wherein the cross-correlation detection unit calculates the correlation value using at least one of the band signals obtained by the band dividing unit.