JP5392057B2

JP5392057B2 - Audio processing apparatus, audio processing method, and audio processing program

Info

Publication number: JP5392057B2
Application number: JP2009291156A
Authority: JP
Inventors: 定浩安良
Original assignee: JVCKenwood Corp
Current assignee: JVCKenwood Corp
Priority date: 2009-12-22
Filing date: 2009-12-22
Publication date: 2014-01-22
Anticipated expiration: 2029-12-22
Also published as: JP2011133568A

Description

The present invention relates to an audio processing apparatus, an audio processing method, and an audio processing program that analyze a digital audio signal and process the signal using the analysis result.

In recent years, advances in audio coding technology have made it possible to reduce file size while preserving, as far as possible, the sound quality of music recorded on CDs (Compact Discs) and similar media. As a result, a large amount of music can now be stored on, for example, a memory-type portable audio player and carried around.

However, the audio coding techniques described above exploit human auditory characteristics: they cut audio components in the high-frequency range that are not normally audible, and they thin out data for sounds that cannot be heard because of the masking effect. Compared with the original, the resulting sound therefore lacks extension, spaciousness, dynamic range, and luster. For this reason, techniques have been developed to improve the sound quality of digital audio signals compressed by audio coding.

For example, a technique has been disclosed that identifies the local maxima and local minima of a digital audio signal, counts the number of samples from a local minimum to a local maximum or from a local maximum to a local minimum, calculates for each sample other than the maxima and minima the difference from the value of the preceding sample, multiplies that difference by a coefficient that depends on the sample count, and adds the product to, or subtracts it from, sample positions near the local maximum or minimum (see, for example, Patent Document 1).

Similarly, a technique is known that counts the number of samples between extrema, calculates the difference between each local maximum or minimum and the value one sample before it, multiplies that difference by a coefficient that depends on the sample count, and adds the product to, or subtracts it from, either the local maximum or minimum itself or sample positions near it (see, for example, Patent Document 2).

Japanese Patent No. 3401171
Japanese Patent No. 3659489

In the techniques of Patent Documents 1 and 2 described above, the section from a local minimum to a local maximum and the section from a local maximum to a local minimum are controlled independently, and the coefficient to be multiplied and the sample positions to be added to or subtracted from are determined according to the number of samples in each section. Appropriate sound quality improvement processing can therefore be performed for each section corresponding to a half cycle, even when the digital audio signal changes irregularly.

On the other hand, there is no guarantee that the frequency of each signal contained in a digital audio signal matches the sampling frequency. Even if the digital audio signal is a regular sine wave, when its frequency differs from the sampling frequency, the number of samples per half cycle, that is, the number of samples from a local minimum to a local maximum and the number from a local maximum to a local minimum, may differ.

When the number of samples between extrema differs in this way, the coefficient to be multiplied and the sample positions to be added to or subtracted from also change, so the amount of sound quality improvement differs from one half cycle to the next. Even for a regular sine wave, the sound quality improvement processing then becomes uneven, and in some cases a sufficient improvement effect is not obtained.
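The effect described above can be checked numerically. The following is a minimal sketch, not taken from the patent: the function name `extremum_spacings`, the phase offset of 0.3 rad, and the test frequencies are all illustrative assumptions. It samples a sine wave and lists the number of samples between successive extrema.

```python
import math

def extremum_spacings(freq, fs, n, phase=0.3):
    """Return the sample distances between successive extrema of a sampled sine.

    freq, fs: tone frequency and sampling frequency in Hz.
    n: number of samples to generate.
    phase: arbitrary offset (an assumption) that avoids exact ties between samples.
    """
    x = [math.sin(2 * math.pi * freq * i / fs + phase) for i in range(n)]
    # A sample is an extremum when the slope changes sign around it.
    idx = [i for i in range(1, n - 1)
           if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0]
    return [b - a for a, b in zip(idx, idx[1:])]

# 1 kHz at 44.1 kHz: a half cycle is 22.05 samples, so the spacing is not constant.
uneven = set(extremum_spacings(1000.0, 44100.0, 2000))
# 4.41 kHz at 44.1 kHz: a half cycle is exactly 5 samples.
even = set(extremum_spacings(4410.0, 44100.0, 200))
print(uneven, even)
```

For a tone whose half cycle is not a whole number of samples, the spacing alternates between neighbouring integers, which is the unevenness the passage describes.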

In view of this problem, an object of the present invention is to provide an audio processing apparatus, an audio processing method, and an audio processing program that can make sound quality improvement uniform by applying the improvement processing uniformly to each signal contained in a digital audio signal.

To solve the above problem, the audio processing apparatus of the present invention comprises: a signal separation unit that performs frequency analysis of an input digital audio signal and separates the digital audio signal into one or more fundamental wave signals and a residual signal from which the one or more fundamental wave signals have been removed; a correction signal addition unit that, for each of the one or more fundamental wave signals, generates a correction signal that enlarges the absolute value of the amplitude and adds it to the fundamental wave signal; and a residual signal addition unit that adds the residual signal to the one or more fundamental wave signals to which the respective correction signals have been added. The one or more fundamental wave signals are a plurality of fundamental wave signals of mutually different frequencies. The signal separation unit obtains the difference signals that result when a plurality of fundamental wave signals having the same frequencies as the one or more fundamental wave signals are each individually subtracted from the digital audio signal, and sequentially subtracts the one or more fundamental wave signals from the digital audio signal in ascending order of the energy of the difference signals, thereby separating the digital audio signal into the one or more fundamental wave signals and the residual signal.

The audio processing apparatus may further comprise a framing unit that cuts the digital audio signal into predetermined frame units and generates a digital audio signal for each frame, and an overlap synthesis unit that combines input frame-unit digital audio signals so that the digital audio signals of adjacent frames partially overlap. In that case, the digital audio signal input to the signal separation unit is the digital audio signal divided into the predetermined frames by the framing unit, and the frame-unit digital audio signal input to the overlap synthesis unit may be supplied from the residual signal addition unit.

The one or more fundamental wave signals described above may each be a signal expressed by a predetermined frequency and by the respective amplitudes of a sine wave and a cosine wave having that frequency.

The correction signal addition unit may generate the correction signal according to the frequency, the sine-wave amplitude, and the cosine-wave amplitude of each of the one or more fundamental wave signals. Specifically, the correction signal addition unit may refer to a correction table in which the frequency of a fundamental wave signal is associated in advance with the correction signal values at each sample position for a sine wave and a cosine wave of amplitude 1, extract those values according to the frequency of each of the one or more fundamental wave signals, and multiply them by the sine-wave amplitude and the cosine-wave amplitude of that fundamental wave signal to generate the correction signal.

To solve the above problem, the audio processing method of the present invention comprises: performing frequency analysis of an input digital audio signal; obtaining the difference signals that result when a plurality of fundamental wave signals, having the same frequencies as one or more fundamental wave signals of mutually different frequencies, are each individually subtracted from the digital audio signal; sequentially subtracting the one or more fundamental wave signals from the digital audio signal in ascending order of the energy of the difference signals, thereby separating the digital audio signal into the one or more fundamental wave signals and a residual signal from which the one or more fundamental wave signals have been removed; generating, for each of the one or more fundamental wave signals, a correction signal that enlarges the absolute value of the amplitude and adding it to the fundamental wave signal; and adding the residual signal to the one or more fundamental wave signals to which the respective correction signals have been added.

To solve the above problem, the audio processing program of the present invention causes a computer to execute: a signal separation step of performing frequency analysis of an input digital audio signal, obtaining the difference signals that result when a plurality of fundamental wave signals, having the same frequencies as one or more fundamental wave signals of mutually different frequencies, are each individually subtracted from the digital audio signal, and sequentially subtracting the one or more fundamental wave signals from the digital audio signal in ascending order of the energy of the difference signals, thereby separating the digital audio signal into the one or more fundamental wave signals and a residual signal from which the one or more fundamental wave signals have been removed; a correction signal generation step of generating, for each of the one or more fundamental wave signals, a correction signal that enlarges the absolute value of the amplitude and adding it to the fundamental wave signal; and a residual signal addition step of adding the residual signal to the one or more fundamental wave signals to which the respective correction signals have been added.

As described above, according to the present invention, sound quality improvement can be made uniform by applying the improvement processing uniformly to each signal contained in a digital audio signal.

An explanatory diagram illustrating how the audio processing apparatus is used.
A functional block diagram illustrating the schematic configuration of the audio processing apparatus.
An explanatory diagram illustrating how the framing unit generates frame signals.
An explanatory diagram showing an example of the predetermined number of frequencies used as frequency analysis candidates.
A functional block diagram illustrating a more specific configuration of the correction signal addition unit.
A coefficient table showing the relationship between the number of samples and the coefficients.
An explanatory diagram illustrating the operation of the sound quality improvement processing by the correction signal addition unit.
An explanatory diagram illustrating the operation of the sound quality improvement processing by the correction signal addition unit.
An explanatory diagram illustrating the sound quality improvement processing in the correction signal addition unit.
An explanatory diagram illustrating the operation of the overlap synthesis unit.
A functional block diagram showing a typical example of a computer.
A flowchart showing the overall flow of the audio analysis and synthesis method.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. The dimensions, materials, and other specific numerical values shown in the embodiments are merely examples to facilitate understanding of the invention and do not limit the present invention unless otherwise stated. In this specification and the drawings, elements having substantially the same function and configuration are given the same reference numerals and are not described again, and elements not directly related to the present invention are not illustrated.

(Audio processing apparatus 100)
FIG. 1 is an explanatory diagram illustrating how the audio processing apparatus 100 is used. The audio processing apparatus 100 acquires a digital audio signal from the broadcast station 102 via broadcast waves, from the content server 104 via the communication network 106, or directly from the storage medium 108, and improves the sound quality of the digital audio signal by adding high-frequency components to it. The user can listen to the improved digital audio signal directly on the audio processing apparatus 100 or after transferring it to a playback device 110 such as a portable audio player or a mobile phone.

The content server 104 may also include the audio processing apparatus 100. In that case, the audio signal to which high-frequency components have been added by the audio processing apparatus 100 of the content server 104 is distributed via the communication network 106 to a playback device 110 such as a personal computer, a portable audio player, or a mobile phone.

A playback device 110 such as a portable audio player or a mobile phone may also include the audio processing apparatus 100. In that case, the digital audio signal distributed from the content server 104 via the communication network 106 is played back after high-frequency components have been added by the audio processing apparatus 100 of the playback device 110.

The digital audio signals that the audio processing apparatus 100 can acquire include audio signals conforming to the CD and DVD (Digital Versatile Disc) standards, as well as audio signals whose frequency band has been narrowed by audio coding such as MPEG (Moving Picture Experts Group)-2, AAC (Advanced Audio Coding), HE-AAC (High Efficiency AAC), ATRAC (Adaptive TRansform Acoustic Coding), MP3 (MPEG Audio Layer-3), and WMA (Windows (registered trademark) Media Audio). Here, the functions of the audio processing apparatus 100 are described using, as an example of the input digital audio signal, a digital audio signal with a sampling frequency fs = 44.1 kHz and 16-bit quantization (the CD standard).

FIG. 2 is a functional block diagram illustrating the schematic configuration of the audio processing apparatus 100. The audio processing apparatus 100 includes a framing unit 120, a signal separation unit 122, a correction signal addition unit 124, a residual signal addition unit 126, and an overlap synthesis unit 128.

The framing unit 120 sequentially cuts the digital audio signal acquired by the audio processing apparatus 100 into predetermined frame units (a predetermined number of samples in length), which are the processing units, and generates frame-unit digital audio signals (hereinafter simply called frame signals).

FIG. 3 is an explanatory diagram illustrating how the framing unit 120 generates frame signals. As shown in FIG. 3, when a single continuous digital audio signal is input, the framing unit 120 first cuts out only a portion of the input signal, digital audio signal A, delimited by a predetermined length, and generates frame signal 0. Because no digital audio signal precedes digital audio signal A, frame signal 0, which has the predetermined length in samples and contains digital audio signal A, is formed from null values followed by digital audio signal A, as shown in FIG. 3. For use in the next frame signal, the framing unit 120 also temporarily holds in a buffer (not shown) the rear signal A', which is the data of a predetermined length running from a predetermined position in digital audio signal A to its end.

Next, as further digital audio signal is input, the framing unit 120 cuts out digital audio signal B, joins the held rear signal A' of digital audio signal A and digital audio signal B in that order, and generates frame signal 1 of the predetermined length in samples. The framing unit 120 then repeats this process, generating frame signal 2 from the rear signal B' of digital audio signal B and the next cut-out digital audio signal C, and so on.

Each frame signal generated by the framing unit 120 therefore partially overlaps the frame signals before and after it. For example, frame signal 0 and frame signal 1 overlap in the data corresponding to the rear signal A'. The signal separation unit 122, correction signal addition unit 124, residual signal addition unit 126, and overlap synthesis unit 128 that follow all operate on these frame signals (frame-unit digital audio signals). Although the length of the overlapping rear signals A', B', ... is drawn here as 1/3 of the length of a frame signal, it is not limited to this and may be any value up to 1/2.
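The framing procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the names `make_frames`, `frame_len`, and `overlap` are hypothetical, null values are represented by 0, and a trailing segment too short to fill a frame is simply dropped.

```python
def make_frames(signal, frame_len, overlap):
    """Cut `signal` into frames of `frame_len` samples that share `overlap` samples.

    Frame 0 is `overlap` null samples followed by segment A; every later frame is
    the rear part of the previous segment (A', B', ...) followed by the next segment.
    """
    if not 0 < overlap <= frame_len // 2:
        raise ValueError("overlap must be positive and at most half the frame length")
    hop = frame_len - overlap          # number of new samples per frame
    frames = []
    tail = [0] * overlap               # null values preceding segment A
    for start in range(0, len(signal) - hop + 1, hop):
        segment = signal[start:start + hop]
        frames.append(tail + segment)
        tail = segment[-overlap:]      # rear signal kept for the next frame
    return frames

frames = make_frames(list(range(1, 13)), frame_len=6, overlap=2)
print(frames)  # [[0, 0, 1, 2, 3, 4], [3, 4, 5, 6, 7, 8], [7, 8, 9, 10, 11, 12]]
```

With a frame length of 6 and an overlap of 2, the overlap is 1/3 of the frame, matching the proportion drawn in FIG. 3.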

In this embodiment, the frame signals generated in this way are recombined, with overlap, by the overlap synthesis unit 128 described later. The overlapping portions make it possible to ensure the continuity of the digital audio signal, and also the continuity of the correction signal that forms the high-frequency components newly generated according to this embodiment. The edge effects caused by cutting out frame signals can thus be avoided, and a stable sound quality improvement effect can be obtained.

The framing unit 120 then sends the generated frame signals one after another to the signal separation unit 122.

The signal separation unit 122 performs frequency analysis of each frame signal received from the framing unit 120 and separates the frame signal into one or more fundamental wave signals and a residual signal from which the one or more fundamental wave signals have been removed. In this embodiment, the signal separation unit 122 separates the fundamental wave signals and the residual signal using generalized harmonic analysis (GHA).

Compared with the fast Fourier transform (FFT), which is widely used for frequency analysis, generalized harmonic analysis imposes a heavier computational load, but it is advantageous in that (1) its frequency analysis accuracy is higher than that of the FFT, (2) it can suppress noise, and (3) it can predict the waveform outside the frame signal being analyzed.

Moreover, when the frequency analysis of a frame signal is performed with the FFT, the frame signal is treated as a periodic function whose period is the frame, so discontinuities arise at the frame ends and new frequency components that are not present in the original digital audio signal are detected. If a window function is applied to ensure continuity at the ends of the frame signal, the FFT analysis result is always affected by the window function.

In the generalized harmonic analysis of this embodiment, by contrast, the combination of sine and cosine waves that minimizes the residual energy is derived from the frame signal, so frequency analysis can be performed with a high frequency resolution that does not depend on the time resolution. It is therefore most desirable for the signal separation unit 122 to separate the fundamental wave signals and the residual signal using generalized harmonic analysis, but the invention is not limited to this and various frequency analysis methods can be used.

In accordance with this generalized harmonic analysis, the signal separation unit 122 first determines, based on the sampling frequency fs, a predetermined number of mutually different frequencies f_k (k is an integer) as frequency analysis candidates. It then subtracts the fundamental wave signal b_k[i] of each determined frequency f_k (i is an integer from 0 to L-1, where L is the number of samples in the frame signal) individually from the frame signal to obtain a difference signal e_k[i], and derives the energy E_k of the difference signal as its sum of squares.

The signal separation unit 122 may acquire the sampling frequency information that a decoder (not shown) extracts when decoding the digital audio signal, and determine the predetermined number of mutually different candidate frequencies f_k (k is an integer) according to that sampling frequency. However, when the audio processing apparatus 100 of this embodiment is used in a playback device in which the sampling frequency of the input digital audio signal is always constant, such as a CD player, the signal separation unit 122 does not necessarily need to acquire the sampling frequency information.

FIG. 4 is an explanatory diagram showing an example of the predetermined number of frequencies used as frequency analysis candidates. The frequencies selected here are those whose waveforms have the same number of samples in the half cycles before and after each extremum. Because the sampling frequency fs is 44.1 kHz in this embodiment, such a frequency f_k is fs/2/FS, the value obtained by dividing half the sampling frequency fs by the number of samples per half cycle FS (FS is an integer).

However, by the sampling theorem, the frequency component at f_1 (22.05 kHz), corresponding to FS = 1, is not contained in the frame signal x_0[i] to be processed (i is an integer from 0 to L-1). As shown in FIG. 4, the frequencies f_k are therefore limited to those for which FS = 2, 3, 4, .... In this embodiment, the nine mutually different frequencies f_2 to f_10, corresponding to FS = 2, 3, 4, ..., 10, are used as the frequency analysis candidates. The reason for choosing frequencies f_k with the same number of samples in the half cycles before and after each extremum is given later.
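The candidate frequencies follow directly from f_k = fs/2/FS. A short sketch, in which the function name `candidate_frequencies` and its parameter names are illustrative assumptions:

```python
def candidate_frequencies(fs, min_count=2, max_count=10):
    """Map each half-cycle sample count FS to its candidate frequency fs/2/FS in Hz."""
    return {n: fs / 2 / n for n in range(min_count, max_count + 1)}

# CD-standard sampling frequency used in the embodiment.
freqs = candidate_frequencies(44100.0)
for n, f in freqs.items():
    print(f"FS = {n:2d}: f = {f:.1f} Hz")
```

For fs = 44.1 kHz this yields nine candidates, from 11025 Hz (FS = 2) down to 2205 Hz (FS = 10).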

The fundamental wave signal b_k[i] of frequency f_k can be expressed by Formula 1, where i = 0 to L-1 and k = 2, 3, 4, ..., 10.

... (Formula 1)
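The body of Formula 1 is not reproduced in this text. Given the earlier description of a fundamental wave signal as a sine wave and a cosine wave of a predetermined frequency, each with its own amplitude, one plausible form is the following. This is an assumption for illustration, not the formula as published:

```latex
b_k[i] = S(f_k)\,\sin\!\left(\frac{2\pi f_k\, i}{f_s}\right)
       + C(f_k)\,\cos\!\left(\frac{2\pi f_k\, i}{f_s}\right),
\qquad i = 0, 1, \dots, L-1
```

With f_k = fs/2/FS, the phase step per sample would simplify to \pi / FS under this assumed form.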

The signal separation unit 122 selects the frequencies shown in FIG. 4 in ascending order of FS and, for the frame signal x_0[i] to be processed, derives the sine-wave amplitude S(f_k) of the fundamental wave signal b_k[i] using Formula 2 and the cosine-wave amplitude C(f_k) using Formula 3, where k = 2, 3, 4, ..., 10.

... (Formula 2)

... (Formula 3)

The amplitudes S(f_k) and C(f_k) derived in this way are substituted into Formula 1 to obtain the fundamental wave signal b_k[i], and each fundamental wave signal b_k[i] is individually subtracted from the frame signal x_0[i] to be processed, as in Formula 4, to obtain the difference signal e_k[i].

... (Formula 4)

The energy E_k of the difference signal e_k[i] is then derived as a sum of squares, as in Formula 5, and temporarily held in association with the frequency f_k.

... (Formula 5)

Here, the smaller the energy E_k of a derived difference signal e_k[i], the higher the occupancy (degree) to which the fundamental wave signal b_k[i] of that frequency f_k is contained in the frame signal x_0[i] to be processed. The signal separation unit 122 calculates the energy E_k of the difference signal e_k[i] for all nine frequencies f_k = fs/2/FS (FS = 2, 3, 4, ..., 10) shown in FIG. 4.

The energy E_k of each difference signal e_k[i] is obtained individually because, to minimize the final residual signal that remains after all of the one or more fundamental wave signals b_k[i] have been removed, generalized harmonic analysis requires that the fundamental wave signals b_k[i] with the highest occupancy be separated from the frame signal x_0[i] first. The signal separation unit 122 therefore sorts the nine frequencies f_k in ascending order of the difference energy E_k, that is, in descending order of the occupancy of the corresponding fundamental wave signals b_k[i].
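The ranking step can be sketched as below. Because Formulas 1 to 5 are not reproduced in this text, the sketch makes several assumptions that are its own and not the patent's: the fundamental wave is modelled as S*sin(omega*i) + C*cos(omega*i) with omega = pi/FS, the amplitudes are obtained by an ordinary least-squares fit (a stand-in for Formulas 2 and 3), and all function names are hypothetical.

```python
import math

def fit_amplitudes(x, omega):
    """Least-squares amplitudes (S, C) for x[i] ~ S*sin(omega*i) + C*cos(omega*i)."""
    ss = cc = sc = xs = xc = 0.0
    for i, v in enumerate(x):
        s, c = math.sin(omega * i), math.cos(omega * i)
        ss += s * s
        cc += c * c
        sc += s * c
        xs += v * s
        xc += v * c
    det = ss * cc - sc * sc            # non-zero for FS >= 2 and len(x) >= 2
    return (xs * cc - xc * sc) / det, (xc * ss - xs * sc) / det

def difference_energy(x, omega):
    """Energy of x minus its best-fitting sinusoid at omega (sum of squares)."""
    s_amp, c_amp = fit_amplitudes(x, omega)
    return sum((v - s_amp * math.sin(omega * i) - c_amp * math.cos(omega * i)) ** 2
               for i, v in enumerate(x))

def rank_candidates(x, counts=range(2, 11)):
    """Return FS values sorted by ascending difference energy, plus the energies."""
    energies = {n: difference_energy(x, math.pi / n) for n in counts}
    return sorted(energies, key=energies.get), energies

# Test frame: a strong FS = 5 tone plus a weaker FS = 3 tone (illustrative values).
L = 5040
x = [0.8 * math.sin(math.pi * i / 5) + 0.3 * math.cos(math.pi * i / 3) for i in range(L)]
order, energies = rank_candidates(x)
print(order[:2])
```

In this example the strongest component leaves the smallest difference energy when removed, so it comes first in the ordering, which is the sense in which a small E_k indicates a high occupancy.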

Subsequently, the signal separation unit 122 sequentially subtracts the nine fundamental wave signals b_k[i] corresponding to the nine frequencies f_k from the frame signal x_0[i], which is the original signal, in the order of the sorted frequencies f_k. In the step of deriving the difference signals e_k[i] described above, each fundamental wave signal b_k[i] was subtracted afresh from the original frame signal x_0[i] every time. Here, by contrast, once one fundamental wave signal b_k[i] has been subtracted from the frame signal x_0[i], the amplitudes S(f_k) and C(f_k) of the next fundamental wave signal b_k[i] are derived anew from the residual signal d[i] left after that subtraction, using Equations 2 and 3, and that fundamental wave signal b_k[i] is then subtracted. The amplitudes S(f_k) and C(f_k) of a fundamental wave signal b_k[i] therefore change depending on the order of subtraction. The fundamental wave signal b_k[i] used for sorting and the fundamental wave signal b_k[i] subtracted sequentially from the frame signal x_0[i] are alike in that both are expressed by a sine wave and a cosine wave having the predetermined frequency f_k; only the amplitudes of the sine wave and cosine wave differ. The fundamental wave signals b_k[i] used for sorting are no longer used once sorting is complete, and the fundamental wave signals b_k[i] with the updated amplitudes S(f_k) and C(f_k) are used as the final fundamental wave signals b_k[i] in the subsequent processing. The residual signal d[i] is derived through these subtractions and can be expressed as Equation 6, where i is 0 to L-1 and k is 2, 3, 4, ..., 10.

$$d[i] = x_0[i] - \sum_{k=2}^{10} b_k[i]$$ ... (Equation 6)

こうしてフレーム信号ｘ_０［ｉ］から占有率が高い基本波信号ｂ_ｋ［ｉ］が順次分離され、残差信号ｄ［ｉ］のエネルギーは漸減する。 In this way, the fundamental signal b _k [i] having a high occupation rate is sequentially separated from the frame signal x ₀ [i], and the energy of the residual signal d [i] is gradually reduced.

Thus, by separating first the fundamental wave signals b_k[i] that have high occupancy in the frame signal x_0[i], the frame signal x_0[i] can be appropriately represented by a combination of one or more fundamental wave signals b_k[i], and the residual signal d[i] can be kept to a minimum.

Here, because the predetermined number (nine here) of frequencies f_k that serve as frequency analysis candidates is uniquely determined from the sampling frequency fs (for example, 44.1 kHz) as shown in FIG. 4, a fundamental wave table that uniquely associates the predetermined number of frequencies f_k with the fundamental wave signals b_k[i] can be created in advance for each sampling frequency fs. The fundamental wave table, however, holds only the values of the sine wave and cosine wave at each sample i when the amplitudes S(f_k) and C(f_k) are set to a predetermined value (for example, 1); the signal separation unit 122 derives the fundamental wave signal b_k[i] by multiplying these values by the amplitudes S(f_k) and C(f_k). Such a fundamental wave table reduces the computational load. The table may be held in a memory (not shown) or acquired from the communication network 106.
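As a minimal sketch of such a fundamental wave table, the following assumes that a fundamental wave signal has the form S·sin + C·cos with initial phase 0, and that the table is keyed by the divisor FS rather than by f_k itself. The function names and the dictionary layout are illustrative choices, not taken from the patent.

```python
import math

def build_fundamental_table(frame_len, divisors=range(2, 11)):
    """Unit-amplitude sine and cosine rows for each divisor FS.

    With f_k = fs / 2 / FS, the phase advance per sample is
    2*pi*f_k/fs = pi / FS, so the rows depend only on FS.
    """
    table = {}
    for fs_div in divisors:
        step = math.pi / fs_div
        sin_row = [math.sin(step * i) for i in range(frame_len)]
        cos_row = [math.cos(step * i) for i in range(frame_len)]
        table[fs_div] = (sin_row, cos_row)
    return table

def fundamental_from_table(table, fs_div, amp_s, amp_c):
    """Scale the stored unit rows by the amplitudes S(f_k) and C(f_k)."""
    sin_row, cos_row = table[fs_div]
    return [amp_s * s + amp_c * c for s, c in zip(sin_row, cos_row)]
```

Only multiplications and additions remain at run time, which is the load reduction the paragraph above refers to.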

The signal separation unit 122 continues to subtract the fundamental wave signals b_k[i] in the sorted order. Once the fundamental wave signals b_k[i] for all frequencies f_k prepared as frequency analysis candidates have been subtracted, it transmits the residual signal d[i] to the residual signal adding unit 126 as the final residual signal.

Even if the fundamental wave signals b_k[i] for all frequencies f_k prepared as frequency analysis candidates have not yet been subtracted, the frame signal x_0[i] is regarded as sufficiently separated once the energy of the residual signal d[i] has become sufficiently small, for example once it falls to or below a predetermined energy. In that case the separation of fundamental wave signals b_k[i] stops at that point, and the residual signal d[i] is transmitted to the residual signal adding unit 126.
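The ranking, sequential subtraction, and early stop described above can be sketched as follows. Equations 1 to 3 are not reproduced in this section, so the sketch assumes the form S·sin + C·cos for a fundamental wave signal and uses an ordinary least-squares fit as a stand-in for Equations 2 and 3. The names `fit_and_subtract`, `separate`, and `stop_energy` are hypothetical.

```python
import math

def fit_and_subtract(signal, f, fs):
    """Least-squares fit of S*sin + C*cos at frequency f (assumed stand-in
    for Equations 2 and 3), returning the residual and both amplitudes."""
    n = len(signal)
    s = [math.sin(2 * math.pi * f * i / fs) for i in range(n)]
    c = [math.cos(2 * math.pi * f * i / fs) for i in range(n)]
    ss = sum(a * a for a in s)
    cc = sum(b * b for b in c)
    sc = sum(a * b for a, b in zip(s, c))
    xs = sum(x * a for x, a in zip(signal, s))
    xc = sum(x * b for x, b in zip(signal, c))
    det = ss * cc - sc * sc  # non-zero for frames of a few samples or more
    amp_s = (xs * cc - xc * sc) / det
    amp_c = (xc * ss - xs * sc) / det
    residual = [x - amp_s * a - amp_c * b for x, a, b in zip(signal, s, c)]
    return residual, amp_s, amp_c

def energy(signal):
    """Sum of squares, as in Equation 5."""
    return sum(v * v for v in signal)

def separate(frame, fs, stop_energy=0.0):
    """Return ([(f_k, S, C), ...], residual) for one frame."""
    candidates = [fs / 2 / d for d in range(2, 11)]
    # Rank by the energy left when each candidate alone is removed from
    # the ORIGINAL frame (smaller energy means higher occupancy).
    ordered = sorted(candidates,
                     key=lambda f: energy(fit_and_subtract(frame, f, fs)[0]))
    residual = list(frame)
    params = []
    for f in ordered:
        # Amplitudes are re-derived from the current residual each time.
        residual, amp_s, amp_c = fit_and_subtract(residual, f, fs)
        params.append((f, amp_s, amp_c))
        if energy(residual) <= stop_energy:
            break  # frame is regarded as sufficiently separated
    return params, residual
```

Only the parameter triples and the residual need to leave this stage, since each fundamental wave signal can be regenerated from its frequency and two amplitudes.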

このとき、基本波信号ｂ_ｋ［ｉ］は、それぞれ、所定の周波数と、所定の周波数を有する正弦波の振幅と、余弦波の振幅とで表される信号なので、信号分離部１２２は、基本波信号ｂ_ｋ［ｉ］そのものではなく、基本波信号ｂ_ｋ［ｉ］の周波数を示す周波数情報と、正弦波成分の振幅情報と、余弦波成分の振幅情報といったパラメータと、基本波信号ｂ_ｋ［ｉ］の個数情報とを補正信号加算部１２４に送信する。かかる構成により、信号分離部１２２と補正信号加算部１２４とのアクセス負荷を著しく軽減することができる。 At this time, the fundamental wave signal b _k [i] is a signal represented by a predetermined frequency, an amplitude of a sine wave having a predetermined frequency, and an amplitude of a cosine wave. Not the wave signal b _k [i] itself, but parameters such as frequency information indicating the frequency of the fundamental wave signal b _k [i], amplitude information of the sine wave component, amplitude information of the cosine wave component, and the fundamental wave signal b _k. The number information of [i] is transmitted to the correction signal adding unit 124. With this configuration, the access load between the signal separation unit 122 and the correction signal addition unit 124 can be significantly reduced.

また、信号分離部１２２は、数式６に示すように、フレーム信号ｘ_０［ｉ］から対象となる基本波信号ｂ_ｋ［ｉ］をすべて除いた残差信号ｄ［ｉ］を残差信号加算部１２６に送信する。 Further, as shown in Expression 6, the signal separation unit 122 adds a residual signal d [i] obtained by removing all the target fundamental wave signals b _k [i] from the frame signal x ₀ [i]. To the unit 126.

本実施形態においては、後述するように基本波信号ｂ_ｋ［ｉ］のみに音質改善処理が施され、残差信号ｄ［ｉ］には施されない。しかし、残差信号ｄ［ｉ］はエネルギー量としても無視可能な信号なので、残差信号ｄ［ｉ］に音質改善処理を施さなくとも、原信号であるフレーム信号ｘ_０［ｉ］の音質改善レベルに影響はなく、むしろ残差信号ｄ［ｉ］に音質改善処理を施す処理負荷を他の処理に有効活用できる。 In the present embodiment, as will be described later, only the fundamental wave signal b _k [i] is subjected to the sound quality improvement process, and is not applied to the residual signal d [i]. However, since the residual signal d [i] is a signal that can be ignored as the amount of energy, the sound quality of the frame signal x ₀ [i] that is the original signal is improved without performing the sound quality improvement process on the residual signal d [i]. There is no effect on the level, but rather the processing load for applying the sound quality improvement processing to the residual signal d [i] can be effectively utilized for other processing.

また、図４で示したように、基本波信号ｂ_ｋ［ｉ］の周波数ｆ_ｋを、極値の前後の半周期でサンプル数が同一となる周波数ｆ_ｋ＝ｆｓ／２／ＦＳとすることで、残差信号ｄ［ｉ］を除いたフレーム信号ｘ_０［ｉ］（１または複数の基本波信号ｂ_ｋ［ｉ］）を極値の前後の半周期でサンプル数が同一となる正弦波および余弦波で表すことができ、同一の正弦波や余弦波において、乗算する係数や加減算対象となるサンプル位置が異なるといった問題がなくなる。さらに、基本波信号ｂ_ｋ［ｉ］は、初期位相０の正弦波および余弦波のみで形成されるため、フレーム信号ｘ_０［ｉ］に対して補正信号を画一的に付加することができるので、音質改善の均一化を図ることが可能となる。 Further, as shown in FIG. 4, the frequency f _k of the fundamental wave signal b _k [i] is set to a frequency f _k = fs / 2 / FS at which the number of samples is the same in a half cycle before and after the extreme value. Thus, the sine wave having the same number of samples in the half period before and after the extreme value of the frame signal x ₀ [i] (one or more fundamental wave signals b _k [i]) excluding the residual signal d [i]. In the same sine wave or cosine wave, there is no problem that the coefficient to be multiplied or the sample position to be added or subtracted is different. Furthermore, since the fundamental wave signal b _k [i] is formed only by a sine wave and a cosine wave having an initial phase 0, a correction signal can be uniformly added to the frame signal x ₀ [i]. Therefore, it is possible to achieve uniform sound quality improvement.

補正信号加算部１２４は、信号分離部１２２が分離した１または複数の基本波信号それぞれに対し、音圧０を中心とした振幅の絶対値が拡大されるような補正信号を生成して基本波信号に加算する。 The correction signal adding unit 124 generates a correction signal for each of one or a plurality of fundamental wave signals separated by the signal separation unit 122 so that the absolute value of the amplitude centered on the sound pressure 0 is expanded to generate a fundamental wave. Add to signal.

図５は、補正信号加算部１２４のさらに具体的な構成を説明するための機能ブロック図であり、図６は、サンプル数と係数との関係を示した係数テーブルであり、図７および図８は、補正信号加算部１２４による音質改善処理の動作を説明するための説明図である。補正信号加算部１２４は、極値特定部１５０と、サンプル数計数部１５２と、補正信号生成部１５４と、遅延部１５６と、加算部１５８とを含んで構成される。また、係数テーブルは図示しないメモリに保持されてもよく、通信網１０６から取得するとしてもよい。ここで、まず、補正信号加算部１２４で実行される音質改善処理の基本的動作を説明する。 FIG. 5 is a functional block diagram for explaining a more specific configuration of the correction signal adding unit 124. FIG. 6 is a coefficient table showing the relationship between the number of samples and the coefficients. These are explanatory drawings for explaining the operation of the sound quality improvement processing by the correction signal adding unit 124. The correction signal adding unit 124 includes an extreme value specifying unit 150, a sample number counting unit 152, a correction signal generating unit 154, a delay unit 156, and an adding unit 158. The coefficient table may be held in a memory (not shown) or may be acquired from the communication network 106. First, the basic operation of the sound quality improvement process executed by the correction signal adding unit 124 will be described.

極値特定部１５０は、補正信号加算部１２４が受信したフレーム信号ｘ_０［ｉ］（１または複数の基本波信号ｂ_ｋ［ｉ］）の極大値と極小値とを特定する。具体的に、極値特定部１５０は、フレーム信号ｘ_０［ｉ］の各サンプルにおける値を順次比較し、値が増加している状態または増減無しの状態から減少に転じたとき、その減少に転じる直前のサンプルにおける値を極大値とし、値が減少している状態または増減無しの状態から増加に転じたとき、その増加に転じる直前のサンプルにおける値を極小値とする。 The extreme value specifying unit 150 specifies the maximum value and the minimum value of the frame signal x ₀ [i] (one or a plurality of fundamental wave signals b _k [i]) received by the correction signal adding unit 124. Specifically, the extreme value specifying unit 150 sequentially compares the values in the respective samples of the frame signal x ₀ [i], and when the value changes from a state where the value is increasing or a state where there is no increase / decrease, the extreme value specifying unit 150 The value in the sample immediately before turning is set to the maximum value, and when the value is decreasing or the state in which there is no increase / decrease is changed to the increase, the value in the sample immediately before starting the increase is set to the minimum value.

サンプル数計数部１５２は、任意の極値（極大値または極小値）から次の極値までのサンプル数、すなわち、極大値から極小値まで、または極小値から極大値までのサンプル数を計数する。 The sample number counting unit 152 counts the number of samples from an arbitrary extreme value (maximum value or minimum value) to the next extreme value, that is, the number of samples from the maximum value to the minimum value, or from the minimum value to the maximum value. .
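The extreme value detection and sample counting described above might look like the following sketch. It reads the rule literally: a sample is a maximum when the signal was rising or flat before it and falls after it, and a minimum in the mirrored case. The function names are illustrative, and the first and last samples of a frame are simply skipped here.

```python
def find_extrema(x):
    """Return [(index, 'max' | 'min'), ...] for interior samples of x."""
    extrema = []
    for i in range(1, len(x) - 1):
        before = x[i] - x[i - 1]   # change leading into sample i
        after = x[i + 1] - x[i]    # change leaving sample i
        if before >= 0 and after < 0:
            extrema.append((i, "max"))
        elif before <= 0 and after > 0:
            extrema.append((i, "min"))
    return extrema

def samples_between(extrema):
    """Number of samples from each extreme value to the next one."""
    return [b[0] - a[0] for a, b in zip(extrema, extrema[1:])]
```

For a triangular test signal that peaks every eight samples, every count returned by `samples_between` is 4, matching the "4" used as the example sample count in the text.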

The correction signal generation unit 154 multiplies the amount of change between predetermined samples in the frame signal x_0[i] by one coefficient to generate a correction value that enlarges the absolute value of the amplitude of the digital audio signal, and places that value at a predetermined sample position to generate the correction signal.

例えば、図７の例では、補正信号生成部１５４は、図６の係数テーブルを参照し、図７（ａ）に示すフレーム信号ｘ_０［ｉ］に基づきサンプル数計数部１５２が計数した極大値から極小値まで、または極小値から極大値までの極値間のサンプル数、例えば「４」に対応した、係数「０．５」を抽出する。 For example, in the example of FIG. 7, the correction signal generation unit 154 refers to the coefficient table of FIG. 6, and the local maximum value counted by the sample number counting unit 152 based on the frame signal x ₀ [i] illustrated in FIG. The coefficient “0.5” corresponding to the number of samples between the extreme values from the minimum value to the minimum value or between the minimum value and the maximum value, for example, “4” is extracted.

ここで、図６の係数テーブルにおいて、サンプル数が多いほど係数の値が小さいのは以下の理由からである。すなわち、任意の極値から次の極値までのサンプル数が多い場合、そのフレーム信号ｘ_０［ｉ］の周波数は低く、例えば２２．１ｋＨｚの低域通過フィルタ（ＬＰＦ：Low Pass Filter）でフィルタリングが施されている場合であっても、その低周波数のフレーム信号ｘ_０［ｉ］の高調波は抑制されずに残る。したがって、大きな高周波数成分を付加しなくとも十分に高音質を維持できるので、係数は小さくて済む。 Here, in the coefficient table of FIG. 6, the larger the number of samples, the smaller the coefficient value is for the following reason. That is, when the number of samples from an arbitrary extreme value to the next extreme value is large, the frequency of the frame signal x ₀ [i] is low, for example, filtering with a 22.1 kHz low pass filter (LPF). Is applied, harmonics of the low-frequency frame signal x ₀ [i] remain without being suppressed. Therefore, a sufficiently high sound quality can be maintained without adding a large high-frequency component, and the coefficient can be small.

一方、任意の極値から次の極値までのサンプル数が少ない場合、そのフレーム信号ｘ_０［ｉ］の周波数は高く、例えば２２．１ｋＨｚの低域通過フィルタでフィルタリングが施されている場合に、その高周波数のフレーム信号ｘ_０［ｉ］の高調波はほとんど削減されている。したがって、高周波数成分を十分に付加しないと音質の改善を図ることができないので、係数は大きい必要がある。 On the other hand, when the number of samples from an arbitrary extreme value to the next extreme value is small, the frequency of the frame signal x ₀ [i] is high, for example, when filtering is performed by a low-pass filter of 22.1 kHz. The harmonics of the high-frequency frame signal x ₀ [i] are almost reduced. Therefore, the sound quality cannot be improved unless sufficient high-frequency components are added, so the coefficient needs to be large.

続いて、補正信号生成部１５４は、図７（ａ）に示すフレーム信号ｘ_０［ｉ］の極大値と１サンプリング前のサンプル値との差分値ｄｌに、係数テーブルから抽出した０．５を乗算した乗算結果Δｄｌを極大値のサンプル位置に配し、フレーム信号の極小値と１サンプリング前のサンプル値との差分値ｄｓに０．５を乗算した乗算結果Δｄｓを極小値のサンプル位置に配して図７（ｂ）に示す補正信号ｃｏ［ｉ］を生成する。 Subsequently, the correction signal generation unit 154 sets 0.5 extracted from the coefficient table to the difference value dl between the maximum value of the frame signal x ₀ [i] illustrated in FIG. The multiplied multiplication result Δdl is arranged at the sample position of the maximum value, and the multiplication result Δds obtained by multiplying the difference value ds between the minimum value of the frame signal and the sample value before one sampling by 0.5 is arranged at the sample position of the minimum value. Then, the correction signal co [i] shown in FIG. 7B is generated.

また、ここでは、乗算結果Δｄｌ、Δｄｓを極大値や極小値のサンプル位置に加減算するような補正信号が生成されているが、加減算対象となるサンプル位置は、かかる場合に限らず、例えば、極大値や極小値の前後所定数のサンプル位置に加減算することもできる。 Further, here, a correction signal for adding and subtracting the multiplication results Δdl and Δds to and from the sample position of the maximum value or the minimum value is generated. However, the sample position to be added or subtracted is not limited to this, for example, the maximum It is also possible to add / subtract to a predetermined number of sample positions before and after the value or minimum value.

例えば、補正信号生成部１５４は、図８（ａ）に示すフレーム信号ｘ_０［ｉ］の極大値と１サンプリング前のサンプル値との差分値ｄｌに、係数テーブルから抽出した０．５を乗算した乗算結果Δｄｌを極大値の前後１のサンプル位置に配し、フレーム信号の極小値と１サンプリング前のサンプル値との差分値ｄｓに０．５を乗算した乗算結果Δｄｓを極小値の前後１のサンプル位置に配して図８（ｂ）に示す補正信号ｃｏ［ｉ］を生成する。また、極大値や極小値のサンプル位置と極大値や極小値の前後所定数のサンプル位置にそれぞれ乗算結果Δｄｌ、Δｄｓを配して、図７（ｂ）と図８（ｂ）とを合成した補正信号を生成することも可能である。 For example, the correction signal generation unit 154 multiplies the difference value dl between the maximum value of the frame signal x ₀ [i] shown in FIG. 8A and the sample value before one sampling by 0.5 extracted from the coefficient table. The obtained multiplication result Δdl is arranged at one sample position before and after the maximum value, and the multiplication result Δds obtained by multiplying the difference value ds between the minimum value of the frame signal and the sample value before one sampling by 0.5 is set to 1 before and after the minimum value. The correction signal co [i] shown in FIG. 8B is generated at the sample positions. Further, the multiplication results Δdl and Δds are arranged at the sample positions of the local maximum value and the local minimum value and a predetermined number of sample positions before and after the local maximum value and the local minimum value, respectively, and FIG. 7B and FIG. It is also possible to generate a correction signal.
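A minimal sketch of the correction signal in FIG. 7 and FIG. 8 follows. The coefficient is passed in directly because the full coefficient table of FIG. 6 is not reproduced here (only the pair 4 → 0.5 is given), and the `offsets` parameter is an illustrative way of selecting between placing the value at the extreme itself or at its neighbours.

```python
def correction_signal(x, extrema_idx, coeff, offsets=(0,)):
    """Build co[i] from the step into each extreme value.

    offsets=(0,)    places the value at the extreme itself (FIG. 7 style).
    offsets=(-1, 1) places it one sample before and after (FIG. 8 style).
    """
    co = [0.0] * len(x)
    for i in extrema_idx:
        # Difference from the sample one sampling period earlier:
        # positive at a maximum, negative at a minimum.
        delta = coeff * (x[i] - x[i - 1])
        for off in offsets:
            j = i + off
            if 0 <= j < len(x):
                co[j] += delta
    return co

def apply_correction(x, co):
    """Add the correction signal to the (delayed) input signal."""
    return [a + b for a, b in zip(x, co)]
```

Because the sign of the difference follows the direction into the extreme, adding `co` pushes maxima up and minima down, which enlarges the absolute amplitude as the text describes.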

このように、特定の高周波数成分を付加するための複雑な計算を伴うことなく、任意のサンプル位置における振幅の絶対値を大きくするといった単純な処理で高周波数成分を付加する構成により、処理負荷を軽減しつつ音質の改善を図ることが可能となる。 In this way, the processing load is increased by a simple process of increasing the absolute value of the amplitude at an arbitrary sample position without complicated calculation for adding a specific high frequency component. It is possible to improve sound quality while reducing noise.

遅延部１５６は、原信号となるフレーム信号ｘ_０［ｉ］を、極値特定部１５０、サンプル数計数部１５２、補正信号生成部１５４での処理時間分だけ遅延させ、図７（ａ）と図７（ｂ）や図８（ａ）と図８（ｂ）のようにフレーム信号ｘ_０［ｉ］と補正信号ｃｏ［ｉ］とを同期させる。 The delay unit 156 delays the frame signal x ₀ [i], which is the original signal, by the processing time in the extreme value specifying unit 150, the sample number counting unit 152, and the correction signal generating unit 154, as shown in FIG. As shown in FIG. 7B, FIG. 8A, and FIG. 8B, the frame signal x ₀ [i] and the correction signal co [i] are synchronized.

The adding unit 158 adds, for example, the correction signal co[i] shown in FIGS. 7(b) and 8(b) to the frame signal x_0[i] shown in FIGS. 7(a) and 8(a), thereby generating the frame signal x_0'[i] to which the sound quality improvement processing has been applied, as shown in FIGS. 7(c) and 8(c). In the present embodiment, adding the correction signal so that the waveform approaches a rectangular wave in this way extends the high-frequency components and improves the sound quality.

However, if such sound quality improvement processing is applied indiscriminately, the frequency of each signal contained in the frame signal x_0[i] has no predetermined relationship to the sampling frequency fs. As a result, even if the frame signal x_0[i] consisted only of a regular sine wave, if its frequency differs from the sampling frequency fs, the number of samples from a minimum to a maximum differs from the number of samples from a maximum to a minimum. The coefficient to be multiplied and the sample positions to be added to or subtracted from then differ from one half period to the next, and the sound quality improvement processing becomes biased. For example, when one full period of the frame signal x_0[i] spans "7" samples, one half period spans "4" samples and the other "3", and the correction amount is biased according to the number of samples.
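The imbalance in the 7-sample example can be checked numerically with a short sketch. It samples a zero-phase sine wave whose period is given in samples and lists the distance between successive extreme values; the function name and the zero-phase assumption are illustrative.

```python
import math

def half_period_gaps(period, n_samples):
    """Sample counts between successive extremes of a sampled sine wave."""
    x = [math.sin(2 * math.pi * i / period) for i in range(n_samples)]
    # A sample is an extreme when the direction of change flips around it.
    extrema = [i for i in range(1, n_samples - 1)
               if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0]
    return [b - a for a, b in zip(extrema, extrema[1:])]
```

A period of 7 samples yields gaps that alternate between 3 and 4, whereas a period of 8 samples (FS = 4) yields 4 every time, so the same coefficient and sample positions apply to every half period.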

本実施形態においては、上述したように、音質改善処理の対象を、フレーム信号ｘ_０［ｉ］ではなく、そのフレーム信号ｘ_０［ｉ］に含まれる、極値の前後の半周期でサンプル数が同一となる周波数に基づく基本波信号ｂ_ｋ［ｉ］としているので、画一的かつ均一に音声改善処理を施すことができる。 In the present embodiment, as described above, the target of the sound quality improvement process is not the frame signal x ₀ [i], but the number of samples in the half cycle before and after the extreme value included in the frame signal x ₀ [i]. Since the fundamental wave signals b _k [i] are based on the same frequency, the sound improvement processing can be performed uniformly and uniformly.

For example, taking one of the one or more fundamental wave signals b_k[i] input to the correction signal adding unit 124 in the present embodiment, the maximum and minimum values that the extreme value specifying unit 150 should identify can be determined from the amplitude information of the sine wave component and the cosine wave component of that fundamental wave signal b_k[i], and the sample positions of the maximum and minimum values can be determined from the frequency information indicating its frequency.

また、サンプル数計数部１５２が特定すべきサンプル数も、周波数情報から図４を参照して一意に決定することができる。したがって、補正信号生成部１５４が生成すべき補正信号も、基本波信号ｂ_ｋ［ｉ］の各情報から一意に導き出すことが可能となる。 Also, the number of samples to be specified by the sample number counting unit 152 can be uniquely determined from the frequency information with reference to FIG. Therefore, the correction signal to be generated by the correction signal generation unit 154 can be uniquely derived from each information of the fundamental wave signal b _k [i].

As described above, the fundamental wave signals b_k[i] are formed only from a predetermined number of frequencies f_k obtained by dividing one half of the sampling frequency fs by an integer. Consequently, not only is the number of samples the same in the half periods before and after an extreme value, but the start and end points of the sine wave and cosine wave also fall on sample points. The correction signal adding unit 124 can therefore generate the correction signal by the simple process of adding a uniform correction value.

FIG. 9 is an explanatory diagram for explaining the sound quality improvement processing in the correction signal adding unit 124 of the present embodiment. For example, in the sound quality improvement processing of the sine wave sin[i] shown in FIG. 9(a), a maximum value and a minimum value appear every FS samples (4 here), where the sample count FS is obtained from the frequency f_k. Likewise, in the correction of the cosine wave cos[i] shown in FIG. 9(b), a maximum value and a minimum value appear every FS samples. The sample positions to be added to or subtracted from and the coefficient are also determined by the sample count FS, and the value to be added or subtracted is uniquely determined by the amplitude. The correction signal for the sine wave sin[i] and the cosine wave cos[i] is thus uniquely derived from the information on the fundamental wave signal b_k[i]. The correction signal adding unit 124 can therefore apply the sound quality improvement processing uniformly, as in FIGS. 9(a) and 9(b), according to the frequency f_k of each fundamental wave signal, the amplitude of the sine wave sin[i], and the amplitude of the cosine wave cos[i].

ここでは、サンプル数ＦＳが偶数の場合を説明したが、サンプル数ＦＳが奇数の場合も同様に画一的な補正信号を生成することができる。 Although the case where the sample number FS is an even number has been described here, a uniform correction signal can be generated in the same manner when the sample number FS is an odd number.

Further, since the value of the correction signal co[i] at each sample position of the sine wave and cosine wave is uniquely determined by the frequency f_k of the fundamental wave signal b_k[i], the correction signal adding unit 124 can also prepare in advance a correction table that associates each frequency f_k with the values of the correction signal co[i] at each sample position for a sine wave and a cosine wave of amplitude 1. Such a correction table may be held in a memory (not shown) or acquired from the communication network 106. The correction signal adding unit 124 then refers to the correction table, extracts for each of the one or more fundamental wave signals b_k[i] the unit-amplitude values of the correction signal co[i] at each sample position according to its frequency f_k, and multiplies them by the sine wave amplitude and the cosine wave amplitude of that fundamental wave signal b_k[i] to generate the correction signal co[i].

さらに、基本波信号に含まれる正弦波や余弦波と補正信号とが比例関係にあるので、正弦波や余弦波と補正信号とを予め加算した信号とを対応付けてテーブルを作成することも可能である。 Furthermore, since the sine wave or cosine wave included in the fundamental wave signal is proportional to the correction signal, it is possible to create a table by associating the sine wave or cosine wave with the signal obtained by adding the correction signal in advance. It is.

かかる残差信号ｄ［ｉ］を除く基本波信号ｂ_ｋ［ｉ］のみに対して補正信号ｃｏ［ｉ］を生成する構成により、補正信号ｃｏ［ｉ］を生成する際の処理負荷を著しく軽減することが可能となり、プログラムの簡素化を図ったり、処理能力の低い安価な処理装置を採用してコストの削減を図ることができる。 The processing load when generating the correction signal co [i] is significantly reduced by the configuration in which the correction signal co [i] is generated only for the fundamental wave signal b _k [i] excluding the residual signal d [i]. Therefore, the program can be simplified, and the cost can be reduced by using an inexpensive processing apparatus having a low processing capability.

また、すべての基本波信号ｂ_ｋ［ｉ］に対して、適切なサンプル位置に適切な係数を乗じた乗算結果を均等に加減算することが可能となり、また、フレーム信号の変化に拘わらず、基本波信号ｂ_ｋ［ｉ］の同じサンプル位置に振幅に比例する同じ補正値を加えることができるので、偏りのない高周波数信号を付加することが可能となる。このように、デジタル音声信号に含まれる各信号に対して画一的に音声改善処理を施すことで、音質改善の均一化を図ることが可能となる。 Further, it becomes possible to add and subtract evenly the multiplication results obtained by multiplying the appropriate sample position by the appropriate coefficient with respect to all the fundamental wave signals b _k [i]. Since the same correction value proportional to the amplitude can be added to the same sample position of the wave signal b _k [i], it is possible to add a high-frequency signal without bias. As described above, the sound quality improvement can be made uniform by performing the sound improvement process uniformly on each signal included in the digital sound signal.

The residual signal adding unit 126 adds the one or more fundamental wave signals b_k[i] to which the correction signal co[i] has been added or subtracted by the correction signal adding unit 124 (frame signal x_0'[i]) and the residual signal d[i] separated by the signal separation unit 122, thereby reconstructing the frame signal. The reconstructed frame signal x_0''[i] is therefore expressed by Equation 7, where δs[i] and δc[i] denote the displacement amounts for a sine wave and a cosine wave of amplitude 1, respectively, i is 0 to L-1, and k is 2, 3, 4, ..., 10.

... (Formula 7)

オーバラップ合成部１２８は、残差信号加算部１２６において再構成されたフレーム信号と、１つ前のフレーム信号とを（隣り合うフレーム同士を）、デジタル音声信号の一部がオーバラップするように合成し、最終の出力信号を生成する。 The overlap synthesizing unit 128 combines the frame signal reconstructed by the residual signal adding unit 126 and the previous frame signal (adjacent frames) so that a part of the digital audio signal overlaps. Combine and generate the final output signal.

図１０は、オーバラップ合成部１２８の動作を説明するための説明図である。図１０中フレーム信号は、フレーム化部１２０によって生成された後、信号分離部１２２、補正信号加算部１２４および残差信号加算部１２６を経由した信号であり、Ａ、Ｂ、Ｃの英数字は、図３のデジタル音声信号Ａ、Ｂ、Ｃに対応している。 FIG. 10 is an explanatory diagram for explaining the operation of the overlap composition unit 128. The frame signal in FIG. 10 is a signal that is generated by the framing unit 120 and then passes through the signal separation unit 122, the correction signal addition unit 124, and the residual signal addition unit 126. The alphanumeric characters A, B, and C are , Corresponding to the digital audio signals A, B and C in FIG.

Specifically, the overlap synthesis unit 128 first multiplies the reconstructed frame signal x_0''[i] (frame signal 0, frame signal 1, frame signal 2, ...) by the window function W shown in FIG. 10. If the framing unit 120 has already applied a sine window, the overlap synthesis unit 128 also uses a sine window. If no window function was used in the framing unit 120, a Hanning window or a Blackman window is used. The window function is not limited to these cases; any of various existing window functions can be used, provided that when two frame signals overlap, the overlapped portions sum to a level equal to that of the non-overlapped portions.

図１０におけるフレーム信号１が入力されたときには、既にフレーム信号０のデジタル音声信号Ａが保持されており、オーバラップ合成部１２８は、フレーム信号０のデジタル音声信号Ａと、フレーム信号１の後部信号Ａ’とがオーバラップするように、デジタル音声信号Ａと後部信号Ａ’を加算して合成信号Ａ”を生成する。同時にオーバラップ合成部１２８は、フレーム信号１のデジタル音声信号Ｂを次回の加算処理のため一次的に保持する。そして、周波数時間変換部１４６からフレーム信号２が入力されると、オーバラップ合成部１２８は、フレーム信号１のとき同様、フレーム信号１のデジタル音声信号Ｂと、フレーム信号２の後部信号Ｂ’とをオーバラップするように加算して合成信号Ｂ”を生成する。オーバラップ合成部１２８は、このようにして生成された合成信号Ａ”、Ｂ”、Ｃ”、…を接続して随時出力する。 When the frame signal 1 in FIG. 10 is input, the digital audio signal A of the frame signal 0 is already held, and the overlap synthesis unit 128 performs the digital audio signal A of the frame signal 0 and the rear signal of the frame signal 1. The digital audio signal A and the rear signal A ′ are added so that A ′ overlaps to generate a synthesized signal A ″. At the same time, the overlap synthesizing unit 128 converts the digital audio signal B of the frame signal 1 into the next time. When the frame signal 2 is input from the frequency time converter 146, the overlap synthesizer 128 receives the digital audio signal B of the frame signal 1 and the frame signal 1 as in the case of the frame signal 1. Then, the synthesized signal B ″ is generated by adding the rear signal B ′ of the frame signal 2 so as to overlap. The overlap synthesis unit 128 connects the synthesized signals A ″, B ″, C ″,... Generated in this way and outputs them as needed.
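The overlap synthesis can be sketched as a 50% overlap-add with a sine window. The exact window W of FIG. 10 is not given in this section, so the half-sample-offset sine window below is an assumption, chosen because its square and the square of its half-frame shift sum to 1. The names `sine_window` and `overlap_add` are illustrative.

```python
import math

def sine_window(n):
    """Sine window of length n; w[i]**2 + w[i + n//2]**2 == 1 for even n."""
    return [math.sin(math.pi * (i + 0.5) / n) for i in range(n)]

def overlap_add(frames):
    """Window each frame again and add it with 50% overlap.

    Every frame is assumed to have the same even length n and to have
    been sine-windowed already when it was cut out by the framing stage.
    """
    n = len(frames[0])
    hop = n // 2
    w = sine_window(n)
    out = [0.0] * (hop * (len(frames) + 1))
    for k, frame in enumerate(frames):
        for i, v in enumerate(frame):
            # First half of frame k lands on the second half of frame k-1.
            out[k * hop + i] += w[i] * v
    return out
```

With the window applied once at framing and once here, the overlapped halves of neighbouring frames add back to the original level, which is the condition the text places on the choice of window function.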

(Audio processing program)
The audio processing apparatus 100 described above can also be realized using a computer.

図１１は、音声処理装置１００として、デジタル音声信号を分析し、その分析結果を用いてデジタル音声信号を加工処理することが可能なコンピュータ（情報処理装置）２００の典型例を示した機能ブロック図である。コンピュータ２００は、中央処理装置２１０と、一時記憶装置２１２と、外部記憶装置２１４と、入力部２１６と、出力部２１８とを含んで構成される。 FIG. 11 is a functional block diagram showing a typical example of a computer (information processing apparatus) 200 that can analyze a digital audio signal and process the digital audio signal using the analysis result as the audio processing apparatus 100. It is. The computer 200 includes a central processing unit 210, a temporary storage device 212, an external storage device 214, an input unit 216, and an output unit 218.

The central processing unit (CPU) 210 controls the entire computer 200 according to programs and applications stored in the temporary storage device 212 and the external storage device 214. The temporary storage device 212 consists of RAM, EEPROM, nonvolatile RAM, or the like, and temporarily stores the digital audio signals and other data processed by the central processing unit 210. The external storage device 214 consists of flash memory, an HDD, or the like, and stores the programs executed by the central processing unit 210. The input unit 216 receives a digital audio signal from the broadcast station 102 via broadcast waves, from the content server 104 via the communication network 106, or directly from the storage medium 108, and sends it to the temporary storage device 212. The output unit 218 transfers the output signal generated by the computer 200 to the playback device 110.

The sound quality improvement processing described above is carried out by the central processing unit 210 executing a program. Accordingly, along with the audio processing apparatus 100, an audio processing program is also provided that causes the computer 200 to execute: a signal separation step of performing frequency analysis of a digital audio signal and separating the digital audio signal into one or more fundamental wave signals and a residual signal that excludes the one or more fundamental wave signals; a correction signal generation step of generating, for each of the one or more fundamental wave signals, a correction signal that increases the absolute value of the amplitude and adding it to the fundamental wave signal; and a residual signal adding step of adding the residual signal to the one or more fundamental wave signals to which the correction signals have been added. This program may be read from a storage medium and loaded into the computer, or may be loaded into the computer 200 via the communication network 106.

(Audio processing method)
Next, an audio processing method is described in which the audio processing apparatus 100 described above is used to analyze a digital audio signal and to process the digital audio signal using the analysis result.

FIG. 12 is a flowchart showing the overall flow of the audio analysis and synthesis method. The framing unit 120 of the audio processing apparatus 100 sequentially cuts out the digital audio signal acquired by the audio processing apparatus 100 in predetermined frame units (a predetermined number of samples in length) to generate frame signals (S300).
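The framing step S300 can be sketched as follows. This is a minimal illustration: the function name `frame_signal` and the parameters `frame_len` and `hop` are hypothetical, and the sketch assumes that consecutive frames may overlap (when `hop` is smaller than `frame_len`) and that a trailing partial frame is simply dropped.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Cut signal x into frames of frame_len samples, advancing hop samples per frame."""
    x = np.asarray(x, dtype=float)
    if len(x) < frame_len:
        # Not enough samples for even one full frame.
        return np.empty((0, frame_len))
    n_frames = 1 + (len(x) - frame_len) // hop
    # Each row is one frame signal; a trailing partial frame is discarded.
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])
```

For example, an 8-sample signal with `frame_len=4` and `hop=2` yields three frames, each sharing two samples with its neighbour.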

Next, the signal separation unit 122 performs frequency analysis of the frame signal based on generalized harmonic analysis. For each of a predetermined number of mutually different frequencies f_k, it subtracts the corresponding fundamental wave signal b_k[i] from the frame signal on its own to obtain a difference signal e_k[i] (S302). The signal separation unit 122 determines whether this processing has been performed for all of the predetermined number of frequencies f_k (S304), and if it has not (NO in S304), repeats the difference signal derivation step S302.

Once the processing has been performed for all of the predetermined number of frequencies f_k (YES in S304), the nine frequencies f_k are sorted in ascending order of the energy E_k of the difference signal (S306). Then, until the fundamental wave signals b_k[i] for all of the frequencies f_k have been subtracted or the residual signal d[i] falls to or below a predetermined energy (NO in S308), the signal separation unit 122 sequentially subtracts the nine fundamental wave signals b_k[i] corresponding to those nine frequencies f_k from the frame signal x_0[i], in the sorted order of the frequencies f_k, to derive the residual signal d[i] (S310). In this way, the signal separation unit 122 can separate the digital audio signal into one or more fundamental wave signals b_k[i] and the residual signal d[i].
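Steps S302 to S310 can be sketched as follows. This is a minimal illustration, not the apparatus's actual implementation: the function names `fit_component` and `separate`, the parameters `fs` and `energy_floor`, and the least-squares fit of a sine/cosine pair are all assumptions. In particular, re-fitting each component against the current residual before subtracting it is an assumed detail that this passage does not spell out.

```python
import numpy as np

def fit_component(x, freq, fs):
    """Least-squares fit of a sine/cosine pair at freq (Hz) to x; returns the fitted wave."""
    n = np.arange(len(x))
    phase = 2.0 * np.pi * freq * n / fs
    basis = np.column_stack([np.sin(phase), np.cos(phase)])
    coef, *_ = np.linalg.lstsq(basis, x, rcond=None)
    return basis @ coef

def separate(x, freqs, fs, energy_floor=0.0):
    """Split frame x into fundamental-wave components and a residual signal."""
    x = np.asarray(x, dtype=float)
    # S302/S304: energy of the difference signal when each frequency is removed on its own.
    energies = [np.sum((x - fit_component(x, f, fs)) ** 2) for f in freqs]
    # S306: sort the candidate frequencies by ascending difference energy.
    order = np.argsort(energies)
    residual = x.copy()
    components = []
    for k in order:
        # S308: stop once the residual energy is at or below the threshold.
        if np.sum(residual ** 2) <= energy_floor:
            break
        # S310: subtract the component (assumption: re-fitted to the current residual).
        b = fit_component(residual, freqs[k], fs)
        components.append((freqs[k], b))
        residual = residual - b
    return components, residual
```

By construction, the returned components and the residual sum back to the input frame, so the later residual-adding step reconstructs the frame when no correction is applied.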

Then, for each of the one or more fundamental wave signals, the correction signal adding unit 124 generates a correction signal that increases the absolute value of the amplitude and adds it to the fundamental wave signal (S312), and the residual signal adding unit 126 adds the residual signal to the one or more fundamental wave signals to which the correction signals have been added (S314).
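Steps S312 and S314 can be sketched as follows. The correction rule used here, `peak_emphasis`, is a hypothetical stand-in chosen only because it has the same sign as the fundamental wave signal at every sample and therefore never reduces the absolute value; it is not the correction signal defined in the specification. The function name `enhance_frame` and the parameter `alpha` are likewise assumptions.

```python
import numpy as np

def peak_emphasis(b, alpha=0.1):
    """Hypothetical correction: same sign as b, largest near the peaks of b."""
    peak = np.max(np.abs(b))
    if peak == 0.0:
        return np.zeros_like(b)
    return alpha * b * np.abs(b) / peak

def enhance_frame(components, residual, correction=peak_emphasis):
    """Add a correction to each fundamental wave signal, then add the residual."""
    out = np.asarray(residual, dtype=float).copy()
    for b in components:
        b = np.asarray(b, dtype=float)
        # S312: corrected fundamental wave signal.
        corrected = b + correction(b)
        # S314: accumulate with the residual signal to rebuild the frame.
        out += corrected
    return out
```

Passing a correction that returns zeros reproduces the original frame, which gives a simple check that the separation and recombination are consistent.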

Finally, the overlap synthesis unit 128 combines the frame signal reconstructed by the residual signal adding unit 126 with the preceding frame signal so that they partially overlap, and generates the final output signal (S316).

The audio processing method described above likewise applies the sound quality improvement processing uniformly to each signal contained in the digital audio signal, making it possible to achieve a uniform improvement in sound quality.

Preferred embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is of course not limited to these embodiments. It is evident that a person skilled in the art could conceive of various changes or modifications within the scope set out in the claims, and these are naturally understood to fall within the technical scope of the present invention.

Note that the steps of the audio processing method in this specification need not be processed in time series in the order shown in the flowchart, and may include processing performed in parallel or by subroutines.

The present invention can be used in an audio processing apparatus, an audio processing method, and an audio processing program that analyze a digital audio signal and process the digital audio signal using the analysis result.

Description of Symbols
100 ... Audio processing apparatus
120 ... Framing unit
122 ... Signal separation unit
124 ... Correction signal adding unit
126 ... Residual signal adding unit
128 ... Overlap synthesis unit
200 ... Computer

Claims

1. An audio processing apparatus comprising:
a signal separation unit that performs frequency analysis of an input digital audio signal and separates the digital audio signal into one or more fundamental wave signals and a residual signal that excludes the one or more fundamental wave signals;
a correction signal adding unit that generates, for each of the one or more fundamental wave signals, a correction signal that increases the absolute value of the amplitude, and adds the correction signal to the fundamental wave signal; and
a residual signal adding unit that adds the residual signal to the one or more fundamental wave signals to which the correction signals have been added,
wherein the one or more fundamental wave signals are a plurality of fundamental wave signals having different frequencies, and
the signal separation unit obtains a difference signal for each of a plurality of fundamental wave signals, having the same frequencies as the one or more fundamental wave signals, when each is subtracted independently from the digital audio signal, and sequentially subtracts the one or more fundamental wave signals from the digital audio signal in ascending order of the energy of the difference signal, thereby separating the digital audio signal into the one or more fundamental wave signals and the residual signal.

2. The audio processing apparatus according to claim 1, further comprising:
a framing unit that cuts out a digital audio signal in predetermined frame units and generates a digital audio signal for each predetermined frame; and
an overlap synthesis unit that combines input frame-unit digital audio signals so that the digital audio signals of adjacent frames partially overlap,
wherein the digital audio signal input to the signal separation unit is the digital audio signal for each predetermined frame generated by the framing unit, and
the frame-unit digital audio signal input to the overlap synthesis unit is input from the residual signal adding unit.

3. The audio processing apparatus, wherein the one or more fundamental wave signals are each a signal represented by a predetermined frequency and the respective amplitudes of a sine wave and a cosine wave having the predetermined frequency.

4. The audio processing apparatus according to claim 3, wherein the correction signal adding unit generates the correction signal according to the frequency, the sine-wave amplitude, and the cosine-wave amplitude of each of the one or more fundamental wave signals.

5. The audio processing apparatus according to claim 4, wherein the correction signal adding unit:
refers to a correction table that associates in advance the frequency of a fundamental wave signal with the values of the correction signal at each sample position of a sine wave and a cosine wave of amplitude 1;
extracts, according to the frequency of each of the one or more fundamental wave signals, the values of the correction signal at each sample position of the sine wave and the cosine wave of amplitude 1; and
generates the correction signal by multiplying those values by the sine-wave amplitude and the cosine-wave amplitude of each of the one or more fundamental wave signals.

6. An audio processing method comprising:
performing frequency analysis of an input digital audio signal, obtaining a difference signal for each of a plurality of fundamental wave signals, having the same frequencies as one or more fundamental wave signals of mutually different frequencies, when each is subtracted independently from the digital audio signal, and sequentially subtracting the one or more fundamental wave signals from the digital audio signal in ascending order of the energy of the difference signal, thereby separating the digital audio signal into the one or more fundamental wave signals and a residual signal that excludes the one or more fundamental wave signals;
generating, for each of the one or more fundamental wave signals, a correction signal that increases the absolute value of the amplitude, and adding the correction signal to the fundamental wave signal; and
adding the residual signal to the one or more fundamental wave signals to which the correction signals have been added.

7. An audio processing program that causes a computer to execute:
a signal separation step of performing frequency analysis of an input digital audio signal, obtaining a difference signal for each of a plurality of fundamental wave signals, having the same frequencies as one or more fundamental wave signals of mutually different frequencies, when each is subtracted independently from the digital audio signal, and sequentially subtracting the one or more fundamental wave signals from the digital audio signal in ascending order of the energy of the difference signal, thereby separating the digital audio signal into the one or more fundamental wave signals and a residual signal that excludes the one or more fundamental wave signals;
a correction signal generation step of generating, for each of the one or more fundamental wave signals, a correction signal that increases the absolute value of the amplitude, and adding the correction signal to the fundamental wave signal; and
a residual signal adding step of adding the residual signal to the one or more fundamental wave signals to which the correction signals have been added.