JP5388542B2

JP5388542B2 - Audio signal processing apparatus and method

Info

Publication number: JP5388542B2
Application number: JP2008284019A
Authority: JP
Inventors: 俊明久保; 聡山中; 浩次南; 貴久青柳; 善彦森
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2008-11-05
Filing date: 2008-11-05
Publication date: 2014-01-15
Anticipated expiration: 2028-11-05
Also published as: JP2010114553A

Description

The present invention relates to an audio signal processing apparatus and method, and more particularly to a technique for reducing the quantization distortion of a digital audio signal. More specifically, the present invention relates to a technique for generating data in which the number of bits of a digital audio signal is extended.

In recent years, digital audio signal processing has become common in audio equipment. For example, the CD standard specifies 16-bit quantization and a sampling frequency of 44.1 kHz (the digital audio tape standard specifies a sampling frequency of 48 kHz).
On the other hand, the source of an audio signal is generally a natural sound that is generated in an analog manner, and the dynamic range of its input level is very wide. Consequently, quantization causes waveform distortion in an audio signal that varies at a very small level.

To cope with such signal distortion caused by an insufficient number of quantization bits, Patent Document 1 below describes:
(a) obtaining a signal output by selecting at least one of a plurality of low-pass filters having different cutoff frequencies, using as the criterion the time length between LSB changes that sandwich an interval in which the sample-by-sample data of the input digital data does not change; and
(b) determining the rate of level change from the time interval between LSB changes, generating a data string below the LSB that smooths the waveform change over the determined period, and then generating data below the LSB before and after each LSB change point to perform bit extension.

Patent Document 1: Japanese Examined Patent Publication No. 7-73186
Non-Patent Document 1: Takao Hinamoto (supervising editor), Mitsuji Muneyasu and Ryo Otaguchi, "Nonlinear Digital Signal Processing", Asakura Shoten, March 20, 1999, pp. 72-74, pp. 106-108

Non-Patent Document 1 above will be referred to later.

In the method of Patent Document 1, the cutoff frequency is selected solely on the basis of the time length between LSB changes. In other words, the cutoff frequency changes every time an LSB change occurs. Moreover, it is difficult to select the cutoff frequency accurately from the time length between LSB changes alone; for example, even for a sine wave of constant frequency, the time length between LSB changes is not constant. As a result, over a long time range the joints between processed sections do not vary smoothly, and the distortion is not sufficiently corrected.

The present invention has been made to solve the above problems, and an object thereof is to provide an audio signal processing apparatus and method capable of generating audio whose waveform is closer to the waveform of the audio signal before quantization. Another object of the present invention is to make it possible, when the audio signal includes a component of a specific frequency, to output the audio signal without losing that component.

The audio signal processing apparatus of the present invention comprises:
an original data bit shift unit that bit-shifts an n-bit audio signal (n is a positive integer) by α bits (α is a positive integer) to generate an (n+α)-bit audio signal;
a frequency amplitude estimation unit that detects inflection points of the n-bit audio signal and estimates the frequency and amplitude of the audio signal from the inflection points;
a specific frequency detection unit that determines whether or not the intensity of the spectrum of the audio signal at a predetermined specific frequency is greater than a predetermined threshold; and
a selective filter processing unit that,
when the specific frequency detection unit determines that the intensity of the spectrum at the specific frequency is equal to or less than the predetermined threshold, outputs an audio signal obtained by applying, to the audio signal output from the original data bit shift unit, edge-preserving smoothing processing using low-pass filter coefficients generated on the basis of the frequency and amplitude estimated by the frequency amplitude estimation unit, and,
when the specific frequency detection unit determines that the intensity of the spectrum at the specific frequency is greater than the threshold, outputs the audio signal output from the original data bit shift unit as it is.

According to the present invention, the frequency and amplitude are estimated from the n-bit audio signal, and smoothing is performed using low-pass filter coefficients determined on the basis of that frequency and amplitude. This removes the harmonic components generated by quantization and corrects the waveform distortion that quantization introduces into the audio signal. In addition, performing edge-preserving smoothing filter processing avoids impairing the reproducibility of an audio signal that has regions in which the signal amplitude changes steeply and by a large amount. Furthermore, when the audio signal includes a component of the specific frequency, the edge-preserving smoothing filter is not applied, so the signal can be output as it is without losing that component.

Embodiment 1.
FIG. 1 is a diagram showing the configuration of an audio signal processing apparatus according to Embodiment 1 of the present invention. The audio signal processing apparatus according to Embodiment 1 includes an input terminal 1, a frequency amplitude estimation unit 2, a specific frequency detection unit 6, a filter coefficient generation unit 3, an original data bit shift unit 4, and a filter unit 5.

In the audio signal processing apparatus shown in FIG. 1, an n-bit audio signal (input audio signal) X (n is a positive integer) is input from the input terminal 1 to the original data bit shift unit 4, the frequency amplitude estimation unit 2, and the specific frequency detection unit 6.

The original data bit shift unit 4 bit-shifts (bit-extends) the n-bit audio signal X by α bits (α is a positive integer) and outputs the resulting (n+α)-bit audio signal (bit-shifted audio signal) X' to the filter unit 5.

The frequency amplitude estimation unit 2 detects inflection points of the input audio signal X, estimates the frequency of the audio signal X from the interval between successive inflection points, estimates the amplitude of the audio signal from the difference between the signal values at successive inflection points, and outputs the frequency F and the amplitude A to the filter coefficient generation unit 3.

The specific frequency detection unit 6 calculates the intensity R of the spectrum of the audio signal X at a predetermined specific frequency Fd, and outputs to the filter coefficient generation unit 3 a signal (intensity determination result signal) DR indicating whether or not the intensity R is greater than a predetermined threshold. The intensity determination result signal DR takes a first value, for example "1", when the intensity R is greater than the predetermined threshold RTH, and a second value, for example "0", when the intensity R is equal to or less than the predetermined threshold RTH.

The filter coefficient generation unit 3 generates filter coefficients C on the basis of the intensity determination result signal DR, the frequency F, and the amplitude A, and outputs them to the filter unit 5.
When the intensity determination result signal DR has the second value (the spectrum intensity R is equal to or less than the threshold RTH), the filter coefficients C are generated on the basis of the frequency F and the amplitude A and cause the filter unit 5 to operate as an edge-preserving smoothing filter (low-pass filter coefficients). When the intensity determination result signal DR has the first value (the spectrum intensity R is greater than the threshold RTH), the filter coefficients C cause the filter unit 5 to output its input signal as it is.

The filter unit 5 filters the (n+α)-bit audio signal X' output from the original data bit shift unit 4 using the filter coefficients C, and outputs an (n+α)-bit audio signal Y. As described above, edge-preserving smoothing processing is performed when the spectrum intensity R is relatively small, and the input is output as it is when the spectrum intensity R is relatively large.

The filter coefficient generation unit 3 and the filter unit 5 together constitute a selective filter processing unit 10. When the spectrum intensity R at the specific frequency is equal to or less than the predetermined threshold, the selective filter processing unit 10 outputs the audio signal Y obtained by applying, to the bit-shifted audio signal X', edge-preserving smoothing processing using low-pass filter coefficients generated on the basis of the frequency F and the amplitude A. When the spectrum intensity R at the specific frequency is greater than the threshold, it outputs the bit-shifted audio signal X' as it is.

FIGS. 2(a) to 2(c) are diagrams showing an example of an audio signal processed by the audio signal processing apparatus of FIG. 1. FIG. 2(a) shows a sinusoidal analog audio signal of frequency f1. In FIGS. 2(a) and 2(b) the horizontal axis indicates time t, and the vertical axis of FIG. 2(a) indicates the signal level. This analog signal is sampled and quantized to generate an n-bit sinusoidal digital audio signal of frequency f1 as shown in FIG. 2(b), which is input to the input terminal 1. The vertical axis of FIG. 2(b) indicates the level of the signal X(i). The digital audio signal X(i) shown in FIG. 2(b) is a sequence of digital values, one for each successive sampling point (represented by black circles), and the time index i increases by 1 at each sampling point.
The solid line in FIG. 2(c) shows the frequency spectrum of the audio signal X(i) of FIG. 2(b), and the dotted line shows the frequency characteristic of a low-pass filter (described later). In FIG. 2(c) the horizontal axis indicates frequency f, and the vertical axis indicates power and gain. As the frequency spectrum of FIG. 2(c) shows, although the analog signal before quantization is a sine wave of frequency f1, the digital audio signal X(i) obtained by quantization contains harmonics such as the second harmonic f2 and the third harmonic f3.

In the audio signal processing apparatus of FIG. 1, for example, the frequency amplitude estimation unit 2 estimates the frequency f1 indicated by the solid line in FIG. 2(c), and when the specific frequency detection unit 6 determines that the intensity at the specific frequency Fd is equal to or less than the predetermined threshold, low-pass filter coefficients are generated that realize the cutoff frequency characteristic indicated by the dotted line in FIG. 2(c). The cutoff frequency fc1 is set between f1 and f2; once the frequency f1 has been estimated, f2 follows directly. Smoothing with the filter coefficients having the characteristic of the cutoff frequency fc1 removes harmonics such as f2 and f3 from the audio signal.

The components of the audio signal processing apparatus of FIG. 1 are described below in order.
FIGS. 3(a) and 3(b) are diagrams for explaining the operation of the original data bit shift unit 4. The horizontal axis indicates time i, and the vertical axis indicates the signal level. FIG. 3(a) shows the n-bit audio signal X (the same signal as shown in FIG. 2(b)), and FIG. 3(b) shows the (n+α)-bit audio signal X'. The original data bit shift unit 4 bit-shifts the n-bit audio signal shown in FIG. 3(a) by α bits and outputs the (n+α)-bit audio signal shown in FIG. 3(b) to the filter unit 5.
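The bit shift itself can be illustrated with a minimal Python sketch. It assumes that the samples are signed integers held in a list; the function name `bit_shift_extend` is hypothetical and is not taken from the patent.

```python
def bit_shift_extend(samples, alpha):
    """Extend n-bit integer samples to n+alpha bits by shifting left by alpha bits.

    Assumption: samples are signed Python integers. Multiplying by 2**alpha
    is equivalent to an arithmetic left shift and preserves the sign.
    """
    if alpha <= 0:
        raise ValueError("alpha must be a positive integer")
    return [x * (1 << alpha) for x in samples]


# Example: a coarse signal extended by alpha = 4 bits.
x = [0, 1, -2, 3]
x_shifted = bit_shift_extend(x, alpha=4)
print(x_shifted)
```

After the shift the α low-order bits are all zero; it is the subsequent filtering that fills them with meaningful values.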

FIG. 4 is a diagram showing the detailed configuration of the frequency amplitude estimation unit 2. As shown in FIG. 4, the frequency amplitude estimation unit 2 includes an inflection point detection unit 7, a frequency estimation unit 8, and an amplitude estimation unit 9.

The inflection point detection unit 7 detects, as inflection points, the points at which the n-bit audio signal starts or ends a series of monotonic increases and the points at which it starts or ends a series of monotonic decreases. Here, "a series of monotonic increases" means a run of increases in which no decrease occurs partway (the same value may continue). Within such an interval, the relation
X(i) ≤ X(i+1)
is satisfied continuously (at all sampling points i).
Similarly, "a series of monotonic decreases" means a run of decreases in which no increase occurs partway (the same value may continue). Within such an interval, the relation
X(i) ≥ X(i+1)
is satisfied continuously (at all sampling points i).

The illustrated inflection point detection unit 7 includes a first derivative calculation unit 11 and a sign change point detection unit 12.
The first derivative calculation unit 11 calculates first derivative data D(i) from the input n-bit audio signal X(i) and outputs it. For example, when the n-bit audio signal at each sampling point is denoted by X(i) and the n-bit audio signal at the next sampling point by X(i+1), the first derivative calculation unit 11 outputs D(i) given by
D(i) = X(i+1) − X(i)
as the first derivative data.

The sign change point detection unit 12 detects, as inflection points, the points at which the sign of the first derivative data D changes to positive and the points at which it changes to negative. More specifically, the sign change point detection unit 12 detects, as these sign change points, the points at which the first derivative data D changes from a state in which its sign is negative or its value is zero to a state in which its sign is positive (hereinafter "change points to positive"), and the points at which it changes from a state in which its sign is positive or its value is zero to a state in which its sign is negative (hereinafter "change points to negative"). The output of the sign change point detection unit 12 is binary data PM, which takes a first value (for example "1") from a change point to positive until the next change point to negative, and a second value (for example "0") from a change point to negative until the next change point to positive.
This operation of the sign change point detection unit 12 can also be described as converting the first derivative data D(i) into binary data PM that carries only the sign (where a value of zero takes the sign of the preceding data).
The inflection point detection unit 7, composed of the first derivative calculation unit 11 and the sign change point detection unit 12, detects as inflection points the points at which the n-bit audio signal starts a series of monotonic increases and the points at which it starts a series of monotonic decreases, and generates a signal SL that takes a first value, for example "1", at the position of each detected inflection point and a second value, for example "0", at all other positions.
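The chain from X(i) through D(i) and PM to SL can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the initial value of PM before the first non-zero difference (here 0) is an assumption, since the text does not specify it.

```python
def first_difference(x):
    """First derivative data D(i) = X(i+1) - X(i)."""
    return [x[i + 1] - x[i] for i in range(len(x) - 1)]


def sign_data(d):
    """Binary data PM: 1 while D is positive, 0 while D is negative.

    A zero difference inherits the previous value, as described in the text.
    Assumption: PM starts at 0 before the first non-zero difference.
    """
    pm = []
    prev = 0
    for v in d:
        if v > 0:
            prev = 1
        elif v < 0:
            prev = 0
        pm.append(prev)
    return pm


def inflection_flags(pm):
    """Signal SL: 1 where PM differs from its preceding value, 0 elsewhere."""
    if not pm:
        return []
    return [0] + [1 if pm[i] != pm[i - 1] else 0 for i in range(1, len(pm))]


x = [0, 1, 2, 2, 1, 0, 0, 1, 3]
d = first_difference(x)    # [1, 1, 0, -1, -1, 0, 1, 2]
pm = sign_data(d)          # [1, 1, 1, 0, 0, 0, 1, 1]
sl = inflection_flags(pm)  # [0, 0, 0, 1, 0, 0, 1, 0]
print(d, pm, sl)
```

In this example SL is 1 at i = 3, where a monotonic decrease starts, and at i = 6, where a monotonic increase starts.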

The frequency estimation unit 8 calculates the length of the interval between an inflection point of the audio signal and the preceding (immediately preceding) inflection point, and outputs the frequency F obtained from that interval length.
The amplitude estimation unit 9 outputs, as the amplitude A, the difference in audio signal level between an inflection point of the audio signal and the preceding (immediately preceding) inflection point.

FIGS. 5(a) to 5(d) are diagrams for explaining the operation of the frequency amplitude estimation unit 2. The horizontal axis indicates time i.
FIG. 5(a) shows the n-bit audio signal X(i), FIG. 5(b) shows the first derivative data D(i) of the audio signal, FIG. 5(c) shows the binary data PM, and FIG. 5(d) shows the signal SL indicating the detected inflection point positions of the audio signal.
When the n-bit audio signal shown in FIG. 5(a) is input, the first derivative calculation unit 11 calculates the first derivative data D(i) shown in FIG. 5(b).
The sign change point detection unit 12 converts the first derivative data D into the sign-only binary data PM shown in FIG. 5(c), detects the positions ic = i1, i2, i3 at which the binary data PM changes (their positions on the time axis are indicated by vertical dotted lines), and sets the value of the signal SL to "1" at the detected positions, as shown in FIG. 5(d).
Accordingly, for the interval i1 to i2 the frequency estimation unit 8 and the amplitude estimation unit 9 obtain the frequency F = 1/(i2−i1) and the amplitude A = |X(i2)−X(i1)|, respectively, and for the interval i2 to i3 they obtain the frequency F = 1/(i3−i2) and the amplitude A = |X(i3)−X(i2)|, respectively. The obtained frequency F and amplitude A are supplied to the filter coefficient generation unit 3.
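The per-interval estimates can be written out as a short sketch. It assumes that the inflection point positions are already available as a sorted list of sample indices; the function name and the tuple layout are hypothetical.

```python
def estimate_frequency_amplitude(x, inflection_points):
    """Return one (ia, ib, F, A) tuple per pair of successive inflection points.

    F = 1 / (ib - ia)    reciprocal of the interval length in samples
    A = |X(ib) - X(ia)|  level difference between the two inflection points
    """
    estimates = []
    for ia, ib in zip(inflection_points, inflection_points[1:]):
        f = 1.0 / (ib - ia)
        a = abs(x[ib] - x[ia])
        estimates.append((ia, ib, f, a))
    return estimates


x = [0, 1, 2, 2, 1, 0, 0, 1, 3]
points = [3, 6]  # assumed inflection point positions for this signal
for ia, ib, f, a in estimate_frequency_amplitude(x, points):
    print(f"interval {ia}..{ib}: F = {f:.3f}, A = {a}")
```

Here F is expressed in reciprocal samples; converting it to hertz would require multiplying by the sampling frequency, a step the text does not spell out.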

As described with reference to FIGS. 5(a) to 5(d), the positions at which the sign of the first derivative data of the audio signal changes (the "change points to positive" and "change points to negative"), as calculated by the frequency amplitude estimation unit 2 shown in FIG. 4, are treated as inflection points of the audio signal, and the reciprocal of the time interval between successive inflection points can be taken as an estimate of the frequency of the audio signal. In addition, since each inflection point is the start point of a series of monotonic increases or of a series of monotonic decreases, the difference between the signal values at successive inflection points can be taken as an estimate of the amplitude.

The frequency amplitude estimation unit 2 shown in FIG. 4 can calculate the frequency by estimating the half period of the highest-frequency component contained in the audio signal, and can at the same time calculate the amplitude from the difference between the signal values at the inflection points.

The filter coefficients determined on the basis of the frequency and amplitude estimated in this way are then used to filter the signal values of the corresponding interval (the interval between successive inflection points). For example, the filter coefficients determined from the frequency and amplitude obtained from the successive inflection points i1 and i2 in FIGS. 5(a) to 5(d) are used to filter the data from the sample point following the inflection point i1 up to the inflection point i2.

The inflection point detection unit 7 described above detects, as inflection points, the points at which the n-bit audio signal starts a series of monotonic increases and the points at which it starts a series of monotonic decreases. Alternatively, the inflection point detection unit may be configured to detect, as inflection points, the points at which the n-bit audio signal ends a series of monotonic increases and the points at which it ends a series of monotonic decreases. In that case, the first derivative calculation unit may be one that outputs D(i) given by
D(i) = X(i) − X(i−1)
as the first derivative data, and the sign change point detection unit may be one that detects the nearest point at which the first derivative data D was positive before a point at which its sign changed to negative, and the nearest point at which the first derivative data D was negative before a point at which its sign changed to positive.

The specific frequency detection unit 6 calculates the intensity of the spectrum only at a predetermined specific frequency. The frequency to be detected (specific frequency) Fd is set, a cosine wave (COS wave) and a sine wave (SIN wave) of that frequency are generated as basic patterns, and the intensity R of the spectrum at the specific frequency Fd is calculated from the result of superimposing the audio signal X on the basic patterns.

Specifically, the spectrum intensity R(i) is expressed by Equation (1).
Here, X(i) is the input level, Fd is the specific frequency, Fs is the sampling frequency, k is the relative position from the point of interest (defined by a distance and direction expressed in number of samples), m is a preset order (the number of samples used in the calculation), and R(i) is the intensity of the spectrum at the frequency Fd over an m-sample interval centered on time i.
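Equation (1) itself is not reproduced in this text. One form that is consistent with the symbol definitions above, with the basic patterns cos{(2πFd/Fs)k} and sin{(2πFd/Fs)k}, and with the sum-of-squares relation given for FIG. 7(d) is sketched below. The summation window W_m (the set of m relative positions k centered on the point of interest) and the absence of any normalization factor are assumptions, not taken from the source.

```latex
\begin{aligned}
R_c(i) &= \sum_{k \in W_m} X(i+k)\,\cos\!\left(\frac{2\pi F_d}{F_s}\,k\right),\\
R_s(i) &= \sum_{k \in W_m} X(i+k)\,\sin\!\left(\frac{2\pi F_d}{F_s}\,k\right),\\
R(i)   &= \{R_s(i)\}^2 + \{R_c(i)\}^2 .
\end{aligned}
```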

FIGS. 6(a) and 6(b), FIGS. 7(a) to 7(d), and FIGS. 8(a) to 8(d) are diagrams for explaining the operation of the specific frequency detection unit 6. The operation of the specific frequency detection unit 6 is described taking as an example the case in which the frequency to be detected (specific frequency) is Fd = Fs/4. FIGS. 6(a) and 6(b) show the basic patterns used to calculate the spectrum intensity. FIG. 6(a) shows the COS-wave basic pattern for Fd = Fs/4 (expressed as cos{(2πFd/Fs)k}), and FIG. 6(b) shows the SIN-wave basic pattern for Fd = Fs/4 (expressed as sin{(2πFd/Fs)k}).

FIGS. 7(a) to 7(d) are diagrams for explaining the operation of the specific frequency detection unit 6 when the input signal X(i) contains the specific frequency. In FIGS. 7(a) to 7(d) the horizontal axis indicates time i. In FIG. 7(a) the vertical axis indicates the level of the signal X(i), and in FIGS. 7(b) to 7(d) the vertical axis indicates power.
FIG. 7(a) shows a sinusoidal audio signal X(i) having the same frequency as the specific frequency Fd. FIG. 7(b) shows Rc(i), the result of superimposing the audio signal of FIG. 7(a) on the basic pattern of FIG. 6(a). FIG. 7(c) shows Rs(i), the result of superimposing the audio signal of FIG. 7(a) on the basic pattern of FIG. 6(b). FIG. 7(d) shows the sum of squares of Rc(i) in FIG. 7(b) and Rs(i) in FIG. 7(c), that is, R(i) = {Rs(i)}^2 + {Rc(i)}^2, which is the intensity R(i) of the spectrum at the specific frequency Fd.

FIGS. 8(a) to 8(d) are diagrams for explaining the operation of the specific frequency detection unit 6 when the input signal X(i) does not contain the specific frequency. In FIGS. 8(a) to 8(d) the horizontal axis indicates time i. In FIG. 8(a) the vertical axis indicates the signal level X(i), and in FIGS. 8(b) to 8(d) the vertical axis indicates power.
FIG. 8(a) shows, as an example of a frequency different from the specific frequency Fd, a sinusoidal audio signal at half the specific frequency (Fd/2). FIG. 8(b) shows Rc(i), the result of superimposing the audio signal of FIG. 8(a) on the basic pattern of FIG. 6(a). FIG. 8(c) shows Rs(i), the result of superimposing the audio signal of FIG. 8(a) on the basic pattern of FIG. 6(b). FIG. 8(d) shows the sum of squares of Rc(i) in FIG. 8(b) and Rs(i) in FIG. 8(c), which is the intensity R(i) of the spectrum at the specific frequency Fd.

As FIGS. 7(d) and 8(d) show, the spectrum intensity R(i) takes a large value when the input signal contains the specific frequency Fd and a small value when the input signal does not contain the specific frequency Fd.
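The contrast between the two cases can be reproduced numerically with a small sketch. Several points are assumptions rather than details from the source: the superposition is taken to be a sum of products over a window of 2·`half_width`+1 samples centered on i, no normalization is applied, and the sampling frequency and threshold values are arbitrary illustrative choices.

```python
import math


def spectrum_intensity(x, i, fd, fs, half_width):
    """R(i) = Rc(i)^2 + Rs(i)^2 at frequency fd, over a window centered on i.

    Assumption: the window covers k = -half_width .. +half_width, and samples
    falling outside the signal are skipped.
    """
    rc = 0.0
    rs = 0.0
    for k in range(-half_width, half_width + 1):
        if 0 <= i + k < len(x):
            phase = 2.0 * math.pi * fd / fs * k
            rc += x[i + k] * math.cos(phase)
            rs += x[i + k] * math.sin(phase)
    return rc * rc + rs * rs


def intensity_flag(r, rth):
    """Intensity determination result DR: 1 if R > RTH, otherwise 0."""
    return 1 if r > rth else 0


fs = 48000.0   # illustrative sampling frequency
fd = fs / 4.0  # specific frequency, as in the Fd = Fs/4 example
n = 64
tone_at_fd = [math.sin(2.0 * math.pi * fd / fs * t) for t in range(n)]
tone_at_half_fd = [math.sin(2.0 * math.pi * (fd / 2.0) / fs * t) for t in range(n)]

r_hit = spectrum_intensity(tone_at_fd, 32, fd, fs, half_width=8)
r_miss = spectrum_intensity(tone_at_half_fd, 32, fd, fs, half_width=8)
rth = 1.0      # hypothetical threshold
print(r_hit, intensity_flag(r_hit, rth))
print(r_miss, intensity_flag(r_miss, rth))
```

With these settings the tone at Fd yields an intensity of about 64, while the tone at Fd/2 yields a value that is zero up to rounding error, so the flag is 1 in the first case and 0 in the second.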

FIGS. 9(a) and 9(b) are diagrams for explaining the operation of the specific frequency detection unit 6 when a sinusoidal audio signal X of frequency f1 is input. In FIGS. 9(a) and 9(b) the horizontal axis indicates time i; in FIG. 9(a) the vertical axis indicates the signal level, and in FIG. 9(b) the vertical axis indicates the spectrum intensity.
Here, the specific frequency Fd is assumed to be set near Fs/2. Since the audio signal X shown in FIG. 9(a) is a sine wave of a frequency f1 different from the specific frequency Fd (drawn with f1 = 2Fd/9 in the illustrated example), the spectrum intensity R takes a very small value, as shown in FIG. 9(b).

As described above, the specific frequency detection unit 6 calculates a spectrum intensity R that takes a large value if the audio signal contains the specific frequency and a small value if it does not, and outputs the signal DR indicating whether or not this intensity R is greater than the threshold RTH.

When the intensity determination result signal DR output from the specific frequency detection unit 6 takes a first value (indicating that the spectrum intensity R is greater than the threshold RTH), the filter coefficient generation unit 3 generates a filter coefficient C that passes the input through unchanged. When the signal DR takes a second value (indicating that the spectrum intensity R is equal to or less than the threshold RTH), the filter coefficient generation unit 3 selects a low-pass filter coefficient C based on the frequency F and the amplitude A and outputs it to the filter unit 5.
The threshold RTH is a preset reference value for judging whether the audio signal contains the specific frequency.

To make the filter unit 5 output its input unchanged, the filter coefficient generation unit 3 supplies the filter unit 5 with a coefficient of "1" for the sample value at the point of interest and a coefficient of "0" for all other sample values.

When generating the low-pass filter coefficient C, the cutoff frequency characteristic of the low-pass filter is determined from the frequency of the audio signal, and the order of the low-pass filter is determined from the frequency and the amplitude.
For example, for an audio signal with the frequency characteristic shown by the solid line in FIG. 10, low-pass filter coefficients are generated that realize the cutoff frequency characteristic shown by the dotted line in FIG. 10. In this case, once the frequency f1 has been estimated, f2 follows directly from it, so the cutoff frequency fc1 is set between f1 and f2.

In addition, the slope is obtained from the frequency and the amplitude, and the order is determined from it. For example, for the same frequency, the larger the amplitude A, the smaller the order.

The filter coefficient generation unit 3 may include a filter coefficient table (an example is shown in FIG. 11) that stores in advance, as candidates, a plurality of sets of low-pass filter coefficients generated according to the above considerations and differing from one another in cutoff frequency and order. Based on the estimated frequency F and amplitude A, the unit then selects one of the candidates and outputs it as the low-pass filter coefficient C.
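A table-based selection of this kind might look like the sketch below. The contents of FIG. 11 are not reproduced in this text, so the band edges, the orders, and the use of a moving average as a stand-in low-pass kernel are all assumptions, as are the names `FREQ_EDGES`, `AMP_EDGES`, `ORDER_TABLE`, and `select_coeffs`. The only rules taken from the description are that DR selects between pass-through and low-pass coefficients, and that for the same frequency a larger amplitude gives a smaller order.

```python
# Hypothetical band edges and orders (illustration only).
FREQ_EDGES = [0.02, 0.10]   # estimated F, in cycles per sample
AMP_EDGES = [2, 8]          # estimated A, in LSBs of the n-bit input
ORDER_TABLE = [
    [8, 6, 4],   # F < 0.02
    [4, 3, 2],   # 0.02 <= F < 0.10
    [2, 1, 1],   # F >= 0.10
]

def band(value, edges):
    """Index of the first band whose upper edge exceeds value."""
    for idx, edge in enumerate(edges):
        if value < edge:
            return idx
    return len(edges)

def lowpass_coeffs(order):
    """Stand-in low-pass kernel: moving average with 2*order+1 taps summing to 1."""
    taps = 2 * order + 1
    return [1.0 / taps] * taps

def select_coeffs(dr, f, a):
    """Return pass-through coefficients when DR is 1, else a table candidate."""
    if dr == 1:
        return [0.0, 1.0, 0.0]   # "1" at the point of interest, "0" elsewhere
    order = ORDER_TABLE[band(f, FREQ_EDGES)][band(a, AMP_EDGES)]
    return lowpass_coeffs(order)

passthrough = select_coeffs(1, 0.05, 4)
small_amp = select_coeffs(0, 0.05, 1)    # order 4, 9 taps
large_amp = select_coeffs(0, 0.05, 10)   # order 2, 5 taps
```

A precomputed table keeps the per-sample work to two band lookups, which is one plausible reason to store candidates in advance rather than design a filter at run time.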

As described above, the filter coefficient generation unit 3 generates filter coefficients that pass the input through unchanged when the audio signal contains the specific frequency. When it does not, the unit generates filter coefficients that cut the harmonic components of the estimated frequency F, with a lower low-pass filter order the larger the estimated amplitude A.

When low-pass filter coefficients are supplied, the filter unit 5 operates as an edge-preserving smoothing filter.
An edge-preserving smoothing filter smooths only small changes while preserving the sharpness of portions where a steep, large change exists. Examples include the ε filter, the trimmed mean filter (DW-MTM filter), and the bilateral filter. The ε filter and the trimmed mean filter (DW-MTM filter) are described in Non-Patent Document 1 cited above. In the following, the ε filter is used as an example of the edge-preserving smoothing filter.

One-dimensional processing by the ε filter is expressed by Equation (2).
Here, x(i) is the input level, y(i) is the output level, a_k is a low-pass filter coefficient, k is the position relative to the point of interest, m is the order, and ε is a threshold.

When the absolute value of u is equal to or less than the threshold ε, f(u) = u; when the absolute value of u is greater than ε, f(u) = 0. In Equation (2), u is x(i-k) - x(i).
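Equation (2) itself is not reproduced in this text. Assuming the ε filter takes its commonly used form, with the sum running over k = -m to m (an assumption, since the original equation is unavailable), the definitions above correspond to:

```latex
y(i) = x(i) + \sum_{k=-m}^{m} a_k \, f\bigl(x(i-k) - x(i)\bigr) \tag{2}

f(u) =
\begin{cases}
  u, & |u| \le \varepsilon \\
  0, & |u| > \varepsilon
\end{cases}
```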

When the difference x(i-k) - x(i) between the signal level x(i) at the point of interest i and the signal level x(i-k) at a neighboring position i-k is small, f(u) = f{x(i-k) - x(i)} is approximately linear. Assuming that the coefficients a_k sum to 1, Equation (2) can then be rewritten as Equation (3), which is equivalent to a weighted-average filter.
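Equation (3) is not reproduced in this text either. As a hedged reconstruction, assuming Equation (2) has the form y(i) = x(i) + Σ a_k f(x(i-k) - x(i)) with the sum over k = -m to m, setting f(u) = u and Σ a_k = 1 reduces it to a weighted average as follows:

```latex
\begin{aligned}
y(i) &= x(i) + \sum_{k=-m}^{m} a_k \bigl(x(i-k) - x(i)\bigr) \\
     &= x(i)\Bigl(1 - \sum_{k=-m}^{m} a_k\Bigr) + \sum_{k=-m}^{m} a_k \, x(i-k) \\
     &= \sum_{k=-m}^{m} a_k \, x(i-k)
\end{aligned}
\tag{3}
```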

On the other hand, when the difference x(i-k) - x(i) between the signal level x(i) at the point of interest i and the signal level x(i-k) at a neighboring position i-k is large, f(u) = f{x(i-k) - x(i)} is 0. Equation (2) can then be rewritten as Equation (4), and the value at the point of interest is output unchanged.
y(i) = x(i) ... (4)
The ε filter therefore smooths only small changes while preserving portions where a steep, large change exists.
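The two limiting cases can be checked with a short sketch. It assumes the commonly used ε-filter form y(i) = x(i) + Σ a_k f(x(i-k) - x(i)) with k running from -m to m, and it clamps indices at the ends of the signal; both choices, and the name `epsilon_filter`, are assumptions made for illustration.

```python
def epsilon_filter(x, coeffs, eps):
    """One-dimensional epsilon filter with 2*m+1 coefficients (assumed form)."""
    m = len(coeffs) // 2
    y = []
    for i in range(len(x)):
        acc = 0.0
        for k in range(-m, m + 1):
            j = min(max(i - k, 0), len(x) - 1)   # clamp at the ends (assumption)
            u = x[j] - x[i]
            if abs(u) <= eps:                    # f(u) = u for small differences
                acc += coeffs[k + m] * u
            # otherwise f(u) = 0 and the neighbor contributes nothing
        y.append(x[i] + acc)
    return y

coeffs = [1.0 / 3.0] * 3   # coefficients summing to 1

# Small change (within eps): behaves like a weighted average.
small = [0, 0, 1, 0, 0]
y_small = epsilon_filter(small, coeffs, eps=1)

# Large step (beyond eps): the value at the point of interest is output unchanged.
edge = [0, 0, 0, 100, 100, 100]
y_edge = epsilon_filter(edge, coeffs, eps=1)
```

Here `y_small` spreads the single-LSB bump over its neighbors, while `y_edge` equals the input, matching the behavior described for Equations (3) and (4).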

FIG. 12 illustrates the operation of the ε filter. FIG. 12A shows the input signal x(i) of the ε filter, and FIG. 12B shows f{x(i1-k) - x(i1)} when position i1 in FIG. 12A is taken as the point of interest.

When a gently sloping signal such as that in FIG. 12A is input to the ε filter and the order is too large, the difference between the point of interest and its more distant neighboring points exceeds the threshold. As FIG. 12B shows, the terms for those points are set to 0 before the weighted average is taken, so the smoothing effect is weakened.

Compared with low-pass filters, many edge-preserving smoothing filters such as the ε filter have the advantage of preserving steep, large changes. However, their smoothing performance drops when the signal has a gentle slope. The audio signal processing apparatus of FIG. 1 therefore takes the slope of the signal into account by estimating the frequency and amplitude of the audio signal. For the same frequency, a larger amplitude means a steeper slope, so the filter order is reduced to improve the smoothing performance of the ε filter.

In the following, the filter unit 5 is described as functioning as an ε filter and performing smoothing when low-pass filter coefficients are supplied. The ε filter treats the bits added by the bit shift in the original data bit shift unit 4 as the small-amplitude component. That is, when the signal is bit-shifted from n bits to n+α bits, the threshold ε of the ε filter is set to 2 to the power of α. As a result, steps of 1 LSB or less of the n-bit signal are smoothed.
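The relation between the bit shift and the threshold can be illustrated with a few lines. The values n = 16 and α = 8 and the names `bit_extend` and `epsilon_for` are illustrative assumptions; the only rule taken from the description is ε = 2^α.

```python
def bit_extend(samples, alpha):
    """Shift n-bit integer samples left by alpha bits (n bits to n+alpha bits)."""
    return [s << alpha for s in samples]

def epsilon_for(alpha):
    """Threshold of the epsilon filter: 2 to the power of alpha."""
    return 1 << alpha

alpha = 8
x16 = [100, 100, 101, 101]     # a 1-LSB step in the n-bit signal
x24 = bit_extend(x16, alpha)   # the same step after the shift
eps = epsilon_for(alpha)

largest_step = max(abs(b - a) for a, b in zip(x24, x24[1:]))
```

After the shift, a 1-LSB step of the n-bit signal has size 2^α, so it falls exactly at the threshold and is treated as a small change to be smoothed.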

FIGS. 13A and 13B illustrate the operation of the filter unit 5 when it operates as an ε filter. The horizontal axis represents time i and the vertical axis represents the signal level. FIG. 13A shows the n+α-bit audio signal X'(i), and FIG. 13B shows the ε-filtered n+α-bit audio signal Y(i). The filter unit 5 performs smoothing using filter coefficients C that realize a cutoff frequency characteristic for removing harmonics from the n+α-bit audio signal. It thereby corrects the waveform distortion caused by quantization in the n+α-bit audio signal of FIG. 13A and outputs the n+α-bit audio signal of FIG. 13B.

In this way, operating the filter unit 5 as an edge-preserving smoothing filter such as an ε filter removes harmonics from the n-bit audio signal and yields an n+α-bit audio signal whose waveform distortion has been corrected.

Next, the operation of the audio signal processing apparatus of FIG. 1 when an n-bit audio signal containing the specific frequency Fd is input will be described. Here, the specific frequency Fd is a high frequency near half the sampling frequency.

FIGS. 14A to 14C show an example of an audio signal, input to the audio signal processing apparatus of FIG. 1, that contains a high frequency near half the sampling frequency.
FIG. 14A shows a sine-wave analog audio signal of frequency f8, which is near half the sampling frequency. The horizontal axis of FIG. 14A represents time t and the vertical axis represents the signal level. This analog signal is sampled and quantized to produce the n-bit sine-wave digital audio signal of frequency f8 shown in FIG. 14B, which is input to the input terminal 1. The vertical dotted lines in the figure indicate the sampling timing. The illustrated example assumes that half the amplitude of the audio signal (half the difference between peak and bottom) corresponds to the LSB step width of the n-bit signal.
The horizontal axis of FIG. 14B represents time i and the vertical axis represents the signal level. The digital audio signal shown in FIG. 14B is a sequence of digital values, one per successive sampling point, and the time index i increases by 1 at each sampling point.
Because the frequency of the audio signal differs slightly from half the sampling frequency, the sampling phase drifts gradually relative to the phase of the audio signal, as shown in FIG. 14B. This produces an interval (from ia to ib) in which consecutive sample values stay at the same level, that is, in which the signal is crushed. The solid line in FIG. 14C shows the frequency spectrum of the audio signal of FIG. 14B, and the dotted line shows the frequency characteristic of a low-pass filter (described later). The horizontal axis of FIG. 14C represents frequency f and the vertical axis represents power and gain. As the spectrum in FIG. 14C shows, because f8 is near half the sampling frequency, there are few harmonics above f8.

For a single sine wave, the lower the amplitude and the higher the frequency, the more the signal is crushed by quantization and the more the same level tends to repeat. This is especially pronounced near half the sampling frequency. Moreover, as FIG. 14C shows, when the fundamental frequency is near half the sampling frequency there are few harmonics, so removing harmonics has little effect.

The operation of the audio signal processing apparatus of FIG. 1 when the n-bit quantized sine-wave audio signal X of frequency f8 shown in FIG. 14B is input will now be described. The audio signal X is input to the frequency amplitude estimation unit 2, the specific frequency detection unit 6, and the original data bit shift unit 4.

FIGS. 15A to 15D illustrate the operation of the frequency amplitude estimation unit 2 when the sine-wave audio signal X of frequency f8 is input. The horizontal axis represents time i.
FIG. 15A shows the n-bit audio signal X(i), FIG. 15B shows the first-derivative data D(i) of the audio signal, FIG. 15C shows the binary data PM, and FIG. 15D shows the signal SL indicating the detected inflection-point positions of the audio signal.
When the n-bit audio signal shown in FIG. 15A is input, the first-derivative calculation unit 11 calculates the first-derivative data D(i) shown in FIG. 15B.
The sign change point detection unit 12 converts the first-derivative data D into the sign-only binary data PM shown in FIG. 15C, detects the positions ic = i4 to i19 at which the binary data PM changes (marked on the time axis by vertical dotted lines), and sets the signal SL to "1" at each detected position, as shown in FIG. 15D.
Accordingly, for the interval i4 to i5, the frequency estimation unit 8 obtains the frequency F = 1/(i5 - i4) and the amplitude estimation unit 9 obtains the amplitude A = |X(i5) - X(i4)|, and the resulting F and A are supplied to the filter coefficient generation unit 3. The F obtained here is approximately equal to the frequency f8 of the input signal. The frequency F is likewise obtained accurately in the intervals i5 to i11 and i12 to i19.
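The estimation steps above can be sketched as follows. The formulas F = 1/(ib - ia) and A = |X(ib) - X(ia)| are taken from the description; the use of a forward difference for D(i), the treatment of a zero difference as positive, and the names `sign_change_points` and `estimate` are assumptions made for illustration.

```python
def sign_change_points(x):
    """Positions where the sign of the first difference changes (SL = 1)."""
    d = [x[i + 1] - x[i] for i in range(len(x) - 1)]   # D(i), forward difference (assumed)
    pm = [1 if v >= 0 else 0 for v in d]               # PM, zero counted as positive (assumed)
    return [i for i in range(1, len(pm)) if pm[i] != pm[i - 1]]

def estimate(x):
    """Return (ia, ib, F, A) for each pair of successive sign-change points."""
    pts = sign_change_points(x)
    return [(a, b, 1.0 / (b - a), abs(x[b] - x[a])) for a, b in zip(pts, pts[1:])]

# Regular waveform: every interval gives the same F and A.
regular = estimate([0, 2, 4, 2, 0, 2, 4, 2, 0])

# Crushed waveform: the flat run stretches one interval, so F is misestimated there.
crushed = estimate([0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1])
crushed_f = [f for (_, _, f, _) in crushed]
```

In the crushed example, the interval that spans the flat run yields a much lower F than its neighbors, which is the kind of misestimation that arises when consecutive samples stay at the same level.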

In the interval i11 to i12, on the other hand, the frequency F = 1/(i12 - i11) and the amplitude A = |X(i12) - X(i11)| are obtained and supplied to the filter coefficient generation unit 3. The F obtained here differs greatly from the frequency f8 of the input signal.

FIGS. 16A and 16B illustrate the operation of the specific frequency detection unit 6 when the sine-wave audio signal X of frequency f8 is input. The horizontal axis represents time i. Here, the specific frequency Fd is assumed to be set near Fs/2. Because the audio signal X shown in FIG. 16A is a sine wave whose frequency f8 is near the specific frequency Fd, the unit outputs to the filter coefficient generation unit 3 a spectrum intensity R with a large value, as shown in FIG. 16B.

Because a large spectrum intensity R is input, the filter coefficient generation unit 3 outputs to the filter unit 5 coefficients that pass the input through unchanged.

FIGS. 17A and 17B illustrate the operation of the original data bit shift unit 4 when the sine-wave audio signal X of frequency f8 is input. The horizontal axis represents time i and the vertical axis represents the signal level. FIG. 17A shows the n-bit audio signal X, and FIG. 17B shows the n+α-bit audio signal X'. The original data bit shift unit 4 shifts the n-bit audio signal of FIG. 17A by α bits and outputs the resulting n+α-bit audio signal of FIG. 17B to the filter unit 5.

FIGS. 18A and 18B illustrate the operation of the filter unit 5 when the sine-wave audio signal X of frequency f8 is input. The horizontal axis represents time i and the vertical axis represents the signal level. FIG. 18A shows the n+α-bit audio signal X'(i), and FIG. 18B shows the filtered n+α-bit audio signal Y(i).
Because the filter unit 5 receives filter coefficients that pass the input through unchanged, it outputs the n+α-bit audio signal of FIG. 18A as is (FIG. 18B).

In this way, when the n-bit audio signal contains the specific frequency, the audio signal processing apparatus of FIG. 1 outputs an n+α-bit audio signal to which only the bit shift has been applied.

The operation when the specific frequency detection unit 6 is not provided will now be described with reference to FIGS. 19A and 19B, in which the horizontal axis represents frequency f. In the intervals i4 to i11 and i12 to i19 of FIGS. 14A to 14C and FIGS. 15A to 15D, the frequency is obtained accurately as described above, so low-pass filter coefficients are generated that realize the cutoff frequency characteristic shown by the dotted line in FIG. 19A. In the interval i11 to i12, however, the obtained frequency is wrong, so low-pass filter coefficients with the cutoff frequency characteristic shown by the dotted line in FIG. 19B are generated. As a result, the smoothing process also removes the frequency f8 in the interval i11 to i12.

In contrast, when the specific frequency detection unit 6 is provided and the input audio signal contains a high frequency that is crushed by quantization, the frequency amplitude estimation unit 2 still misdetects the frequency, but the specific frequency detection unit 6 detects the frequency correctly, so the fundamental frequency is not removed.

As described above, the audio signal processing apparatus of FIG. 1 estimates the frequency and amplitude of the audio signal, generates low-pass filter coefficients whose cutoff frequency characteristic and order are based on that frequency and amplitude, and smooths the signal with them, so waveform distortion caused by quantization of the audio signal can be corrected accurately. Because an edge-preserving smoothing filter is used for the smoothing, the reproducibility of input waveforms containing regions where the signal amplitude changes steeply and greatly is not impaired.

In addition, when the n-bit audio signal contains the specific frequency, an n+α-bit audio signal to which only the bit shift has been applied can be output.

The description so far has assumed that the filter unit 5 operates as an ε filter, but the audio signal processing apparatus of FIG. 1 is not limited to this configuration. A similar effect can be expected when the filter unit operates as another type of edge-preserving filter, such as a bilateral filter or a trimmed mean filter (DW-MTM filter).

Likewise, the specific frequency has so far been described as being near half the sampling frequency, but the audio signal processing apparatus of FIG. 1 is not limited to this frequency. If the specific frequency is set to an arbitrary frequency, for example 10 kHz, signals containing that frequency are output without smoothing.

FIG. 20 is a flowchart showing the processing steps of the audio signal processing apparatus of FIG. 1 described above.
First, an n-bit audio signal is input from the input terminal 1 to the frequency amplitude estimation unit 2, the original data bit shift unit 4, and the specific frequency detection unit 6. The original data bit shift unit 4 outputs to the filter unit 5 an n+α-bit audio signal obtained by shifting the n-bit audio signal by α bits (ST11). The frequency amplitude estimation unit 2 estimates the frequency F and the amplitude A from the n-bit audio signal and outputs them to the filter coefficient generation unit 3 (ST12). The specific frequency detection unit 6 calculates the spectrum intensity R at the specific frequency Fd and outputs to the filter coefficient generation unit 3 a signal DR indicating whether the intensity R is greater than the threshold RTH (ST13). If the intensity R is equal to or less than the threshold RTH (YES in step ST14), processing proceeds to step ST15; if the intensity R is greater than the threshold RTH (NO in step ST14), processing proceeds to step ST17.

In step ST15, the filter coefficient generation unit 3 generates low-pass filter coefficients based on the frequency F and the amplitude A and outputs them to the filter unit 5.
In this case, because low-pass filter coefficients are supplied, the filter unit 5 operates as an edge-preserving smoothing filter, applies low-pass filtering to the n+α-bit audio signal, and outputs an n+α-bit audio signal (ST16).
In step ST17, the filter coefficient generation unit 3 generates filter coefficients that pass the input through unchanged.
In this case, the filter unit 5 multiplies the sample value at the point of interest of the input n+α-bit audio signal by the coefficient "1" and the sample values before and after it by "0", and thereby outputs the input audio signal unchanged (ST18).
The audio signal output from the filter unit 5 in step ST16 or in step ST18 becomes the output of the audio signal processing apparatus.
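The control flow of steps ST11 to ST18 can be sketched in software as below. This is a simplified illustration under several assumptions: the frequency and amplitude estimation of ST12 is not implemented and its result is represented by a fixed `lowpass` argument, the spectrum intensity uses cosine and sine patterns over a sliding window, the ε filter uses the common form with clamped ends, and the decision is made per sample. All function and parameter names are hypothetical.

```python
import math

def eps_filter_at(x, i, coeffs, eps):
    """Epsilon-filter output at position i (ends clamped; an assumption)."""
    m = len(coeffs) // 2
    acc = 0.0
    for k in range(-m, m + 1):
        j = min(max(i - k, 0), len(x) - 1)
        u = x[j] - x[i]
        if abs(u) <= eps:
            acc += coeffs[k + m] * u
    return x[i] + acc

def intensity_at(x, i, fd, window):
    """R(i) at frequency fd (cycles per sample) over the last `window` samples."""
    rc = rs = 0.0
    for j in range(max(0, i - window + 1), i + 1):
        rc += x[j] * math.cos(2.0 * math.pi * fd * j)
        rs += x[j] * math.sin(2.0 * math.pi * fd * j)
    return rc * rc + rs * rs

def process(x, alpha, fd, rth, lowpass, window=8):
    shifted = [s << alpha for s in x]                        # ST11: bit shift
    eps = 1 << alpha
    passthrough = [0.0, 1.0, 0.0]
    out = []
    for i in range(len(x)):
        r = intensity_at(x, i, fd, window)                   # ST13: intensity R
        if r <= rth:                                         # ST14: YES
            coeffs = lowpass                                 # ST15 (ST12 result assumed given)
        else:                                                # ST14: NO
            coeffs = passthrough                             # ST17
        out.append(eps_filter_at(shifted, i, coeffs, eps))   # ST16 / ST18
    return out

lowpass = [1.0 / 3.0] * 3
# A slow 1-LSB step: R stays small, so the step is smoothed.
step = process([0, 0, 0, 0, 1, 1, 1, 1], alpha=2, fd=0.5, rth=5.0, lowpass=lowpass)
# A tone at half the sampling frequency: R grows, so later samples pass through.
tone = process([0, 1, 0, 1, 0, 1, 0, 1, 0, 1], alpha=2, fd=0.5, rth=5.0, lowpass=lowpass)
```

In this sketch the first few samples of the tone are still filtered because the window has not yet accumulated enough energy; how a real implementation handles start-up is not stated in the text.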

The audio signal processing apparatus shown in FIG. 1 can also be implemented in software, that is, by a programmed computer. In that case, the processing procedure executed by the computer is the same as that described above with reference to FIG. 20.

Embodiment 2.
FIG. 21 shows an audio signal processing apparatus according to Embodiment 2. The apparatus shown in FIG. 21 is largely the same as the apparatus of Embodiment 1 described with reference to FIG. 1, but the configuration of the selective filter processing unit 10 differs. The illustrated selective filter processing unit 10 includes: a filter coefficient generation unit 13 that receives signals indicating the frequency F and amplitude A estimated by the frequency amplitude estimation unit 2 and generates low-pass filter coefficients C; an edge-preserving smoothing filter unit 15 that receives the bit-shifted audio signal X' and outputs the audio signal filtered with the low-pass filter coefficients C; and a selection unit 16 that selects and outputs either the output signal of the edge-preserving smoothing filter unit 15 or the bit-shifted audio signal X'.

Based on the frequency F and amplitude A estimated by the frequency amplitude estimation unit 2, the filter coefficient generation unit 13 generates low-pass filter coefficients C that cause the edge-preserving smoothing filter unit 15 to perform edge-preserving smoothing on the bit-shifted audio signal X'.
This processing is the same as the generation of the low-pass filter coefficients C performed by the filter coefficient generation unit 3 of FIG. 1 when the spectrum intensity R at the specific frequency is equal to or less than the threshold RTH, and the coefficients can be generated in the same way as in Embodiment 1.

The edge-preserving smoothing filter unit 15 filters the bit-shifted audio signal X' using the filter coefficients C generated by the filter coefficient generation unit 13 and outputs the result.
The selection unit 16 selects the output signal of the edge-preserving smoothing filter unit 15 when the spectrum intensity R at the specific frequency detected by the specific frequency detection unit 6 is equal to or less than the predetermined threshold RTH, and selects the bit-shifted audio signal X' when that intensity R is greater than the threshold.
The selective filter processing unit 10 configured in this way has, as a whole, the same function as the selective filter processing unit 10 of the Embodiment 1 apparatus shown in FIG. 1, and its output signal Y is the same as in Embodiment 1. The audio signal processing apparatus of Embodiment 2 therefore provides the same effects as that of Embodiment 1.
As with the filter unit 5 in Embodiment 1, an ε filter can be used as the edge-preserving smoothing filter unit 15.
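The equivalence of the two arrangements can be checked with a small sketch. It assumes the common ε-filter form with clamped ends and takes the decision signal DR as a given per-sample list (1 when R is greater than RTH); the function names `embodiment1` and `embodiment2` and the sample data are hypothetical.

```python
def eps_filter_at(x, i, coeffs, eps):
    """Epsilon-filter output at position i (ends clamped; an assumption)."""
    m = len(coeffs) // 2
    acc = 0.0
    for k in range(-m, m + 1):
        j = min(max(i - k, 0), len(x) - 1)
        u = x[j] - x[i]
        if abs(u) <= eps:
            acc += coeffs[k + m] * u
    return x[i] + acc

def embodiment1(shifted, dr, lowpass, eps):
    """One filter whose coefficients are switched by DR (FIG. 1 arrangement)."""
    passthrough = [0.0, 1.0, 0.0]
    return [eps_filter_at(shifted, i, passthrough if dr[i] else lowpass, eps)
            for i in range(len(shifted))]

def embodiment2(shifted, dr, lowpass, eps):
    """Filter always runs; a selector picks its output or the bit-shifted input (FIG. 21 arrangement)."""
    filtered = [eps_filter_at(shifted, i, lowpass, eps) for i in range(len(shifted))]
    return [shifted[i] if dr[i] else filtered[i] for i in range(len(shifted))]

shifted = [0, 0, 4, 4, 0, 4, 0, 4]   # bit-shifted signal X' (alpha = 2)
dr = [0, 0, 0, 0, 1, 1, 1, 1]        # DR = 1 where the specific frequency is detected
lowpass = [1.0 / 3.0] * 3

out1 = embodiment1(shifted, dr, lowpass, eps=4)
out2 = embodiment2(shifted, dr, lowpass, eps=4)
```

Both functions return the same sequence. The second arrangement trades some extra filtering work for a simpler coefficient generator, since the pass-through case is handled by the selector rather than by special coefficients.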

The operation of the audio signal processing apparatus of Embodiment 2 can also be represented by the flowchart of FIG. 22.
First, an n-bit audio signal is input from the input terminal 1 to the frequency amplitude estimation unit 2, the original data bit shift unit 4, and the specific frequency detection unit 6. The original data bit shift unit 4 outputs to the edge-preserving smoothing filter unit 15 an n+α-bit audio signal obtained by shifting the n-bit audio signal by α bits (ST11). The frequency amplitude estimation unit 2 estimates the frequency F and the amplitude A from the n-bit audio signal and outputs them to the filter coefficient generation unit 13 (ST12). The specific frequency detection unit 6 calculates the spectrum intensity R at the specific frequency Fd and outputs to the selection unit 16 a signal DR indicating whether the intensity R is greater than the threshold RTH (ST13). If the intensity R is equal to or less than the threshold RTH (YES in ST14), processing proceeds to step ST15; if the intensity R is greater than the threshold RTH (NO in ST14), processing proceeds to step ST22.

In step ST15, the filter coefficient generation unit 3 generates low-pass filter coefficients based on the frequency F and the amplitude A and outputs them to the edge-preserving smoothing filter unit 15.
In this case, the edge-preserving smoothing filter unit 15 applies edge-preserving smoothing filter processing to the (n+α)-bit audio signal using the supplied filter coefficients and outputs an (n+α)-bit audio signal (ST16), and the selection unit 16 selects the output of the edge-preserving smoothing filter unit 15 (ST21).
In step ST22, the selection unit 16 selects and outputs the output of the original data bit shift unit 4.
As a result of this processing, either the audio signal output from the edge-preserving smoothing filter unit 15 in step ST16 or the audio signal output from the original data bit shift unit 4 in step ST11 becomes the output of the audio signal processing apparatus.
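The branch in steps ST13 to ST22 can be illustrated with a minimal Python sketch. It is not the implementation of this patent: the function names, the fixed coefficient list passed in place of the coefficients generated in ST15, the value of `eps`, and the clamping at the ends of the signal are all assumptions made for illustration. The intensity R is computed here as the sum of the squared cosine and sine correlations at Fd, and the edge-preserving smoothing is a textbook ε filter.

```python
import math

def bit_shift(samples, alpha):
    """ST11: extend n-bit samples to n+alpha bits by a left shift."""
    return [s << alpha for s in samples]

def spectrum_intensity(samples, fd, fs):
    """ST13: intensity R at frequency fd (sum of squared cos/sin correlations)."""
    c = sum(x * math.cos(2 * math.pi * fd * k / fs) for k, x in enumerate(samples))
    s = sum(x * math.sin(2 * math.pi * fd * k / fs) for k, x in enumerate(samples))
    return c * c + s * s

def epsilon_filter(samples, coeffs, eps):
    """ST16: epsilon filter. Neighbours that differ from the centre sample
    by more than eps contribute nothing, so large steps (edges) are kept."""
    half = len(coeffs) // 2
    last = len(samples) - 1
    out = []
    for i, centre in enumerate(samples):
        acc = 0.0
        for j, c in enumerate(coeffs):
            k = min(max(i + j - half, 0), last)  # clamp at the ends (assumption)
            d = samples[k] - centre
            acc += c * (d if abs(d) <= eps else 0)
        out.append(centre + round(acc))
    return out

def process(samples, alpha, coeffs, eps, fd, fs, rth):
    """Select between the smoothed and the merely bit-shifted signal."""
    shifted = bit_shift(samples, alpha)           # ST11
    r = spectrum_intensity(samples, fd, fs)       # ST13
    if r <= rth:                                  # ST14: YES
        return epsilon_filter(shifted, coeffs, eps)  # ST16, ST21
    return shifted                                # ST14: NO -> ST22

if __name__ == "__main__":
    lowpass = [1 / 3, 1 / 3, 1 / 3]  # hypothetical stand-in for the ST15 output
    print(process([0, 0, 0, 1, 1, 1], 2, lowpass, 4, 24000, 48000, 2.0))
    print(process([0, 1, 0, 1, 0, 1], 2, lowpass, 4, 24000, 48000, 2.0))
```

With these hypothetical settings, the one-LSB step is smoothed into intermediate levels of the extended word, while the input alternating at half the sampling frequency exceeds the threshold and is passed through bit-shifted only.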

The audio signal processing apparatus shown in FIG. 21 can also be realized in software, that is, by a programmed computer. In that case, the processing procedure performed by the computer is the same as that described above with reference to FIG. 22.
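As one illustration of such a software realization, the frequency and amplitude estimation of step ST12 might be sketched in Python as follows. The sign-change rule follows the wording of the claims (a first derivative that goes from negative or zero to positive, or from positive or zero to negative). The function names, the use of a simple first difference as the first derivative, and the assumption that the interval between two adjacent inflection points corresponds to half a period are illustrative choices, not details confirmed by the source.

```python
def first_difference(samples):
    """First derivative data, approximated by the difference of adjacent samples."""
    return [b - a for a, b in zip(samples, samples[1:])]

def inflection_points(samples):
    """Indices at which the first difference changes sign to positive or negative."""
    d = first_difference(samples)
    points = []
    for i in range(1, len(d)):
        to_positive = d[i - 1] <= 0 and d[i] > 0
        to_negative = d[i - 1] >= 0 and d[i] < 0
        if to_positive or to_negative:
            points.append(i)
    return points

def estimate_frequency_amplitude(samples, fs):
    """Return (frequency, amplitude) for each pair of adjacent inflection points.

    Assumption: the interval between adjacent inflection points is half a
    period, so frequency = fs / (2 * interval). The amplitude is taken as the
    level difference between the two points.
    """
    pts = inflection_points(samples)
    estimates = []
    for a, b in zip(pts, pts[1:]):
        frequency = fs / (2 * (b - a))
        amplitude = abs(samples[b] - samples[a])
        estimates.append((frequency, amplitude))
    return estimates

if __name__ == "__main__":
    triangle = [0, 1, 2, 3, 2, 1, 0, 1, 2, 3]
    print(inflection_points(triangle))
    print(estimate_frequency_amplitude(triangle, 48000))
```

For the triangular test signal, whose period is six samples, the sketch finds the peak and the trough and reports one estimate for the interval between them.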

The audio signal processing apparatus of the present invention can be applied to, for example, audio signal processing apparatuses such as car audio systems.

FIG. 1 is a block diagram showing the configuration of the audio signal processing apparatus according to Embodiment 1 of the present invention.
FIGS. 2(a) to 2(c) are diagrams showing an example of an audio signal processed in Embodiment 1.
FIGS. 3(a) and 3(b) are diagrams showing the operation of the original data bit shift unit 4.
FIG. 4 is a block diagram showing an example of the frequency amplitude estimation unit 2.
FIGS. 5(a) to 5(d) are diagrams showing the operation of the frequency amplitude estimation unit 2 of FIG. 4.
FIGS. 6(a) and 6(b) are diagrams showing the operation of the specific frequency detection unit 6.
FIGS. 7(a) to 7(d) are diagrams showing the operation of the specific frequency detection unit 6.
FIGS. 8(a) to 8(d) are diagrams showing the operation of the specific frequency detection unit 6.
FIGS. 9(a) and 9(b) are diagrams showing the operation of the specific frequency detection unit 6.
FIG. 10 is a diagram showing audio signal frequencies and the low-pass characteristics suited to them.
FIG. 11 is a diagram showing an example of the coefficient table stored in the filter coefficient generation unit 3.
FIGS. 12(a) and 12(b) are diagrams for explaining the operation of the ε filter.
FIGS. 13(a) and 13(b) are diagrams showing the operation of the filter unit 5 when it operates as an ε filter.
FIGS. 14(a) to 14(c) are diagrams showing an example of the n-bit audio signal in Embodiment 1.
FIGS. 15(a) to 15(d) are diagrams showing the operation of the frequency amplitude estimation unit 2.
FIGS. 16(a) and 16(b) are diagrams showing the operation of the specific frequency detection unit 6.
FIGS. 17(a) and 17(b) are diagrams showing the operation of the original data bit shift unit 4.
FIGS. 18(a) and 18(b) are diagrams showing the operation of the filter unit 5.
FIGS. 19(a) and 19(b) are diagrams for explaining the operation when the specific frequency detection unit 6 is not provided.
FIG. 20 is a flowchart showing the processing steps of the audio signal processing apparatus according to Embodiment 1.
FIG. 21 is a block diagram showing the configuration of the audio signal processing apparatus according to Embodiment 2 of the present invention.
FIG. 22 is a flowchart showing the processing steps of the audio signal processing apparatus according to Embodiment 2.

Explanation of symbols

1 input terminal, 2 frequency amplitude estimation unit, 3 filter coefficient generation unit, 4 original data bit shift unit, 5 filter unit, 6 specific frequency detection unit, 7 inflection point detection unit, 8 frequency estimation unit, 9 amplitude estimation unit, 11 first derivative calculation unit, 12 sign change point detection unit, 13 filter coefficient generation unit, 15 edge-preserving smoothing filter unit, 16 selection unit.

Claims

1. An audio signal processing apparatus comprising:
an original data bit shift unit that shifts an n-bit (n is a positive integer) audio signal by α bits (α is a positive integer) to generate an (n+α)-bit audio signal;
a frequency amplitude estimation unit that detects inflection points of the n-bit audio signal and estimates the frequency and amplitude of the audio signal from the inflection points;
a specific frequency detection unit that determines whether or not the intensity of the spectrum at a predetermined specific frequency of the audio signal is greater than a predetermined threshold; and
a selective filter processing unit that, when the specific frequency detection unit determines that the intensity of the spectrum at the specific frequency is equal to or less than the predetermined threshold, outputs an audio signal obtained by applying, to the audio signal output from the original data bit shift unit, edge-preserving smoothing processing using low-pass filter coefficients generated based on the frequency and amplitude estimated by the frequency amplitude estimation unit, and that, when the specific frequency detection unit determines that the intensity of the spectrum at the specific frequency is greater than the threshold, outputs the audio signal output from the original data bit shift unit as it is.

2. The audio signal processing apparatus according to claim 1, wherein the selective filter processing unit includes:
a filter coefficient generation unit that generates filter coefficients based on a determination result signal, from the specific frequency detection unit, indicating whether or not the intensity of the spectrum at the specific frequency is greater than the predetermined threshold, and on a signal indicating the frequency and amplitude estimated by the frequency amplitude estimation unit; and
a filter unit that performs filter processing on the audio signal output from the original data bit shift unit and outputs the filtered audio signal,
wherein the filter coefficient generation unit
generates low-pass filter coefficients for applying edge-preserving smoothing processing to the audio signal output from the original data bit shift unit when the determination result signal indicates that the intensity of the spectrum at the specific frequency has been determined to be equal to or less than the predetermined threshold, and
generates filter coefficients that cause the filter unit to output the audio signal output from the original data bit shift unit as it is when the determination result signal indicates that the intensity of the spectrum at the specific frequency has been determined to be greater than the predetermined threshold, and
wherein the filter unit performs the filter processing on the audio signal output from the original data bit shift unit using the filter coefficients generated by the filter coefficient generation unit.

3. The audio signal processing apparatus according to claim 2, wherein, when the intensity of the spectrum at the specific frequency is equal to or less than the threshold, the filter coefficient generation unit decreases the order of the low-pass filter coefficients as the estimated amplitude increases.

4. The audio signal processing apparatus according to claim 3, wherein the filter coefficient generation unit has a filter coefficient table that stores, as candidates for the low-pass filter coefficients, a plurality of low-pass filter coefficients differing in cutoff frequency and order.

5. The audio signal processing apparatus according to claim 2, wherein the filter unit operates as an ε filter when the low-pass filter coefficients are given.

6. The audio signal processing apparatus according to claim 1, wherein the selective filter processing unit includes:
a filter coefficient generation unit that generates low-pass filter coefficients based on a signal indicating the frequency and amplitude estimated by the frequency amplitude estimation unit;
an edge-preserving smoothing filter unit that receives the audio signal output from the original data bit shift unit, performs edge-preserving smoothing filter processing using the low-pass filter coefficients, and outputs the filtered audio signal; and
a selection unit that selects and outputs either the output signal of the edge-preserving smoothing filter unit or the audio signal output from the original data bit shift unit,
wherein the selection unit
selects the output signal of the edge-preserving smoothing filter unit when the specific frequency detection unit determines that the intensity of the spectrum at the specific frequency is equal to or less than the predetermined threshold, and
selects the audio signal output from the original data bit shift unit when the specific frequency detection unit determines that the intensity of the spectrum at the specific frequency is greater than the threshold.

7. The audio signal processing apparatus according to claim 6, wherein the filter coefficient generation unit decreases the order of the low-pass filter as the estimated amplitude increases.

8. The audio signal processing apparatus according to claim 7, wherein the filter coefficient generation unit has a filter coefficient table that stores, as candidates for the low-pass filter coefficients, a plurality of low-pass filter coefficients differing in cutoff frequency and order.

9. The audio signal processing apparatus according to claim 6, wherein the edge-preserving smoothing filter unit is an ε filter.

10. The audio signal processing apparatus according to claim 1, wherein the frequency amplitude estimation unit includes:
an inflection point detection unit that detects, as the inflection points, points at which a series of monotonic increases of the n-bit audio signal starts or ends and points at which a series of monotonic decreases starts or ends;
a frequency estimation unit that estimates the frequency from the length of the interval between the positions of the detected inflection points; and
an amplitude estimation unit that estimates the level difference between the detected inflection points as the amplitude.

11. The audio signal processing apparatus according to claim 10, wherein the inflection point detection unit includes:
a first derivative calculation unit that calculates first derivative data of the n-bit audio signal; and
a sign change point detection unit that detects, as the inflection points, points at which the sign of the first derivative data changes to positive and points at which the sign changes to negative.

12. The audio signal processing apparatus according to claim 11, wherein the sign change point detection unit detects, as the points at which the sign has changed, points at which the first derivative data changes from a state in which its sign is negative or its value is zero to a state in which its sign is positive, and points at which the first derivative data changes from a state in which its sign is positive or its value is zero to a state in which its sign is negative.

13. The audio signal processing apparatus according to claim 1, wherein the specific frequency for which the specific frequency detection unit calculates the intensity of the spectrum is about half of the sampling frequency.

14. The audio signal processing apparatus according to claim 1, wherein the specific frequency detection unit calculates the intensity of the spectrum at the specific frequency based on the results of superimposing basic patterns of a cosine wave and a sine wave of the specific frequency on the input audio signal.

15. The audio signal processing apparatus according to claim 14, wherein the specific frequency detection unit calculates, as the intensity of the spectrum at the specific frequency, the sum of the square of the value of the signal obtained by superimposing the basic pattern of the cosine wave of the specific frequency on the input audio signal and the square of the value of the signal obtained by superimposing the basic pattern of the sine wave of the specific frequency on the input audio signal.

16. An audio signal processing method comprising:
an original data bit shift step of shifting an n-bit (n is a positive integer) audio signal by α bits (α is a positive integer) to generate an (n+α)-bit audio signal;
a frequency amplitude estimation step of detecting inflection points of the n-bit audio signal and estimating the frequency and amplitude of the audio signal from the inflection points;
a specific frequency detection step of determining whether or not the intensity of the spectrum at a predetermined specific frequency of the audio signal is greater than a predetermined threshold; and
a selective filter processing step of, when it is determined in the specific frequency detection step that the intensity of the spectrum at the specific frequency is equal to or less than the predetermined threshold, outputting an audio signal obtained by applying, to the audio signal bit-shifted in the original data bit shift step, edge-preserving smoothing processing using low-pass filter coefficients generated based on the frequency and amplitude estimated in the frequency amplitude estimation step, and, when it is determined in the specific frequency detection step that the intensity of the spectrum at the specific frequency is greater than the threshold, outputting the (n+α)-bit audio signal as it is.

17. The audio signal processing method according to claim 16, wherein the selective filter processing step includes:
a filter coefficient generation step of generating filter coefficients based on a determination result signal, obtained in the specific frequency detection step, indicating whether or not the intensity of the spectrum at the specific frequency is greater than the predetermined threshold, and on a signal indicating the frequency and amplitude estimated in the frequency amplitude estimation step; and
a filter step of performing filter processing on the (n+α)-bit audio signal and outputting the filtered audio signal,
wherein the filter coefficient generation step
generates low-pass filter coefficients for applying the edge-preserving smoothing processing to the (n+α)-bit audio signal when the determination result signal indicates that the intensity of the spectrum at the specific frequency has been determined to be equal to or less than the predetermined threshold, and
generates filter coefficients that cause the filter step to output the (n+α)-bit audio signal as it is when the determination result signal indicates that the intensity of the spectrum at the specific frequency has been determined to be greater than the predetermined threshold, and
wherein the filter step performs the filter processing on the (n+α)-bit audio signal using the filter coefficients generated in the filter coefficient generation step.

18. The audio signal processing method according to claim 17, wherein, when the intensity of the spectrum at the specific frequency is equal to or less than the threshold, the filter coefficient generation step decreases the order of the low-pass filter coefficients as the estimated amplitude increases.

19. The audio signal processing method according to claim 18, wherein the filter coefficient generation step uses a filter coefficient table that stores, as candidates for the low-pass filter coefficients, a plurality of low-pass filter coefficients differing in cutoff frequency and order.

20. The audio signal processing method according to claim 17, wherein the filter step performs ε filter processing when the low-pass filter coefficients are given.

21. The audio signal processing method according to claim 16, wherein the selective filter processing step includes:
a filter coefficient generation step of generating low-pass filter coefficients based on a signal indicating the frequency and amplitude estimated in the frequency amplitude estimation step;
an edge-preserving smoothing filter step of receiving the (n+α)-bit audio signal as input, performing edge-preserving smoothing filter processing using the low-pass filter coefficients, and outputting the filtered audio signal; and
a selection step of selecting and outputting either the filtered audio signal or the (n+α)-bit audio signal,
wherein the selection step
selects the filtered audio signal when it is determined in the specific frequency detection step that the intensity of the spectrum at the specific frequency is equal to or less than the predetermined threshold, and
selects the (n+α)-bit audio signal when it is determined in the specific frequency detection step that the intensity of the spectrum at the specific frequency is greater than the threshold.

22. The audio signal processing method according to claim 21, wherein the filter coefficient generation step decreases the order of the low-pass filter as the estimated amplitude increases.

23. The audio signal processing method according to claim 22, wherein the filter coefficient generation step uses a filter coefficient table that stores, as candidates for the low-pass filter coefficients, a plurality of low-pass filter coefficients differing in cutoff frequency and order.

24. The audio signal processing method according to claim 21, wherein the edge-preserving smoothing filter step performs ε filter processing.

25. The audio signal processing method according to any one of claims 16 to 24, wherein the frequency amplitude estimation step includes:
an inflection point detection step of detecting, as the inflection points, points at which a series of monotonic increases of the n-bit audio signal starts or ends and points at which a series of monotonic decreases starts or ends;
a frequency estimation step of estimating the frequency from the length of the interval between the positions of the detected inflection points; and
an amplitude estimation step of estimating the level difference between the detected inflection points as the amplitude.

26. The audio signal processing method according to claim 25, wherein the inflection point detection step includes:
a first derivative calculation step of calculating first derivative data of the n-bit audio signal; and
a sign change point detection step of detecting, as the inflection points, points at which the sign of the first derivative data changes to positive and points at which the sign changes to negative.

27. The audio signal processing method according to claim 26, wherein the sign change point detection step detects, as the points at which the sign has changed, points at which the first derivative data changes from a state in which its sign is negative or its value is zero to a state in which its sign is positive, and points at which the first derivative data changes from a state in which its sign is positive or its value is zero to a state in which its sign is negative.

28. The audio signal processing method according to any one of claims 16 to 27, wherein the specific frequency for which the intensity of the spectrum is calculated in the specific frequency detection step is about half of the sampling frequency.

29. The audio signal processing method according to claim 1, wherein the specific frequency detection step calculates the intensity of the spectrum at the specific frequency based on the results of superimposing basic patterns of a cosine wave and a sine wave of the specific frequency on the input audio signal.

30. The audio signal processing method according to claim 29, wherein the specific frequency detection step calculates, as the intensity of the spectrum at the specific frequency, the sum of the square of the value of the signal obtained by superimposing the basic pattern of the cosine wave of the specific frequency on the input audio signal and the square of the value of the signal obtained by superimposing the basic pattern of the sine wave of the specific frequency on the input audio signal.