JPS5848917B2

JPS5848917B2 - Smoothing method for audio spectrum change rate

Info

Publication number: JPS5848917B2
Application number: JP52057763A
Authority: JP
Inventors: 大和佐藤; 芳典匂坂
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1977-05-20
Filing date: 1977-05-20
Publication date: 1983-10-31
Also published as: JPS53143102A

Abstract

PURPOSE: To make the spectral change at phoneme boundaries clearly observable by weighting the differences of the speech spectral envelope, thereby eliminating temporal fluctuation in the spectral change.

Description

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to a method for smoothing the speech spectrum change rate used in various kinds of speech analysis.

In speech analysis, it is well known that the spectral change rate is very effective for dividing uttered natural speech into individual phonemes (a process called segmentation).

The speech spectrum is stable in the steady-state portion of a phoneme, but it changes markedly over time at phoneme boundaries.

Phoneme divisions can therefore be determined from the temporal characteristics of the spectral change.

A desirable spectral change rate is one whose fluctuation is small and which accurately captures steep changes.

Commonly used calculation methods include determining the spectral change between two adjacent points in time, and determining the time series of the difference from a standard spectrum prepared in advance.

However, the values obtained by these calculation methods generally fluctuate widely, which makes them difficult to use directly as parameters for segmentation, so some kind of smoothing is required.

Conventional smoothing methods include storing the spectral change rate calculated for each frame in a buffer memory, or using a delay element, and taking the arithmetic mean with the temporally preceding values, as well as using a low-pass filter. These methods, however, not only smooth the spectral change rate but also distort its steep changes, so that accuracy is lost.
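The drawback described above can be illustrated with a minimal sketch. The track values and the span of five frames are hypothetical and are chosen only to show how a causal arithmetic mean lowers and spreads an abrupt peak.

```python
def moving_average(values, span=5):
    """Average each value with up to (span - 1) values that precede it."""
    out = []
    for i in range(len(values)):
        start = max(0, i - span + 1)
        window = values[start:i + 1]
        out.append(sum(window) / len(window))
    return out


# Hypothetical change-rate track: flat, one abrupt peak, flat again.
track = [0.0] * 5 + [10.0] + [0.0] * 5
smoothed = moving_average(track, span=5)

# The single-frame peak of height 10 becomes a plateau of height 2
# that lasts five frames, so the boundary is no longer sharply located.
print(smoothed)
```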

The present invention provides a smoothing method that removes the fluctuation of the spectral change rate without losing its steep changes.

The present invention is described in detail below with reference to the drawings.

FIG. 1 shows an embodiment of the method for smoothing the spectral change rate according to the present invention.

First, an audio signal is input from terminal 1. In the audio signal preprocessing section 2, it is low-pass filtered at 4 kHz, sampled at 8 kHz, and quantized to 11 bits, and is thereby converted into a digital audio signal.
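The sampling and quantization step can be sketched as follows. This is only an illustration: the uniform signed quantizer, the clipping behaviour, and the 440 Hz test tone are assumptions not stated in the text, and the 4 kHz low-pass filter, which precedes sampling, is not modelled.

```python
import math

SAMPLE_RATE_HZ = 8000  # sampling rate given in the text


def quantize(samples, bits=11):
    """Uniformly quantize samples in [-1.0, 1.0) to signed `bits`-bit integers.

    The uniform signed representation is an assumption; the text states
    only the word length of 11 bits.
    """
    levels = 2 ** (bits - 1)                     # 1024 for 11 bits
    out = []
    for x in samples:
        q = int(round(x * levels))
        q = max(-levels, min(levels - 1, q))     # clip to [-1024, 1023]
        out.append(q)
    return out


# Hypothetical input: 10 ms of a 440 Hz tone at half of full scale.
tone = [0.5 * math.sin(2 * math.pi * 440.0 * n / SAMPLE_RATE_HZ) for n in range(80)]
digital = quantize(tone)
```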

The spectral envelope calculation section 3 calculates the spectral envelope from this digital audio signal using an FFT (fast Fourier transform) or the like, and stores it in the buffer memory 4.
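One possible way to obtain a spectral envelope from a frame is sketched below. The text specifies only "FFT or the like", so every detail here is an assumption: the Hamming window, the naive DFT standing in for an FFT, the 64-sample frame, and the crude moving-average smoothing across frequency bins.

```python
import cmath
import math


def log_power_spectrum(frame):
    """Power in dB of DFT bins 0..N/2 of a Hamming-windowed frame."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))
        spectrum.append(10.0 * math.log10(abs(s) ** 2 + 1e-12))
    return spectrum


def smooth_envelope(spectrum_db, half_width=2):
    """Crude envelope: moving average of the dB spectrum across frequency bins."""
    out = []
    for k in range(len(spectrum_db)):
        lo = max(0, k - half_width)
        hi = min(len(spectrum_db), k + half_width + 1)
        out.append(sum(spectrum_db[lo:hi]) / (hi - lo))
    return out


# Hypothetical frame: 64 samples of a 1 kHz tone sampled at 8 kHz.
frame = [math.sin(2 * math.pi * 1000.0 * i / 8000.0) for i in range(64)]
spectrum_db = log_power_spectrum(frame)
envelope_db = smooth_envelope(spectrum_db)
```

Each bin k corresponds to the frequency k × 8000 / 64 Hz, so the 1 kHz tone falls on bin 8.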

In calculating the spectral change rate in this embodiment, the Mel conversion circuit 5 applies a Mel conversion to the frequency axis in order to reflect the characteristics of hearing.

This conversion is performed according to the following equation:

$$\mathrm{Mel}(f) = C \cdot \log(1000 + f) \qquad (C:\ \text{constant},\ f:\ \text{frequency})$$

Using the Mel-converted spectral envelope, the spectral envelope difference calculation circuit 6 takes the difference from the envelope separated from it by τ (msec).
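The frequency warping can be sketched in a few lines. The sketch reads the formula as C multiplied by log(1000 + f), which is an interpretation of the printed equation; the natural logarithm and the default C = 1 are further assumptions, since the text gives neither the base nor the value of the constant.

```python
import math


def mel(f_hz, c=1.0):
    """Warped frequency C * log(1000 + f), with f in Hz.

    `c` stands for the unspecified constant C; the natural logarithm
    is an assumption.
    """
    return c * math.log(1000.0 + f_hz)


# The warping compresses high frequencies: equal steps in Hz map to
# smaller and smaller steps on the warped axis.
low_step = mel(1000.0) - mel(0.0)       # log(2000 / 1000)
high_step = mel(4000.0) - mel(3000.0)   # log(5000 / 4000)
```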

This envelope difference d(t, t-τ) is calculated by the following equation.

$$d(t, t-\tau) = \int_{300\,\mathrm{Hz}}^{4000\,\mathrm{Hz}} \left| A(f, t) - A(f, t-\tau) \right|^{2} \, d\mathrm{Mel}(f)$$

Here, A(f, t) is the power (dB) of the frequency component f of the spectral envelope at time t.
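A numerical version of this integral can be sketched as below. Several points are assumptions rather than statements of the source: the exponent on the absolute difference is hard to read in the printed equation and is exposed here as the parameter `p` (default 2), the warping is taken as log(1000 + f) with C = 1, and the integral is approximated by the trapezoidal rule over whatever frequency grid the envelope is sampled on.

```python
import math


def mel(f_hz, c=1.0):
    """Warped frequency, read as C * log(1000 + f) (an assumption)."""
    return c * math.log(1000.0 + f_hz)


def envelope_difference(env_now, env_past, freqs_hz, p=2.0,
                        f_lo=300.0, f_hi=4000.0):
    """Trapezoidal estimate of the integral of |A(f,t) - A(f,t-tau)|**p dMel(f).

    env_now, env_past: envelope power in dB at each frequency in freqs_hz.
    Only grid intervals lying entirely inside [f_lo, f_hi] are used.
    """
    total = 0.0
    for k in range(len(freqs_hz) - 1):
        f0, f1 = freqs_hz[k], freqs_hz[k + 1]
        if f0 < f_lo or f1 > f_hi:
            continue
        g0 = abs(env_now[k] - env_past[k]) ** p
        g1 = abs(env_now[k + 1] - env_past[k + 1]) ** p
        total += 0.5 * (g0 + g1) * (mel(f1) - mel(f0))
    return total


# Hypothetical envelopes on a coarse grid: a uniform 2 dB rise.
freqs = [0.0, 500.0, 1000.0, 2000.0, 4000.0]
d = envelope_difference([3.0] * 5, [1.0] * 5, freqs)
```

Because the integration variable is the warped frequency, a given dB difference contributes more when it lies at low frequencies than at high frequencies.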

In this smoothing method, the d(t, t-τ) obtained in this way is not itself regarded as the spectral change rate at time t and then smoothed. Instead, as shown in the following equation, the values of d(t, t-τ) multiplied by a weight W(τ) are added together over the range -30 < τ < +30 (msec), and smoothing is achieved in this way.

Spectral change rate: [equation missing from the source text]

Because smoothing is performed by observing the spectral change over an interval in this way, the spectral change rate obtained by the above processing not only has small fluctuation but also accurately represents the points where the change is steep.
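The weighted sum described in words above can be written compactly. The following is only a sketch consistent with that description: the symbol $D(t)$, the integral form, and the discrete step $\Delta\tau$ are assumptions, since the equation itself does not survive in this text.

```latex
D(t) = \int_{-30\,\mathrm{ms}}^{+30\,\mathrm{ms}} W(\tau)\, d(t,\, t-\tau)\, d\tau
\;\approx\; \sum_{k} W(\tau_k)\, d(t,\, t-\tau_k)\, \Delta\tau ,
\qquad -30 < \tau_k < +30 \ \text{(msec)}
```

If a weighted average rather than a weighted sum is wanted, the result can be divided by $\sum_k W(\tau_k)\,\Delta\tau$; the text does not say whether such a normalization is applied.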

For this processing, the coefficient unit 7 and the integrator 8 shown in FIG. 1 are used.

In the coefficient unit 7, the envelope difference d(t, t-τ) obtained by the spectral envelope difference calculation circuit 6 is multiplied by the weighting function W(τ).

The integrator 8 adds these products together one after another, and the result is output from terminal 9.
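The multiply-and-accumulate performed by the coefficient unit and the integrator can be sketched as follows. The triangular weight, the 10 msec spacing of τ, and the toy difference function are hypothetical; the text states only that d(t, t-τ) is weighted by W(τ) and summed over -30 < τ < +30 msec.

```python
def triangular_weight(tau_ms, half_width_ms=30.0):
    """Hypothetical W(tau): largest at tau = 0, zero at the window edges."""
    return max(0.0, 1.0 - abs(tau_ms) / half_width_ms)


def smoothed_change_rate(diff, t_ms, taus_ms, weight=triangular_weight):
    """Sum of W(tau) * d(t, t - tau) over the given values of tau.

    `diff(t, t_past)` is any callable returning the envelope difference
    between the two times (in msec).
    """
    acc = 0.0
    for tau in taus_ms:
        acc += weight(tau) * diff(t_ms, t_ms - tau)  # multiply, then accumulate
    return acc


def toy_diff(t, t_past):
    """Toy stand-in for d(t, t - tau): grows with the time separation."""
    return abs(t - t_past)


taus = [-20.0, -10.0, 0.0, 10.0, 20.0]
rate = smoothed_change_rate(toy_diff, 100.0, taus)
```

Any weighting function with the same call signature can be passed as `weight`, so the shape of W(τ) can be changed without touching the accumulation loop.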

FIG. 2 shows an example of this smoothing effect.

The upper panel of FIG. 2 shows the power of the utterance "青い飛行船" ("blue airship"). The middle panel shows the difference d(t, t-10) from the envelope separated by τ = 10 msec; this is the output obtained by bypassing units 7 and 8 in FIG. 1 and feeding the output of the spectral envelope difference calculation circuit 6 directly to terminal 9 (indicated by the dotted line in FIG. 1).

The lower panel shows the result obtained with this smoothing method.

Comparison with the middle panel shows that the effect is clearly evident.

The various constants, the weighting function, and the spectrum calculation method given as examples in the above embodiment are not limiting. What matters in this smoothing method is that a window containing the frequency for which the spectral change rate is to be determined is provided, and that smoothing is performed by taking the weighted average of the spectral change rates within this window.

As explained above, the method of the present invention for smoothing the speech spectrum change rate does not simply obtain the spectral change between two different points in time; it obtains the amount of spectral change by taking a weighted average of the spectral differences within a time window of fixed length. It therefore has the advantage not only that temporal fluctuation in the spectral change can be removed and smoothed, but also that, by giving larger weights to spectral differences between closer points in time, the spectral change at phoneme boundaries can be made clear.

[Brief Description of the Drawings]

FIG. 1 is a diagram of an embodiment of the method of the present invention for smoothing the speech spectrum change rate. FIG. 2 shows, for the utterance "青い飛行船" ("blue airship"), the power in the upper panel, the spectral change rate before smoothing in the middle panel, and the spectral change rate after smoothing in the lower panel.

1: input terminal; 2: audio signal preprocessing section; 3: spectral envelope calculation section; 4: buffer memory; 5: Mel conversion circuit; 6: spectral envelope difference calculation circuit; 7: coefficient unit; 8: integrator; 9: output terminal.

Claims

[Claims]

1. In a method for calculating the rate of spectral change using various parameters that characterize the speech spectrum, a method for smoothing the speech spectrum change rate, characterized in that a window is provided that includes the frequency for which the rate of spectral change is to be determined, and smoothing is performed by taking the weighted average of the spectral differences within this window.