JPS5848917B2 - Smoothing method for audio spectrum change rate - Google Patents
- Publication number
- JPS5848917B2; application JP52057763A (JP5776377A)
- Authority
- JP
- Japan
- Prior art keywords
- spectral
- spectrum
- change rate
- smoothing
- change
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired
Landscapes
- Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
Description
DETAILED DESCRIPTION OF THE INVENTION
The present invention relates to a method for smoothing the speech spectrum change rate, which is used in various kinds of speech analysis.
In speech analysis, it is well known that the spectral change rate is very effective for dividing spoken natural speech into individual phonemes (a process called segmentation).
The speech spectrum is stable in the steady-state portion of a phoneme but changes markedly over time at phoneme boundaries. Phoneme segments can therefore be determined from the temporal characteristics of the spectral change.
A desirable spectral change rate is one that fluctuates little and yet captures steep changes accurately. Commonly used calculation methods include finding the spectral change between two adjacent points in time and finding the time series of the difference from a standard spectrum prepared in advance.
However, the values obtained by these calculation methods generally fluctuate widely, so they are difficult to use directly as parameters for segmentation, and some kind of smoothing is required.
Conventional smoothing methods include storing the spectral change rate calculated for each frame in a buffer memory, or using a delay element, and taking the arithmetic mean with the temporally preceding values. Another conventional method uses a low-pass filter. These methods, however, not only smooth the spectral change rate but also distort its steep changes, so that accuracy is lost.
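The drawback of arithmetic averaging can be illustrated with a small sketch. It assumes a hypothetical, noise-free change-rate sequence with a single sharp peak and a causal moving average over the current frame and its predecessors. The function name `moving_average`, the window length, and the data are illustrative choices and do not come from the patent.

```python
def moving_average(values, length):
    """Causal arithmetic mean of each sample with its (length - 1) predecessors.

    Frames before the start of the sequence are simply left out of the mean.
    """
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - length + 1)
        window = values[start:i + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Hypothetical change-rate track: flat, with one sharp peak at frame 5.
raw = [0.0] * 5 + [8.0] + [0.0] * 5
avg = moving_average(raw, length=4)

peak_raw = max(raw)
peak_avg = max(avg)
# Number of frames over which the peak is spread after averaging.
spread = sum(1 for v in avg if v > 0.0)
print(peak_raw, peak_avg, spread)
```

In this example the peak drops from 8.0 to 2.0 and is spread over four frames, which is the kind of distortion of steep changes that the text attributes to the conventional methods.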
The present invention provides a smoothing method that removes fluctuations in the spectral change rate without losing its steep changes.
The invention is described in detail below with reference to the drawings.
FIG. 1 shows an embodiment of the spectral change rate smoothing method according to the present invention.
First, a speech signal is input from terminal 1. In the speech signal preprocessing section 2, it is low-pass filtered at 4 kHz, sampled at 8 kHz, and quantized to 11 bits, which converts it into a digital speech signal.
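The sampling and quantization step can be sketched as follows. The 8 kHz rate and the 11-bit word length come from the embodiment. Everything else is an assumption made for illustration: a uniform quantizer, a signed code range of -1024 to 1023, input normalized to [-1.0, 1.0), and the helper names `quantize` and `sample_tone`. The 4 kHz low-pass filter is not modeled; the test tone is simply chosen below 4 kHz.

```python
import math

SAMPLE_RATE_HZ = 8000   # sampling rate stated in the embodiment
BITS = 11               # quantization word length stated in the embodiment


def quantize(sample, bits=BITS):
    """Map a sample in [-1.0, 1.0) to a signed integer code (uniform quantizer, assumed)."""
    levels = 2 ** (bits - 1)                      # 1024 steps on each side of zero
    code = int(math.floor(sample * levels))
    return max(-levels, min(levels - 1, code))    # clip to -1024 .. 1023


def sample_tone(freq_hz, duration_s, amplitude=0.5):
    """Sample a test sine wave at 8 kHz and quantize each sample to 11 bits."""
    n = int(round(duration_s * SAMPLE_RATE_HZ))
    return [quantize(amplitude * math.sin(2 * math.pi * freq_hz * k / SAMPLE_RATE_HZ))
            for k in range(n)]


codes = sample_tone(freq_hz=1000.0, duration_s=0.01)   # 10 ms of a 1 kHz tone
print(len(codes), min(codes), max(codes))
```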
The spectral envelope calculation section 3 calculates the spectral envelope from this digital speech signal using an FFT (fast Fourier transform) or a similar method and stores it in buffer memory 4.
In the spectral change rate calculation of this embodiment, the Mel conversion circuit 5 applies a Mel conversion to the frequency axis in order to reflect the characteristics of hearing. The conversion follows the equation

$$\mathrm{Mel}(f) = C \cdot \log(1000 + f) \qquad (C:\ \text{constant},\ f:\ \text{frequency})$$

Using the Mel-converted spectral envelope, the spectral envelope difference calculation circuit 6 takes the difference from the envelope τ (msec.) earlier.
This envelope difference d(t, t−τ) is calculated by the following equation:

$$d(t,\, t-\tau) = \int_{300\ \mathrm{Hz}}^{4000\ \mathrm{Hz}} \bigl| A(f, t) - A(f, t-\tau) \bigr| \; d\,\mathrm{Mel}(f)$$

Here, A(f, t) is the power (dB) of frequency component f of the spectral envelope at time t. The exponent printed after the absolute value in the original equation is not legible in this text and is left out here.
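The envelope difference on the warped frequency axis can be sketched numerically as below. This is only an illustration under stated assumptions: the warping is taken as Mel(f) = C·log(1000 + f) with C = 1, the integral is approximated by a trapezoidal sum, the exponent on the absolute difference is exposed as a parameter whose default of 1.0 is an assumption (the source value is not legible), and the names `mel` and `envelope_difference` as well as the test envelopes are hypothetical.

```python
import math


def mel(f_hz, c=1.0):
    """Frequency warping Mel(f) = C * log(1000 + f); C = 1 is an arbitrary choice."""
    return c * math.log(1000.0 + f_hz)


def envelope_difference(env_now, env_past, freqs_hz, exponent=1.0,
                        f_lo=300.0, f_hi=4000.0):
    """Approximate d(t, t - tau) by a trapezoidal sum over the Mel axis.

    env_now, env_past: envelope power in dB at the frequencies in freqs_hz.
    exponent: power applied to |A(f, t) - A(f, t - tau)|; 1.0 is an assumption.
    """
    points = [(mel(f), abs(a - b) ** exponent)
              for f, a, b in zip(freqs_hz, env_now, env_past)
              if f_lo <= f <= f_hi]
    total = 0.0
    for (m0, g0), (m1, g1) in zip(points, points[1:]):
        total += 0.5 * (g0 + g1) * (m1 - m0)   # trapezoid on the warped axis
    return total


# Hypothetical envelopes on a 100 Hz grid: a uniform 3 dB rise between frames.
freqs = [100.0 * k for k in range(41)]          # 0 .. 4000 Hz
past = [60.0 - 0.005 * f for f in freqs]
now = [p + 3.0 for p in past]

d = envelope_difference(now, past, freqs)
expected = 3.0 * (mel(4000.0) - mel(300.0))     # constant integrand of 3 dB
print(d, expected)
```

Because the integration variable is Mel(f) rather than f, a given dB difference contributes more at low frequencies, where the warped axis is stretched, than at high frequencies.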
This smoothing method does not treat the d(t, t−τ) obtained in this way as the spectral change rate at time t and then smooth it. Instead, as the following expression indicates, d(t, t−τ) is multiplied by a weight W(τ) and the products are summed over −30 < τ < +30 (msec.), and this summation performs the smoothing.
Spectral change rate: the sum of W(τ)·d(t, t−τ) over −30 < τ < +30 (msec.)
Because the smoothing is achieved by observing the spectral change over an interval in this way, the resulting spectral change rate not only fluctuates little but also represents steep changes accurately.
These operations use the coefficient multiplier 7 and the integrator 8 shown in FIG. 1. The coefficient multiplier 7 multiplies the envelope difference d(t, t−τ) obtained by the spectral envelope difference calculation circuit 6 by the weighting function W(τ). The integrator 8 adds the products in sequence, and the result is output from terminal 9.
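The weighted summation over the lag window can be sketched as follows. The patent does not give the shape of W(τ) in this text, so the triangular weights (larger for nearer frames, normalized to sum to 1) are an assumption, as are the frame spacing, the simple stand-in distance `frame_distance`, and all function names and data.

```python
def frame_distance(env_a, env_b):
    """Stand-in for d(t, t - tau): mean absolute dB difference between two envelopes."""
    return sum(abs(a - b) for a, b in zip(env_a, env_b)) / len(env_a)


def triangular_weights(max_lag):
    """W(k) for lags -max_lag..max_lag, larger for nearer frames, summing to 1.

    The triangular shape is an assumption; lag 0 is skipped because d(t, t) = 0.
    """
    raw = {k: max_lag + 1 - abs(k) for k in range(-max_lag, max_lag + 1) if k != 0}
    total = sum(raw.values())
    return {k: w / total for k, w in raw.items()}


def smoothed_change_rate(envelopes, max_lag):
    """Weighted sum of d(t, t - tau) over the lag window, for every frame t."""
    weights = triangular_weights(max_lag)
    rates = []
    for t in range(len(envelopes)):
        acc = 0.0                                  # plays the role of integrator 8
        for k, w in weights.items():
            if 0 <= t - k < len(envelopes):        # skip lags that fall outside the signal
                # the multiplication by w plays the role of coefficient multiplier 7
                acc += w * frame_distance(envelopes[t], envelopes[t - k])
        rates.append(acc)
    return rates


# Hypothetical 2-band envelopes: steady at 40 dB, jumping to 60 dB at frame 10.
envelopes = [[40.0, 40.0]] * 10 + [[60.0, 60.0]] * 10
# With an assumed 5 msec. frame spacing, max_lag=5 covers |tau| <= 25 msec.
rates = smoothed_change_rate(envelopes, max_lag=5)
print([round(r, 2) for r in rates])
```

In this example the rate is zero in the steady regions and highest at the two frames on either side of the jump, falling off as the frame moves away from the boundary.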
FIG. 2 shows an example of this smoothing effect.
The upper panel of FIG. 2 shows the power of the utterance "青い飛行船" ("blue airship"). The middle panel shows the difference d(t, t−10) from the envelope τ = 10 msec. earlier. This is the output of the spectral envelope difference calculation circuit 6 sent directly to terminal 9, bypassing elements 7 and 8 of FIG. 1 (indicated by the dotted line in FIG. 1).
The lower panel shows the result of this smoothing method. Comparison with the middle panel shows that the method is clearly effective.
The constants, weighting function, and spectrum calculation method given in the above embodiment are examples, and the method is not limited to them. What is essential to this smoothing method is to provide a window that includes the frequency at which the spectral change rate is to be obtained and to smooth by taking the weighted average of the spectral change rate within this window.
As described above, the speech spectrum change rate smoothing method of the present invention does not simply obtain the spectral change between two different points in time. It obtains the amount of spectral change as a weighted average of the spectral differences within a time window of fixed length. It can therefore remove temporal fluctuations in the spectral change and smooth it. In addition, by giving larger weights to spectral differences between closer points in time, it has the advantage of making spectral changes at phoneme boundaries distinct.
FIG. 1 shows an embodiment of the speech spectrum change rate smoothing method of the present invention. FIG. 2 shows, for the utterance "青い飛行船" ("blue airship"), the power in the upper panel, the spectral change rate before smoothing in the middle panel, and the spectral change rate after smoothing in the lower panel.
DESCRIPTION OF SYMBOLS: 1: input terminal, 2: speech signal preprocessing section, 3: spectral envelope calculation section, 4: buffer memory, 5: Mel conversion circuit, 6: spectral envelope difference calculation circuit, 7: coefficient multiplier, 8: integrator, 9: output terminal.
Claims (1)
1. A method for smoothing a speech spectrum change rate, for use in calculating the spectral change rate from various parameters that characterize the speech spectrum, characterized in that a window including the frequency at which the spectral change rate is to be obtained is provided, and the spectral change rate is smoothed by taking the weighted average of the spectral differences within this window.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP52057763A JPS5848917B2 (en) | 1977-05-20 | 1977-05-20 | Smoothing method for audio spectrum change rate |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP52057763A JPS5848917B2 (en) | 1977-05-20 | 1977-05-20 | Smoothing method for audio spectrum change rate |
Publications (2)
Publication Number | Publication Date |
---|---|
JPS53143102A (en) | 1978-12-13 |
JPS5848917B2 (en) | 1983-10-31 |
Family
ID=13064907
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP52057763A Expired JPS5848917B2 (en) | 1977-05-20 | 1977-05-20 | Smoothing method for audio spectrum change rate |
Country Status (1)
Country | Link |
---|---|
JP (1) | JPS5848917B2 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2003284654A1 (en) | 2002-11-25 | 2004-06-18 | Matsushita Electric Industrial Co., Ltd. | Speech synthesis method and speech synthesis device |
1977
- 1977-05-20: JP application JP52057763A, patent JPS5848917B2 (en), not active (expired)
Also Published As
Publication number | Publication date |
---|---|
JPS53143102A (en) | 1978-12-13 |