JPH0377520B2 - Extraction of pitch - Google Patents

Info

Publication number
JPH0377520B2
Authority
JP
Japan
Prior art keywords
waveform
maximum value
calculation
pitch
clipper
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
JP57187105A
Other languages
Japanese (ja)
Other versions
JPS5975297A (en)
Inventor
Taisuke Watanabe
Tatsuya Kimura
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Panasonic Holdings Corp
Original Assignee
Matsushita Electric Industrial Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matsushita Electric Industrial Co Ltd
Priority to JP18710582A
Publication of JPS5975297A
Publication of JPH0377520B2
Granted

Description

DETAILED DESCRIPTION OF THE INVENTION

Field of Industrial Application

The present invention relates to a method for extracting the pitch period of speech, which is one of the important basic parameters in a speech analysis-synthesis system.

Conventional Configuration and Its Problems

A speech analysis-synthesis system extracts various parameters from a speech waveform on the basis of a simulation of the human vocal mechanism, and compresses the amount of information by transmitting those parameters. Among them, the pitch parameter, which is the fundamental frequency component of speech and corresponds to the vibration of the vocal cords, has a significant effect on the quality of the synthesized speech. A pitch extraction device is therefore required to have a low pitch extraction error rate.

Various pitch extraction methods have been proposed. The most common approach uses the autocorrelation function of the speech waveform. Because the autocorrelation function requires a large amount of computation, a number of simplified correlation methods have been proposed.

One such method, with the configuration shown in Fig. 1, is advantageous in terms of computation time. The clipper calculation device 1 applies a clipper operation to the speech waveform $x(n)$ to obtain the waveform $y_1(n)$, and the threshold calculation device 2 converts $x(n)$ into the three-valued waveform $y_2(n)$, which takes the values -1, 0, and +1. The correlator 3 computes the cross-correlation function of the two waveforms using one of the following equations (the function symbol is lost in the source text and is written $\phi(\tau)$ here):

$$\phi(\tau) = \sum_{n \in L} y_1(n-\tau)\, y_2(n) \qquad (1)$$

$$\phi(\tau) = \sum_{n \in L} y_1(n)\, y_2(n-\tau) \qquad (2)$$

where $L$ is the entire range of the frame and $\tau \ge 0$. The maximum value position detection device 4 then finds the $\tau$ that maximizes $\phi(\tau)$, which gives the pitch period $P$. Because $y_2(n)$ is three-valued, no multiplications are needed in the correlation calculation.
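The multiplication-free property can be illustrated with a short Python sketch. It evaluates equations (1) and (2) for a single lag by adding or subtracting samples of the clipped waveform according to the sign of the three-valued waveform. The function names and the handling of frame edges (terms whose shifted index falls before the start of the frame are skipped) are assumptions made for illustration; the source does not specify them.

```python
def xcorr_eq1(y1, y2, tau):
    """Equation (1): sum over n of y1[n - tau] * y2[n], using only add/subtract."""
    total = 0
    # Assumption: terms with n - tau < 0 lie outside the frame and are skipped.
    for n in range(tau, len(y1)):
        if y2[n] > 0:
            total += y1[n - tau]
        elif y2[n] < 0:
            total -= y1[n - tau]
        # y2[n] == 0 contributes nothing.
    return total


def xcorr_eq2(y1, y2, tau):
    """Equation (2): sum over n of y1[n] * y2[n - tau], using only add/subtract."""
    total = 0
    for n in range(tau, len(y1)):
        sign = y2[n - tau]
        if sign > 0:
            total += y1[n]
        elif sign < 0:
            total -= y1[n]
    return total
```

The two functions differ only in which waveform is shifted, which is the choice the method described later makes per frame.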

With this method, however, there is no clear rule for choosing between equation (1) and equation (2), so one of them is fixed in advance. For example, applying the two nonlinear transformations above (the clipper operation and the threshold operation) to the speech waveform $x(n)$ shown in Fig. 2a gives the waveforms $y_1(n)$ and $y_2(n)$ shown in Fig. 2b and 2c. Computing the cross-correlation function with equation (2) then gives the result in Fig. 2d, in which a correlation peak 12 is larger than the correlation peak 11 that corresponds to the actual pitch period. Peak 12 may be mistakenly extracted as the pitch period, which degrades pitch extraction accuracy.

Purpose of the Invention

The present invention solves the above problem by selecting between equations (1) and (2) of the correlation calculation according to the amplitude information of the speech waveform in the frame being analyzed, and thereby provides a pitch extraction method with a lower pitch extraction error rate.

Structure of the Invention

The present invention is a pitch extraction method that applies the clipper operation and the threshold operation to the speech waveform $x(n)$ to obtain $y_1(n)$ and $y_2(n)$, determines whether the maximum value of the speech waveform lies in the first half or the second half of the frame, and adaptively selects equation (1) or equation (2) on the basis of the resulting position information.

Description of Embodiments

An embodiment of the present invention is described below with reference to Fig. 3. The clipper calculation device 1 applies the clipper operation to the speech waveform $x(n)$ to obtain the clipped waveform $y_1(n)$. At the same time, $x(n)$ is passed through the threshold calculation device 2 to obtain the three-valued waveform $y_2(n)$, which takes the values -1, 0, and 1. Expressed as formulas, $y_1(n)$ and $y_2(n)$ are as follows.

$$y_1(n) = \begin{cases} x(n) - C_L, & x(n) \ge C_L \\ 0, & |x(n)| < C_L \\ x(n) + C_L, & x(n) \le -C_L \end{cases}$$

$$y_2(n) = \begin{cases} 1, & x(n) \ge C_L \\ 0, & |x(n)| < C_L \\ -1, & x(n) \le -C_L \end{cases}$$

where $C_L$ is a threshold value.
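As a minimal sketch, the two nonlinear transformations can be written in Python as below. The function names `clip` and `ternarize` and the parameter name `cl` (standing for the threshold) are illustrative choices; how the threshold is chosen is not stated in the source, so it is simply passed in by the caller.

```python
def clip(frame, cl):
    """Clipper operation: shrink each sample toward zero by cl; small samples become 0."""
    out = []
    for v in frame:
        if v >= cl:
            out.append(v - cl)
        elif v <= -cl:
            out.append(v + cl)
        else:               # |v| < cl
            out.append(0)
    return out


def ternarize(frame, cl):
    """Threshold operation: map each sample to +1, 0, or -1."""
    out = []
    for v in frame:
        if v >= cl:
            out.append(1)
        elif v <= -cl:
            out.append(-1)
        else:               # |v| < cl
            out.append(0)
    return out
```

For example, with a threshold of 2 the frame `[5, 1, -4, 2, -2]` clips to `[3, 0, -2, 0, 0]` and ternarizes to `[1, 0, -1, 1, -1]`.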

The processing up to this point is the same as in the conventional example of Fig. 1. The maximum value position detection device 8 detects the $n$ at which the absolute value of the speech waveform $x(n)$ is largest within the frame, and the determination device 9 produces a signal $C$ that indicates whether this $n$ lies in the first half or the second half of the frame.
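The detection and decision step might look like the following Python sketch. The name `peak_half`, the string values standing in for the signal C, the boundary between the halves (index `len(frame) // 2`), and the choice of the earliest sample when several share the largest absolute value are all assumptions; the source does not specify them.

```python
def peak_half(frame):
    """Return "first" or "second" depending on where the largest |x(n)| lies."""
    # max() returns the first index with the largest absolute value on ties.
    peak_index = max(range(len(frame)), key=lambda n: abs(frame[n]))
    return "first" if peak_index < len(frame) // 2 else "second"
```

Here `peak_half([1, -9, 2, 3])` gives `"first"`, while `peak_half([1, 2, 3, -9])` gives `"second"`.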

The bidirectional correlator 7 receives the signals $y_1(n)$, $y_2(n)$, and $C$ obtained by processing the speech waveform $x(n)$ as described above. When $C$ indicates that the maximum of $x(n)$ lies in the first half of the frame, the correlator computes the cross-correlation function of $y_1(n)$ and $y_2(n)$ according to equation (1); when $C$ indicates that the maximum lies in the second half of the frame, it uses equation (2). In both cases $L$ denotes the entire range of the frame.

Finally, the maximum value position detection device 10 finds the $\tau$ that maximizes the cross-correlation function and outputs it as the pitch period $P$, which completes the pitch extraction operation.
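Taken together, the steps of the embodiment can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the function names are hypothetical, terms whose shifted index falls before the start of the frame are skipped, and the search is restricted to lags between `min_lag` and `max_lag`. That restriction is added here because the correlation at lag 0 is never smaller than at any other lag, while the source states only that the lag is non-negative.

```python
def clip(frame, cl):
    # Clipper operation on one frame.
    return [v - cl if v >= cl else v + cl if v <= -cl else 0 for v in frame]


def ternarize(frame, cl):
    # Threshold operation: +1, 0, or -1 per sample.
    return [1 if v >= cl else -1 if v <= -cl else 0 for v in frame]


def extract_pitch(frame, cl, min_lag, max_lag):
    """Return (pitch_period, peak_in_first_half) for one frame."""
    y1, y2 = clip(frame, cl), ternarize(frame, cl)
    peak = max(range(len(frame)), key=lambda n: abs(frame[n]))
    first_half = peak < len(frame) // 2          # plays the role of signal C

    best_lag, best_val = None, None
    for tau in range(min_lag, max_lag + 1):      # assumed search range
        if first_half:                           # equation (1)
            val = sum(y1[n - tau] * y2[n] for n in range(tau, len(frame)))
        else:                                    # equation (2)
            val = sum(y1[n] * y2[n - tau] for n in range(tau, len(frame)))
        if best_val is None or val > best_val:
            best_lag, best_val = tau, val
    return best_lag, first_half


# Synthetic 64-sample frame: decaying pulses every 8 samples, plus a small
# negative excursion halfway between pulses.
frame = [0] * 64
for k in range(8):
    frame[8 * k] = 10 - k
    frame[8 * k + 4] = -3

print(extract_pitch(frame, cl=2, min_lag=2, max_lag=20))  # (8, True)
```

The products are written out for readability; because the second factor is always -1, 0, or +1, they can be replaced by additions and subtractions.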

Fig. 4 shows the waveforms obtained when the speech waveform $x(n)$ of Fig. 2a is processed according to the present invention. Fig. 4a, 4b, and 4c are identical to Fig. 2a, 2b, and 2c, respectively. Fig. 4b shows the clipped waveform $y_1(n)$ obtained from the speech waveform $x(n)$ of Fig. 4a by the clipper calculation device 1, and Fig. 4c shows the waveform $y_2(n)$ obtained by converting the same $x(n)$ to three values with the threshold calculation device 2.

In the waveform of Fig. 4a, the maximum value of the speech waveform $x(n)$ within the frame lies in the first half. The signal $C$ obtained from the maximum value position detection device 8 and the determination device 9 therefore indicates that the maximum is in the first half of the frame, and the bidirectional correlator 7 computes the cross-correlation function of $y_1(n)$ and $y_2(n)$ according to equation (1). The resulting cross-correlation function has the waveform shown in Fig. 4d. The correlation peak 13 appears at the $\tau$ corresponding to the actual pitch period, so the pitch period is extracted correctly.

Effects of the Invention

As described above, the present invention is a pitch extraction method in which a speech waveform is divided into a plurality of sections and pitch is extracted for each section. Within each section the speech waveform is nonlinearly transformed by the clipper operation and the threshold operation, and the shift direction used in computing the cross-correlation function is controlled by the position of the maximum amplitude of the speech waveform in that section. The pitch period is thus extracted correctly at the $\tau$ corresponding to the actual pitch period, and pitch extraction accuracy can be greatly improved.

Brief Description of the Drawings

Fig. 1 is a block diagram illustrating an example of a conventional pitch extraction device. Fig. 2a to 2d are waveform diagrams illustrating the processing steps by which the conventional pitch extraction device extracts the pitch period from an actual speech signal. Fig. 3 is a block diagram showing an embodiment of the device used in the pitch extraction method according to the present invention. Fig. 4a to 4d are waveform diagrams illustrating the processing steps by which the device of Fig. 3 extracts the pitch period from an actual speech waveform.

Reference numerals: 1, clipper calculation device; 2, threshold calculation device; 3, correlator; 4, maximum value position detection device; 7, bidirectional correlator; 8, maximum value position detection device; 9, determination device; 10, maximum value position detection device.

Claims (1)

1. A pitch extraction method characterized in that a speech waveform is divided into a plurality of sections; the speech waveform of each section is nonlinearly transformed by the clipper formula

$$y_1(n) = \begin{cases} x(n) - C_L, & x(n) \ge C_L \\ 0, & |x(n)| < C_L \\ x(n) + C_L, & x(n) \le -C_L \end{cases}$$

and the threshold formula

$$y_2(n) = \begin{cases} 1, & x(n) \ge C_L \\ 0, & |x(n)| < C_L \\ -1, & x(n) \le -C_L \end{cases}$$

(where $C_L$ is a threshold value and $x(n)$ is the speech waveform within the section); when the maximum value of $x(n)$ lies in the first half of the section, the cross-correlation function

$$\phi(\tau) = \sum_{n \in L} y_1(n-\tau)\, y_2(n)$$

is computed; when the maximum value of $x(n)$ lies in the second half of the section, the cross-correlation function

$$\phi(\tau) = \sum_{n \in L} y_2(n-\tau)\, y_1(n)$$

is computed (where $L$ denotes the entire analysis section and $\tau \ge 0$ is a time interval on the waveform); and the pitch is extracted from the maximum value of this cross-correlation function. [The function symbol is lost in the source text and is written $\phi(\tau)$ here.]
JP18710582A 1982-10-25 1982-10-25 Extraction of pitch Granted JPS5975297A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP18710582A JPS5975297A (en) 1982-10-25 1982-10-25 Extraction of pitch

Publications (2)

Publication Number Publication Date
JPS5975297A (en) 1984-04-27
JPH0377520B2 (en) 1991-12-10

Family

ID=16200185

Family Applications (1)

Application Number Title Priority Date Filing Date
JP18710582A Granted JPS5975297A (en) 1982-10-25 1982-10-25 Extraction of pitch

Country Status (1)

Country Link
JP (1) JPS5975297A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS5570900A (en) * 1978-11-24 1980-05-28 Nippon Electric Co Voice analyzer

Also Published As

Publication number Publication date
JPS5975297A (en) 1984-04-27

Similar Documents

Publication Publication Date Title
US5473759A (en) Sound analysis and resynthesis using correlograms
JP2006510017A (en) Signal separation
US6502067B1 (en) Method and apparatus for processing noisy sound signals
US5452398A (en) Speech analysis method and device for suppyling data to synthesize speech with diminished spectral distortion at the time of pitch change
CN101030374B (en) Method and apparatus for extracting base sound period
JP5325130B2 (en) LPC analysis device, LPC analysis method, speech analysis / synthesis device, speech analysis / synthesis method, and program
JPS62229200A (en) Pitch detector
JPH0377520B2 (en)
JPH04288600A (en) Extracting method for pitch frequency difference feature quantity
EP0162585B1 (en) Encoder capable of removing interaction between adjacent frames
JPH058839B2 (en)
JPH0114599B2 (en)
JP4313740B2 (en) Reverberation removal method, program, and recording medium
JPH0122638B2 (en)
JP3233543B2 (en) Method and apparatus for extracting impulse drive point and pitch waveform
JPS61190400A (en) Enunciation speed estimate apparatus
JP2898637B2 (en) Audio signal analysis method
JP4378098B2 (en) Sound source selection apparatus and method
JP2629762B2 (en) Pitch extraction device
JPH0636157B2 (en) Band division type vocoder
JPH0736119B2 (en) Piecewise optimal function approximation method
JPH0754438B2 (en) Voice processor
JPH0448239B2 (en)
JPS63108400A (en) Voice encoder
JPS60238900A (en) Fundamental frequency pattern extraction system