JPH0378636B2 - - Google Patents

Info

Publication number
JPH0378636B2
JPH0378636B2 JP57076112A JP7611282A JPH0378636B2 JP H0378636 B2 JPH0378636 B2 JP H0378636B2 JP 57076112 A JP57076112 A JP 57076112A JP 7611282 A JP7611282 A JP 7611282A JP H0378636 B2 JPH0378636 B2 JP H0378636B2
Authority
JP
Japan
Prior art keywords
pitch
voiced
unvoiced
period
search range
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
JP57076112A
Other languages
Japanese (ja)
Other versions
JPS58193597A (en)
Inventor
Satoru Taguchi
Masanori Kobayashi
Takayuki Ishikawa
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
Nippon Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Electric Co Ltd
Priority to JP7611282A
Publication of JPS58193597A
Publication of JPH0378636B2
Granted

Description

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to a pitch extraction device that extracts pitch from autocorrelation coefficients obtained by analyzing a speech waveform at a frame period on the order of the pitch period. In particular, it relates to a pitch extraction device that can greatly reduce pitch extraction errors in continuous voiced segments, which are perceptually important.

The voiced portions of a speech waveform are periodic, and the way their period (the pitch period) changes over time is known to be an important parameter in speech analysis and synthesis, speech recognition, and related tasks. In a speech analysis-synthesis system, for example, the pitch extracted by the analysis section strongly affects the quality of the speech produced by the synthesis section.

Various methods that use different analysis parameters have been known for extracting the pitch period of a speech waveform, such as computing autocorrelation coefficients for each frame whose length is on the order of the pitch period and extracting the pitch from them.

Pitch extraction based on autocorrelation coefficients is widely used for two reasons: the coefficients can be computed entirely in the time domain, and the result is relatively insensitive to the phase of the analyzed waveform with respect to the frame. However, such methods often mistake an integer multiple of the pitch period, or N1/N2 times the pitch period, for the pitch period itself (where N1 and N2 are integers and N1 < N2). The problem of detecting an integer multiple of the true pitch period as the pitch period (hereinafter the "integer-multiple pitch period error") is described in detail in Japanese Unexamined Patent Application Publication No. Sho 54-139307, "Pitch Extraction Device", an application proposed by the present inventor.
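The integer-multiple error can be reproduced with a few lines of Python. The sketch below is only an illustration: the test signal, its 40-sample period, and the helper name `autocorr` are assumptions made for the example and do not come from the patent. For a periodic signal, the unnormalized autocorrelation takes essentially the same value at the true period and at its multiples, so an unrestricted maximum search has no basis for preferring the true period.

```python
import math

def autocorr(x, lag, n):
    """Unnormalized autocorrelation: sum of x[i] * x[i + lag] for i in 0..n-1."""
    return sum(x[i] * x[i + lag] for i in range(n))

# Hypothetical test signal: fundamental plus second harmonic, 40-sample period.
PERIOD = 40
x = [math.sin(2 * math.pi * i / PERIOD) + 0.5 * math.sin(4 * math.pi * i / PERIOD)
     for i in range(240)]

N = 120  # number of products per lag; 120 + largest lag (120) fits in 240 samples
r = {lag: autocorr(x, lag, N) for lag in (PERIOD // 2, PERIOD, 2 * PERIOD, 3 * PERIOD)}

# The values at lags 40, 80 and 120 agree up to rounding error,
# while the value at the half-period lag 20 is clearly lower.
for lag, value in sorted(r.items()):
    print(lag, round(value, 6))
```

Because the peaks at 40, 80, and 120 samples are indistinguishable here, restricting the search to a range around the expected pitch period is what removes the ambiguity.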

Many autocorrelation-based pitch extraction devices limit the pitch search range to the neighborhood of the pitch period in order to mitigate integer-multiple pitch period errors and similar errors. Conventional devices of this kind detect voiced segments with a single voiced/unvoiced discrimination means and determine the pitch search range from the pitch periods of past voiced segments. Voiced sounds, however, include sounds such as voiced plosives for which the extracted pitch period behaves randomly. Because the conventional method determines the pitch search range from the pitch period data of past voiced segments, voiced plosive segments can lead it to choose an inappropriate search range, which causes pitch extraction errors.

An object of the present invention is to provide a pitch extraction device capable of stable pitch detection by removing the data originating from voiced plosives and similar sounds from the past pitch period data used to determine the pitch search range.

The feature of the present invention is that, independently of the ordinary voiced/unvoiced classification of the speech, a separate voiced/unvoiced classification is made for the purpose of determining the pitch search range. That is, the voiced segments used to determine the pitch search range are designated by an independent voiced/unvoiced discrimination means. This removes data that are unsuitable for determining the search range, such as data from voiced plosives, and makes more accurate pitch extraction possible.

An embodiment of the present invention is described next with reference to the drawing.

The figure is a block diagram illustrating an embodiment of the present invention. The input speech signal is fed to an A/D converter 1, which samples it at, for example, 8 kHz and quantizes each sample. The sampled and quantized speech signal is written into a buffer memory 2. Every frame period, for example every 20 ms, the buffer memory 2 outputs the speech samples of one analysis window, for example 30 ms, to a voiced/unvoiced discriminator 3, an autocorrelation coefficient measuring device 4, and a voiced/unvoiced discriminator 5. The voiced/unvoiced discriminator 3 performs the ordinary voiced/unvoiced decision for the speech itself: it judges whether the speech waveform supplied from the buffer memory 2 in each frame period is voiced or unvoiced and outputs the result to a pitch searcher 6. The autocorrelation coefficient measuring device 4 measures the autocorrelation coefficient sequence of the speech waveform supplied from the buffer memory 2 in each frame period over the delay range corresponding to the distribution range of pitch periods, for example 2.5 ms to 15 ms. The autocorrelation coefficient ρ_τ at delay τ measured by the autocorrelation coefficient measuring device 4 is given by [equation not reproduced in this text], where x(i) is the i-th speech sample in the analysis frame and N is a natural number, for example the number of samples corresponding to 15 ms (120 at 8 kHz sampling). The measured autocorrelation coefficient sequence is supplied to the pitch searcher 6.
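The framing and lag range described above can be sketched as follows, using the example values from the text (8 kHz sampling, 20 ms frame period, 30 ms window, 2.5 ms to 15 ms lag range). This is a minimal sketch under stated assumptions: the function names are hypothetical, the unnormalized sum of products stands in for the patent's coefficient definition (which is not legible in this text), and N is taken as the window length minus the largest lag so that every product stays inside the window.

```python
import math

FS = 8000                         # sampling rate in Hz (example value from the text)
FRAME_PERIOD = FS * 20 // 1000    # 20 ms frame period  -> 160 samples
WINDOW_LEN = FS * 30 // 1000      # 30 ms analysis window -> 240 samples
MIN_LAG = FS * 25 // 10000        # 2.5 ms -> 20 samples
MAX_LAG = FS * 15 // 1000         # 15 ms  -> 120 samples
N = WINDOW_LEN - MAX_LAG          # assumption: 120 products per lag

def frames(samples):
    """Yield one analysis window per frame period (buffer memory 2, sketched)."""
    for start in range(0, len(samples) - WINDOW_LEN + 1, FRAME_PERIOD):
        yield samples[start:start + WINDOW_LEN]

def autocorr_sequence(frame):
    """Map each lag in the pitch range to sum(x[i] * x[i + lag]), i = 0..N-1."""
    return {lag: sum(frame[i] * frame[i + lag] for i in range(N))
            for lag in range(MIN_LAG, MAX_LAG + 1)}

# Usage: a synthetic 200 Hz sine (40-sample period) stands in for speech.
signal = [math.sin(2 * math.pi * i / 40) for i in range(1000)]
all_frames = list(frames(signal))
rho = autocorr_sequence(all_frames[0])
```

With these values, 15 ms at 8 kHz is 0.015 × 8000 = 120 samples, which is why N and the largest lag together exactly fill the 240-sample window.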

The voiced/unvoiced discriminator 5 performs a voiced/unvoiced decision on the speech waveform supplied from the buffer memory 2 in each frame period, solely for the purpose of determining the pitch search range. For example, it judges a frame voiced if the power of the waveform supplied in that frame period is greater than a predetermined threshold and unvoiced if it is smaller. Its decision is output to a switch 7 as a control signal. The threshold exploits the fact that most steady-state voiced portions have greater power than unvoiced sounds, silence, and voiced plosives; it is set below the average power of steady-state voiced portions and above the average power of voiced plosives. The voiced/unvoiced discriminator 5 therefore distinguishes steady-state voiced portions from unvoiced sounds, silence, and voiced plosives.

The pitch searcher 6 searches for the maximum of the autocorrelation coefficient sequence supplied from the autocorrelation coefficient measuring device 4 within the pitch search range and obtains the delay time τ_P corresponding to that maximum. The pitch search range is supplied from a pitch search range measuring device 9. If the decision supplied from the voiced/unvoiced discriminator 3 is voiced, the pitch searcher 6 outputs τ_P to a pitch output terminal 10 and to the switch 7; if it is unvoiced, it replaces τ_P with "0" and outputs that to the pitch output terminal 10 and the switch 7. In other words, the voiced/unvoiced discriminator 3 identifies voiced portions including voiced plosives, and, as mentioned above, a well-known method suffices for this decision.
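The power-threshold decision of discriminator 5 and the maximum search of pitch searcher 6 can be sketched as below. The function names, the dictionary representation of the autocorrelation sequence, and the sample values are assumptions for illustration; the patent does not specify a threshold value, and the behavior when the power equals the threshold is left open in the text (this sketch treats it as unvoiced).

```python
def frame_power(frame):
    """Mean squared amplitude of one analysis frame."""
    return sum(s * s for s in frame) / len(frame)

def is_steady_voiced(frame, threshold):
    """Discriminator 5, sketched: voiced only if the frame power exceeds the threshold."""
    return frame_power(frame) > threshold

def search_pitch(rho, search_range, voiced):
    """Pitch searcher 6, sketched.

    rho maps lag -> autocorrelation value. Returns the lag of the largest value
    inside search_range (inclusive), or 0 when the frame is judged unvoiced.
    Raises ValueError if no lag falls inside the range.
    """
    lo, hi = search_range
    tau_p = max((lag for lag in rho if lo <= lag <= hi), key=lambda lag: rho[lag])
    return tau_p if voiced else 0

# Usage with made-up values: the peak at lag 80 is an integer-multiple peak.
example_rho = {20: 0.1, 40: 0.90, 80: 0.95}
narrow = search_pitch(example_rho, (30, 50), voiced=True)   # restricted range
wide = search_pitch(example_rho, (20, 120), voiced=True)    # unrestricted range
silent = search_pitch(example_rho, (30, 50), voiced=False)  # unvoiced frame
```

In this made-up case the restricted range picks lag 40 while the unrestricted range picks the multiple at lag 80, which is the error the search range is meant to prevent.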

The switch 7 selects between τ_P and "0" according to the control signal supplied from the voiced/unvoiced discriminator 5 and outputs the selected value to a temporary memory 8. It outputs τ_P to the temporary memory 8 when the control signal indicates voiced and "0" when it indicates unvoiced. The temporary memory 8 stores the τ_P or "0" supplied from the switch 7 sequentially, one value per frame period, and outputs the stored pitch periods τ_P of past frames to the pitch search range measuring device 9 every frame period.
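The gating performed by the switch 7 and the storage in the temporary memory 8 amount to a small history buffer. The sketch below assumes a fixed history depth of five frames and the class name `PitchHistory`; neither appears in the patent, which does not state how many past frames are kept.

```python
from collections import deque

class PitchHistory:
    """Switch 7 plus temporary memory 8, sketched as a bounded history buffer."""

    def __init__(self, depth=5):          # depth is a hypothetical choice
        self.buf = deque(maxlen=depth)    # oldest entries drop out automatically

    def push(self, tau_p, steady_voiced):
        """Store tau_p if discriminator 5 judged the frame voiced, otherwise 0."""
        self.buf.append(tau_p if steady_voiced else 0)

    def past_periods(self):
        """Return the stored values, oldest first."""
        return list(self.buf)

# Usage: the middle frame models a voiced plosive. Discriminator 3 calls it
# voiced, so the searcher outputs a (random-looking) period of 73, but
# discriminator 5 rejects it on power, so 0 is stored instead.
history = PitchHistory()
history.push(40, steady_voiced=True)
history.push(73, steady_voiced=False)
history.push(41, steady_voiced=True)
```

Storing 0 rather than skipping the frame keeps the history aligned with frame time, which matches the text's description of one stored value per frame period.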

The pitch search range measuring device 9 determines the pitch search range from the pitch periods of past frames and outputs the result to the pitch searcher 6. The search range is determined by a conventional method, for example the methods given in the above-mentioned Japanese Unexamined Patent Application Publication No. Sho 54-139307 or in Japanese Unexamined Patent Application Publication No. Sho 56-42296.
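The text defers to the cited publications for the actual range rule, so the following is only an illustrative stand-in and not the method of those publications. It assumes that stored zeros mark frames to be ignored, that the range is centered on the median of the remaining periods, and that a ±25% tolerance and a 20 to 120 sample fallback range (2.5 ms to 15 ms at 8 kHz) are acceptable; all of these are assumptions.

```python
import statistics

def search_range(past_periods, full_range=(20, 120), tolerance=0.25):
    """Hypothetical range rule: a band around the median of past nonzero periods.

    past_periods: stored periods in samples, with 0 marking rejected frames.
    full_range:   fallback and clamp limits in samples (assumed values).
    tolerance:    half-width of the band as a fraction of the median (assumed).
    """
    valid = [p for p in past_periods if p > 0]
    if not valid:
        return full_range                  # no usable history: search everything
    center = statistics.median(valid)
    lo = max(full_range[0], int(center * (1 - tolerance)))
    hi = min(full_range[1], int(center * (1 + tolerance)) + 1)
    return lo, hi

# Usage: the zero left by a rejected voiced-plosive frame does not shift the band.
band = search_range([40, 0, 41])
fallback = search_range([0, 0])
```

Had the plosive frame's period been kept in the history, it could have pulled the band away from the true pitch; excluding it is the effect the invention aims for.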

Note that the formula for ρ_τ in the autocorrelation coefficient measuring device 4 may be simplified, for example, to [first expression not reproduced in this text] or to $\rho_\tau = \sum_{i=1}^{N} x(i)\,x(i+\tau)$.

As described above, according to the present invention, the influence of sounds whose pitch period behaves randomly, such as voiced plosives, can be removed when the pitch search range is determined, and highly accurate pitch extraction becomes possible.

BRIEF DESCRIPTION OF THE DRAWINGS

The figure is a block diagram illustrating an embodiment of the present invention.

1: A/D converter; 2: buffer memory; 3: voiced/unvoiced discriminator (first); 4: autocorrelation coefficient measuring device; 5: voiced/unvoiced discriminator (second); 6: pitch searcher; 7: switch; 8: temporary memory; 9: pitch search range measuring device; 10: pitch output terminal.

Claims (1)

1. A pitch extraction device that determines the voiced portion of an input speech signal and extracts a pitch period from an autocorrelation coefficient sequence obtained for the signal of that voiced portion, characterized by comprising: first means for determining a voiced plosive portion from the input speech signal; second means for removing, from the pitch periods obtained from the signal of the voiced portion, the pitch periods obtained from the signal of the voiced plosive portion; and third means for determining, on the basis of the pitch periods obtained by the second means, the pitch search range used when extracting the pitch period.
JP7611282A 1982-05-07 1982-05-07 Pitch extractor Granted JPS58193597A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP7611282A JPS58193597A (en) 1982-05-07 1982-05-07 Pitch extractor

Publications (2)

Publication Number Publication Date
JPS58193597A JPS58193597A (en) 1983-11-11
JPH0378636B2 JPH0378636B2 (en) 1991-12-16

Family

ID=13595807

Family Applications (1)

Application Number Title Priority Date Filing Date
JP7611282A Granted JPS58193597A (en) 1982-05-07 1982-05-07 Pitch extractor

Country Status (1)

Country Link
JP (1) JPS58193597A (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS63271572A (en) * 1987-04-28 1988-11-09 Sharp Corp Calculation system for autocorrelation coefficient

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS54139307A (en) * 1978-04-20 1979-10-29 Nec Corp Pitch extraction unit
JPS5642296A (en) * 1979-09-17 1981-04-20 Nippon Electric Co Pitch extractor

Also Published As

Publication number Publication date
JPS58193597A (en) 1983-11-11
