JPH0439680B2 - - Google Patents

Info

Publication number
JPH0439680B2
Authority
JP
Japan
Prior art keywords
cepstrum
voiced
discrimination
sound
spectrum
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
JP60119685A
Other languages
Japanese (ja)
Other versions
JPS61278000A (en)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed
Priority to JP60119685A
Publication of JPS61278000A
Publication of JPH0439680B2
Granted

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of Industrial Application]

The present invention relates to a voiced/unvoiced sound discriminator for a speech analysis device that analyzes speech by the cepstrum method.

[Prior Art]

In general, the characteristics of speech are represented by its frequency spectrum, that is, by the distribution of the frequency components of the speech signal shown in FIG. 5A. The parameters used to characterize speech are therefore physical quantities that express the spectrum in some form. The cepstrum is a set of parameters obtained by a cosine expansion of the log spectrum and is generally expressed by equation (1).

$$S(k) = \sum_{m=0}^{M} C(m) \cos\left(\frac{2\pi}{N} k m\right) \qquad (1)$$

where S(k) is the log spectrum, k is the frequency, and C(m) is the cepstrum.

FIG. 5B shows the speech signal of FIG. 5A as a log spectrum obtained by Fourier analysis. In equation (1), for example, the zeroth-order cepstral term C(0) is the mean of the log spectrum S(k), as shown in FIG. 6A, and C(1) is the first-order cosine component of S(k). That is, the log spectrum S(k) is expressed as the sum of the components of each order, as shown in FIG. 6B.

Accordingly, voiced/unvoiced sound discriminators of this kind have made the decision by one of the following, or by a combination of them: thresholding the value of the zeroth-order cepstral term; thresholding the value of the first-order cepstral term; thresholding the sum of squares of the cepstral terms of each order, that is, the spectral variance; or thresholding the mean of a given spectral band, obtained by a sum-of-products calculation over the cepstral orders.

As explained on pages 70 to 72 of Niimi's "Speech Recognition" (新美康永著「音声認識」), the method based on the zeroth-order term exploits the fact that this term corresponds to the power of the speech, which is large for voiced sounds and small for unvoiced sounds. The method based on the first-order term exploits the fact that this term corresponds to the approximate slope of the spectrum (C(1) in FIG. 6B): in voiced sounds the power is concentrated in the low band, so this value becomes large. Either method can be implemented as a very simple device.
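The relationship between equation (1) and its zeroth-order term can be checked numerically. The following is a minimal sketch, not the analyzer described in the patent: it assumes NumPy, an FFT-based analysis of one even-length frame, and the natural log of the magnitude spectrum. The function names `cosine_cepstrum` and `reconstruct` are hypothetical.

```python
import numpy as np

def cosine_cepstrum(frame):
    """Return (S, C): the log-magnitude spectrum S(k), k = 0..N-1, and the
    coefficients C(m), m = 0..N/2, of its cosine expansion as in equation (1).
    Assumes an even frame length N (an assumption of this sketch)."""
    n = len(frame)
    S = np.log(np.abs(np.fft.fft(frame)) + 1e-12)  # small floor avoids log(0)
    c = np.fft.ifft(S).real        # real cepstrum, symmetric in its index
    C = c[: n // 2 + 1].copy()
    C[1 : n // 2] *= 2.0           # fold the mirrored half onto orders 1..N/2-1
    return S, C

def reconstruct(C, n, max_order=None):
    """Evaluate sum_{m=0}^{M} C(m) cos(2*pi*k*m/N) for k = 0..N-1."""
    M = len(C) - 1 if max_order is None else max_order
    k = np.arange(n)[:, None]
    m = np.arange(M + 1)[None, :]
    return (np.cos(2.0 * np.pi * k * m / n) * C[: M + 1]).sum(axis=1)

rng = np.random.default_rng(0)
frame = rng.standard_normal(256)   # stand-in for one frame of speech samples
S, C = cosine_cepstrum(frame)
```

With these definitions, `C[0]` agrees with `S.mean()` and `reconstruct(C, len(frame))` reproduces `S` to floating-point precision, which is the sense in which C(0) is the mean of the log spectrum and the higher orders add progressively finer cosine components.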

[Problems to Be Solved by the Invention]

Because conventional voiced/unvoiced sound discriminators were built as described above, they were simple but made many decision errors. As a result, a speech synthesizer using such a device suffered degraded synthesized speech quality, and a speech recognizer using it in its preprocessing stage suffered a degraded recognition rate. The method based on the spectral variance had the same problems. The method that uses the mean of a given spectral band, on the other hand, obtains the average power of the desired band as the sum of products of the cepstrum of the speech spectrum and the cepstrum of a weighting function defined over frequency. If the band is chosen to be about 100 to 1000 Hz, where power is concentrated in voiced sounds, discrimination errors become considerably fewer. Such a device, however, requires as many multiply-accumulate operations as there are cepstral orders, and therefore a comparatively large amount of computation.
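The cost of the band-average method can be illustrated with a short sketch. It assumes the cosine expansion of equation (1) and NumPy; the frame length, the cepstral order, the 8 kHz sampling rate implied by the bin choice, and the names `band_weights` and `band_mean_from_cepstrum` are all hypothetical and are not taken from the patent. The weights here are simply the band average of each cosine term, which is one way to realize the product-sum described above.

```python
import numpy as np

def band_weights(n, k_lo, k_hi, max_order):
    """w(m) = mean over k in [k_lo, k_hi] of cos(2*pi*k*m/N), m = 0..max_order.
    These weights depend only on the band, so they can be precomputed."""
    k = np.arange(k_lo, k_hi + 1)[:, None]
    m = np.arange(max_order + 1)[None, :]
    return np.cos(2.0 * np.pi * k * m / n).mean(axis=0)

def band_mean_from_cepstrum(C, weights):
    """Band mean of the log spectrum: one multiply-accumulate per cepstral order."""
    return float(np.dot(C, weights))

# Hypothetical setup: N = 256 bins, orders 0..20, and bins 4..32, which would
# span 125 Hz to 1000 Hz at an assumed 8 kHz sampling rate.
N, M = 256, 20
rng = np.random.default_rng(1)
C = rng.standard_normal(M + 1)     # arbitrary cepstral terms for the check

k = np.arange(N)[:, None]
m = np.arange(M + 1)[None, :]
S = (np.cos(2.0 * np.pi * k * m / N) * C).sum(axis=1)   # equation (1)

weights = band_weights(N, 4, 32, M)
direct = S[4:33].mean()                        # average taken on the spectrum
via_cepstrum = band_mean_from_cepstrum(C, weights)
```

The two results agree because averaging is linear in the cepstral terms. The point of the comparison is the cost: the product-sum needs M + 1 multiplications and additions per frame, whereas the low-order sum proposed in this patent needs only a few additions.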

This invention was made to solve the problems described above. Its object is to obtain a voiced/unvoiced sound discriminator that makes few discrimination errors with a small amount of computation, by providing an adder circuit that sums the low-order terms of the cepstrum and a threshold comparison circuit that compares the sum with a threshold.

[Means for Solving the Problems]

The voiced/unvoiced sound discriminator according to this invention comprises an adder circuit that calculates the sum of the low-order terms of the cepstral coefficients obtained from a cepstrum analyzer, and a threshold comparison circuit that compares the output of the adder circuit with a threshold. The input is judged to be a voiced sound if the sum is at or above the threshold and an unvoiced sound if it is below the threshold, which yields the voiced/unvoiced discrimination result.

[Operation]

In this invention, the discrimination parameter obtained by the adder circuit is compared with a fixed threshold, and the sound is judged to be voiced or unvoiced according to whether the parameter is larger or smaller than the threshold.

[Embodiment]

An embodiment of this invention is described below with reference to the drawings. FIG. 1 is a block diagram of the voiced/unvoiced sound discriminator. In the figure, 1 is the cepstrum of the speech obtained by an analyzer, 2 is an adder circuit that adds the terms of the cepstrum, 3 is the discrimination parameter, 4 is a threshold comparison circuit that compares the discrimination parameter 3 obtained by the adder circuit 2 with a fixed threshold, and 5 is the voiced/unvoiced discrimination result.

FIG. 2 is an explanatory diagram showing an example of the relationship between the speech spectrum and the discrimination parameter in the voiced/unvoiced sound discriminator of FIG. 1.

Next, the operation of this invention is described. First, the log spectrum S(k) of the speech is expressed in terms of the cepstrum C(m) (m = 0, 1, ..., M) by equation (2):

$$S(k) = \sum_{m=0}^{M} C(m) \cos\left(\frac{2\pi}{N} k m\right), \quad k = 0, 1, \ldots, N-1 \qquad (2)$$

This log spectrum S(k) is shown as the speech spectrum 11 in FIG. 2. If, on the other hand, only the very low orders m of the cepstrum are considered, equation (3) results:

$$S_0(k) = \sum_{m=0}^{M_0} C(m) \cos\left(\frac{2\pi}{N} k m\right) \qquad (3)$$

This gives the spectrum 12, smoothed in the sense of a cosine series expansion. The value of this spectrum at frequency 0, that is, the discrimination parameter 13, is given by equation (4):

$$P = \sum_{m=0}^{M_0} C(m) \qquad (4)$$

It can thus be expressed as a simple sum of cepstral terms, because every cosine factor in equation (3) equals 1 at k = 0. If the upper order $M_0$ is chosen to be about 3 to 4, P takes nearly the same value as the power of the low discrimination band 14, where power is concentrated in voiced sounds and which the conventional device obtained from the original speech spectrum 11 by a sum of products over all cepstral orders.

Accordingly, the adder circuit 2 in FIG. 1 calculates, as shown in FIG. 3, the sum of the first few low-order terms of the cepstrum 1 obtained by a cepstrum analyzer (not shown), and so obtains the discrimination parameter 3 represented by P in equation (4). The discrimination parameter 13 obtained by the adder circuit 2 is distributed as shown in FIG. 4, where X denotes unvoiced sounds and Y denotes voiced sounds. The threshold comparison circuit 4 then compares the parameter with the threshold $k_p$: a value at or above $k_p$ is judged to be a voiced sound and a value below $k_p$ an unvoiced sound, which yields the voiced/unvoiced discrimination result 5.
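The adder circuit and threshold comparison circuit described above can be sketched in a few lines of software. This is an illustration under stated assumptions, not the circuit of the embodiment: the function names, the example cepstra, and the threshold value 0.0 are all hypothetical, and a real threshold would have to be tuned on data such as the distributions of FIG. 4.

```python
import math
from typing import Sequence

def discrimination_parameter(cepstrum: Sequence[float], m0: int = 3) -> float:
    """P = C(0) + C(1) + ... + C(M0), as in equation (4). Additions only."""
    return float(sum(cepstrum[: m0 + 1]))

def smoothed_log_spectrum(cepstrum: Sequence[float], k: int, n: int, m0: int = 3) -> float:
    """S0(k) of equation (3): the cosine expansion truncated at order M0."""
    return sum(c * math.cos(2.0 * math.pi * k * m / n)
               for m, c in enumerate(cepstrum[: m0 + 1]))

def is_voiced(cepstrum: Sequence[float], threshold: float, m0: int = 3) -> bool:
    """Voiced if P is at or above the fixed threshold, unvoiced otherwise."""
    return discrimination_parameter(cepstrum, m0) >= threshold

# Made-up cepstra for illustration only (orders 0..4).
voiced_like = [2.0, 1.5, 0.6, 0.2, 0.05]
unvoiced_like = [-1.0, -0.8, 0.1, 0.0, 0.02]
K_P = 0.0  # hypothetical threshold

p_voiced = discrimination_parameter(voiced_like)      # sum of orders 0..3
p_unvoiced = discrimination_parameter(unvoiced_like)  # sum of orders 0..3
```

For these made-up inputs, `is_voiced(voiced_like, K_P)` is true and `is_voiced(unvoiced_like, K_P)` is false, and `smoothed_log_spectrum(voiced_like, 0, 256)` equals `p_voiced`, which mirrors the statement that P is the value of the smoothed spectrum at frequency 0.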

[Effect of the Invention]

As described above, according to this invention, the first few low-order terms of the cepstrum obtained by a cepstrum analyzer are fed to an adder circuit, and a threshold comparison circuit discriminates between voiced and unvoiced sounds. This gives a parameter with nearly the same performance as the conventional discrimination parameter that required a large amount of computation, and achieves a high discrimination rate that conventional devices could not provide.

[Brief Description of the Drawings]

FIG. 1 is a block diagram of a voiced/unvoiced sound discriminator according to an embodiment of this invention. FIG. 2 is an explanatory diagram of the speech spectrum and the parameter used for the decision in the discriminator of FIG. 1. FIG. 3 is an explanatory diagram of the adder circuit. FIG. 4 is an explanatory diagram of the threshold comparison circuit. FIG. 5 is an explanatory diagram of the conventional, general cepstrum. FIG. 6 shows the relationship between the low-order cepstral terms and the log spectrum.

In the figures, 1 is the cepstrum, 2 is the adder circuit, 3 is the discrimination parameter, 4 is the threshold comparison circuit, and 5 is the voiced/unvoiced discrimination result.

Claims (1)

[Claims]

1. A voiced/unvoiced sound discriminator comprising: an adder circuit that calculates the sum of the low-order terms of the cepstral coefficients obtained by a speech cepstrum analyzer; and a threshold comparison circuit that receives the discrimination parameter obtained from the output of the adder circuit, compares it with a threshold fixed in advance, and judges the sound to be voiced when the discrimination parameter is greater than the threshold and unvoiced when it is smaller.
JP60119685A 1985-06-04 1985-06-04 Voiced/voiceless sound discriminator Granted JPS61278000A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP60119685A JPS61278000A (en) 1985-06-04 1985-06-04 Voiced/voiceless sound discriminator

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP60119685A JPS61278000A (en) 1985-06-04 1985-06-04 Voiced/voiceless sound discriminator

Publications (2)

Publication Number Publication Date
JPS61278000A JPS61278000A (en) 1986-12-08
JPH0439680B2 true JPH0439680B2 (en) 1992-06-30

Family

ID=14767513

Family Applications (1)

Application Number Title Priority Date Filing Date
JP60119685A Granted JPS61278000A (en) 1985-06-04 1985-06-04 Voiced/voiceless sound discriminator

Country Status (1)

Country Link
JP (1) JPS61278000A (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2763322B2 (en) * 1989-03-13 1998-06-11 キヤノン株式会社 Audio processing method

Legal Events

Date Code Title Description
LAPS Cancellation because of no payment of annual fees