JP2564821B2 - Voice judgment detector - Google Patents

Voice judgment detector

Info

Publication number
JP2564821B2
JP2564821B2 JP62097779A JP9777987A JP2564821B2 JP 2564821 B2 JP2564821 B2 JP 2564821B2 JP 62097779 A JP62097779 A JP 62097779A JP 9777987 A JP9777987 A JP 9777987A JP 2564821 B2 JP2564821 B2 JP 2564821B2
Authority
JP
Japan
Prior art keywords
voice
coefficient
distance
inter
signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
JP62097779A
Other languages
Japanese (ja)
Other versions
JPS63262693A (en)
Inventor
敏雄 吉川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
Nippon Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Electric Co Ltd
Priority to JP62097779A
Publication of JPS63262693A
Application granted
Publication of JP2564821B2
Anticipated expiration
Expired - Lifetime

Description

[Detailed Description of the Invention]

[Industrial Field of Application]

The present invention relates to a device that judges and detects input voice, and in particular to a voice judgment detector that determines and detects the range over which input voice is present in a voice recognition device or the like.

[Prior Art]

As shown in FIG. 4, a conventional voice judgment detector for determining and detecting a voice section comprises a level extraction circuit 8 that extracts the level from input signal 1, which corresponds to the input sound; a threshold setting circuit 10; and a voice section detection circuit 12.

The threshold setting circuit 10 sets a level threshold based on the noise level at the time of voice input, the input voice level, and the like, and compares this threshold with the input level signal 9 sent from the level extraction circuit 8. When the input level signal 9 stays above the threshold for at least a predetermined time, the circuit judges that the start of the voice section has been reached. Thereafter, when the input level signal 9 stays below the threshold for at least a predetermined time, the circuit judges that the end of the voice section has been reached. It sends out a judgment signal 11 indicating the start and end of the voice section.

The voice section detection circuit 12 receives the start/end judgment signal 11 and sends out the voice detection result signal 7. The span from the start to the end determined in this way was detected as the voice section.
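The level-based scheme described above can be sketched in a few lines of Python. This is only an illustration of the start/end logic, not the circuit in FIG. 4: the function name `detect_segment`, the per-frame `levels` list, and the values of `threshold` and `min_frames` are assumptions, since the source gives no concrete numbers.

```python
def detect_segment(levels, threshold, min_frames):
    """Return (start, end) frame indices of the first voice section, or None.

    levels     : per-frame input level (plays the role of input level signal 9)
    threshold  : level threshold (hypothetical value, not from the source)
    min_frames : how many consecutive frames count as "a predetermined time"
    The returned end index is exclusive.
    """
    start = None
    run = 0
    for i, level in enumerate(levels):
        if start is None:
            # Looking for the start: count consecutive frames above threshold.
            run = run + 1 if level > threshold else 0
            if run >= min_frames:
                start = i - min_frames + 1
                run = 0
        else:
            # Looking for the end: count consecutive frames below threshold.
            run = run + 1 if level < threshold else 0
            if run >= min_frames:
                return start, i - min_frames + 1
    # Input ended while still inside a voice section (or none was found).
    return (start, len(levels)) if start is not None else None


if __name__ == "__main__":
    print(detect_segment([0, 0, 5, 5, 5, 5, 0, 0, 0, 1], threshold=2, min_frames=3))
```

Because the decision depends only on level, any loud enough noise passes the same test as speech, which is the weakness the patent goes on to address.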

[Problems to Be Solved by the Invention]

Because the conventional voice judgment detector described above relies on the power of the input voice, ambient noise mixed into the input easily affects it, and it has difficulty distinguishing the input voice from ambient noise.

[Means for Solving the Problems]

The voice judgment detector of the present invention comprises: a conversion circuit that converts the input signal into line spectrum pair coefficients for each fixed extraction section; an inter-coefficient distance judgment circuit that judges whether the distance between adjacent coefficients among the converted line spectrum pair coefficients is larger or smaller than a threshold; and a voiced sound judgment circuit that detects voice by judging whether the judgment result of the inter-coefficient distance judgment circuit has continued for at least a fixed time.

[Operation]

According to the present invention, the inter-coefficient distance of the line spectrum pair coefficients is used to judge whether the input signal is a voice signal, so voice can be judged and detected even in the presence of ambient noise.

[Embodiment]

An embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a block diagram showing an embodiment of the present invention, and FIGS. 2 and 3 are graphs explaining that embodiment. Input signal 1, which corresponds to the input sound, normally contains ambient noise. The line spectrum pair conversion circuit 2 converts input signal 1 into signal 3 of line spectrum pair (LSP) coefficients, which are frequency-domain parameters, using the line spectrum pair (Line Spectrum Pair, hereinafter LSP) method, a kind of linear predictive coding.
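The source does not say how conversion circuit 2 computes the LSP coefficients. As a hedged illustration only, the sketch below uses one common textbook construction: form the symmetric and antisymmetric polynomials from the linear prediction polynomial A(z) = 1 + a_1 z^-1 + ... + a_P z^-P, evaluate them on the unit circle, and locate their zeros by a grid search with bisection. The function name `lpc_to_lsp`, the `grid` parameter, and the assumption that the LPC coefficients are already available are all mine, not the patent's.

```python
import math


def lpc_to_lsp(a, grid=2048):
    """Return the sorted LSP frequencies in (0, pi) for LPC coefficients a_1..a_P.

    Sketch only: a zero that falls exactly on a grid point, or two zeros
    inside one grid cell, would be missed by this simple search.
    """
    p = len(a)
    n = p + 1
    ext = [1.0] + list(a) + [0.0]                        # a_0 .. a_{P+1}
    sym = [ext[k] + ext[n - k] for k in range(n + 1)]    # symmetric polynomial
    anti = [ext[k] - ext[n - k] for k in range(n + 1)]   # antisymmetric polynomial

    # On the unit circle each polynomial reduces to a real function of w
    # (up to a common phase factor), so zeros show up as sign changes.
    def f_sym(w):
        return sum(c * math.cos((n / 2 - k) * w) for k, c in enumerate(sym))

    def f_anti(w):
        return sum(c * math.sin((n / 2 - k) * w) for k, c in enumerate(anti))

    def zeros(f):
        found = []
        # Interior grid points only, which skips the trivial roots at 0 and pi.
        ws = [math.pi * i / grid for i in range(1, grid)]
        prev_w, prev_v = ws[0], f(ws[0])
        for w in ws[1:]:
            v = f(w)
            if prev_v * v < 0:
                lo, hi, f_lo = prev_w, w, prev_v
                for _ in range(50):                      # bisection refinement
                    mid = 0.5 * (lo + hi)
                    f_mid = f(mid)
                    if f_lo * f_mid <= 0:
                        hi = mid
                    else:
                        lo, f_lo = mid, f_mid
                found.append(0.5 * (lo + hi))
            prev_w, prev_v = w, v
        return found

    return sorted(zeros(f_sym) + zeros(f_anti))


if __name__ == "__main__":
    # A(z) = 1 (perfectly flat spectrum), analysis order 8.
    print(lpc_to_lsp([0.0] * 8))
```

For the flat case A(z) = 1 the eight values come out evenly spaced, which is a convenient sanity check for any implementation.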

For example, when the LSP coefficients are computed with an analysis order of 8, eight coefficients $w_1, w_2, w_3, \dots, w_8$ are obtained, as shown in FIGS. 2 and 3. The analysis uses a sampling frequency of 8 kHz, a bandwidth equal to the telephone band of 0.4 to 3.4 kHz, and an analysis frame period of 10 to 20 ms.
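To make these analysis parameters concrete, the following sketch converts the frame period to a sample count and an LSP value to hertz. The helper names are hypothetical, and the hertz conversion assumes that the LSP coefficients are normalized angular frequencies in which pi corresponds to half the sampling frequency; the source does not state the normalization explicitly.

```python
import math

SAMPLE_RATE_HZ = 8000  # sampling frequency stated in the text


def frame_length_samples(frame_ms, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of samples in one analysis frame of frame_ms milliseconds."""
    return round(sample_rate_hz * frame_ms / 1000)


def lsp_to_hz(w, sample_rate_hz=SAMPLE_RATE_HZ):
    """Convert a normalized angular frequency w (0..pi) to hertz."""
    return w * sample_rate_hz / (2 * math.pi)


if __name__ == "__main__":
    print(frame_length_samples(10), frame_length_samples(20))
    print(lsp_to_hz(math.pi / 2))
```

Under these assumptions a 10 to 20 ms frame holds 80 to 160 samples.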

LSP is explained in the article "Speech synthesis method using line spectrum frequency as a parameter and its LSI implementation," Nikkei Electronics, No. 257, published February 2, 1981, pp. 128-158.

The LSP coefficients $w_1$ to $w_P$ are frequency-domain parameters that tend to concentrate near the speech formant frequencies $F_1$ to $F_{P/2}$. The following relationship also holds among the LSP coefficients: $0 < w_1 < w_2 < \dots < w_{P-1} < w_P < \pi$, where $P$ is the analysis order.
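The ordering relationship is easy to verify in software before any distances are computed. The check below is a minimal sketch with a hypothetical function name; it only tests that a list of coefficients is strictly increasing and lies strictly between 0 and pi.

```python
import math


def is_valid_lsp(w):
    """True if 0 < w[0] < w[1] < ... < w[-1] < pi."""
    if not w:
        return False
    bounded = [0.0] + list(w) + [math.pi]
    # Every neighbour pair, including the 0 and pi bounds, must be increasing.
    return all(a < b for a, b in zip(bounded, bounded[1:]))


if __name__ == "__main__":
    print(is_valid_lsp([0.3, 0.5, 1.0, 2.9]))
    print(is_valid_lsp([0.5, 0.3, 1.0, 2.9]))
```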

Using this property, the inter-coefficient distance judgment circuit 4 computes, from the line spectrum pair coefficient signal 3, the distances $(w_2 - w_1)$ to $(w_8 - w_7)$ between adjacent LSP coefficients, as shown in FIG. 2.
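The distance computation itself is just a list of adjacent differences. A minimal sketch follows; the function name is hypothetical and the sample values are made up for illustration.

```python
def inter_coefficient_distances(w):
    """Return [w_2 - w_1, w_3 - w_2, ..., w_P - w_(P-1)] for LSP coefficients w."""
    return [b - a for a, b in zip(w, w[1:])]


if __name__ == "__main__":
    # Eight coefficients give seven distances.
    print(inter_coefficient_distances([0.25, 0.5, 1.0, 2.0]))
```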

An example of how the inter-coefficient distance $(w_n - w_{n-1})$ is evaluated follows. When the LSP analysis order is $P$, the following expressions are evaluated for $n = 2, 3, \dots, P$:

$(w_n - w_{n-1}) < w_{TH1}$ ...... (1)

$(w_n - w_{n-1}) < w_{TH2}$ ...... (2)

Here $w_{TH1}$ and $w_{TH2}$ are thresholds for the LSP inter-coefficient distance $(w_n - w_{n-1})$, set so that $w_{TH1} < w_{TH2}$.

If at least one of the LSP coefficients $w_1$ to $w_P$ satisfies expression (1) and at least two satisfy expression (2), the inter-coefficient distance judgment result signal 5 indicates a voiced sound. The voiced sound judgment circuit 6 then outputs the voice detection result signal 7 when signal 5 indicates a voiced sound continuously, for example for three consecutive frames.

FIG. 2 shows the relationship between the frequency spectrum and the LSP coefficients for a voiced sound in the embodiment of FIG. 1, and FIG. 3 shows the same relationship for an unvoiced sound or ambient noise.

As FIG. 2 shows, for a voiced sound the LSP coefficients $w_1$ to $w_8$ concentrate near the formant frequencies $F_1$ to $F_4$. Because the first formant frequency $F_1$ generally has a high resonance gain, the LSP coefficients $w_1$ and $w_2$ are concentrated more strongly, so the inter-coefficient distance $(w_2 - w_1)$ falls below the threshold $w_{TH1}$, and the inter-coefficient distance $(w_4 - w_3)$ near the second formant frequency $F_2$ falls below the threshold $w_{TH2}$.

For an unvoiced sound or ambient noise, however, the frequency spectrum is flat, as in FIG. 3, and the LSP coefficients $w_1$ to $w_8$ show little concentration. The LSP inter-coefficient distance $(w_n - w_{n-1})$ therefore does not fall below the thresholds $w_{TH1}$ and $w_{TH2}$.
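The flat-spectrum case can be worked out exactly for an idealized signal. As an illustration that is not part of the source, take a perfectly flat spectrum, that is, a prediction polynomial $A(z) = 1$. The symbols $A_{+}(z)$ and $A_{-}(z)$ below are my names for the usual symmetric and antisymmetric LSP polynomials.

```latex
\begin{aligned}
A_{+}(z) &= A(z) + z^{-(P+1)} A(z^{-1}) = 1 + z^{-(P+1)}, \\
A_{-}(z) &= A(z) - z^{-(P+1)} A(z^{-1}) = 1 - z^{-(P+1)}, \\
w_n &= \frac{n\pi}{P+1}, \qquad n = 1, \dots, P, \\
w_n - w_{n-1} &= \frac{\pi}{P+1}, \qquad n = 2, \dots, P .
\end{aligned}
```

The zeros of $A_{+}$ and $A_{-}$ on the unit circle interleave at multiples of $\pi/(P+1)$, so every inter-coefficient distance is the same. For $P = 8$ this gives $\pi/9 \approx 0.349$ rad, or about 444 Hz at a sampling frequency of 8 kHz. Real noise is only approximately flat, so the actual distances will scatter around this value; the sketch merely suggests why thresholds set well below the uniform spacing would not be met by such signals.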

[Effects of the Invention]

As described above, the present invention judges whether the input signal is a voice signal by using the inter-coefficient distance of the line spectrum pair coefficients instead of the magnitude of the input signal level. Even voice buried in ambient noise can therefore be detected, provided it is a voiced sound, which helps improve the recognition rate of a voice recognition device.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing an embodiment of the present invention, FIGS. 2 and 3 are graphs explaining that embodiment, and FIG. 4 is a block diagram showing a conventional example.

2: line spectrum pair conversion circuit
4: inter-coefficient distance judgment circuit
6: voiced sound judgment circuit

Claims (1)

(57) [Claims]

1. A voice judgment detector comprising: a conversion circuit that converts an input signal into line spectrum pair coefficients for each fixed extraction section; an inter-coefficient distance judgment circuit that judges whether the distance between adjacent coefficients among the converted line spectrum pair coefficients is larger or smaller than a threshold; and a voiced sound judgment circuit that detects voice by judging whether the judgment result of the inter-coefficient distance judgment circuit has continued for at least a fixed time.
JP62097779A 1987-04-20 1987-04-20 Voice judgment detector Expired - Lifetime JP2564821B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP62097779A JP2564821B2 (en) 1987-04-20 1987-04-20 Voice judgment detector

Publications (2)

Publication Number Publication Date
JPS63262693A JPS63262693A (en) 1988-10-28
JP2564821B2 true JP2564821B2 (en) 1996-12-18

Family

ID=14201317

Family Applications (1)

Application Number Title Priority Date Filing Date
JP62097779A Expired - Lifetime JP2564821B2 (en) 1987-04-20 1987-04-20 Voice judgment detector

Country Status (1)

Country Link
JP (1) JP2564821B2 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2870772B2 (en) * 1988-12-26 1999-03-17 カシオ計算機株式会社 Electronic still camera
JPH0619499A (en) * 1992-07-02 1994-01-28 Kokusai Electric Co Ltd Voiced/voiceless decision making circuit
WO2001033548A1 (en) * 1999-10-29 2001-05-10 Fujitsu Limited Rate control device for variable-rate voice encoding system and its method
JP4619549B2 (en) * 2000-01-11 2011-01-26 パナソニック株式会社 Multimode speech decoding apparatus and multimode speech decoding method
KR20050049103A (en) * 2003-11-21 2005-05-25 삼성전자주식회사 Method and apparatus for enhancing dialog using formant
JP5169297B2 (en) * 2008-02-22 2013-03-27 ヤマハ株式会社 Sound processing apparatus and program
