JPS60101598A - Voice section detector - Google Patents

Voice section detector

Info

Publication number
JPS60101598A
Authority
JP
Japan
Prior art keywords
detection
speech
signal
section
voice section
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP58208669A
Other languages
Japanese (ja)
Inventor
中谷 奉文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ricoh Co Ltd
Priority to JP58208669A
Publication of JPS60101598A


Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION
Technical field. The present invention relates to a speech section detection device and, more specifically, to a word-beginning detection device for stably extracting speech sections in a speech recognition device.

Prior art. In a speech recognition device, when the target speech has a good signal-to-noise ratio at the input, extracting the section in which speech is present is relatively easy.

However, in the environments in which a speech recognition device is actually used, various kinds of noise are present, and speech is input superimposed on that noise. Because the noise changes from moment to moment, a method that extracts speech sections using a fixed threshold has difficulty extracting them stably, which is one cause of misrecognition. As a solution, various methods of making the threshold variable have been proposed, but all of them have the drawback of being complicated and expensive.

Purpose. The present invention was made in view of the circumstances described above. In particular, its purpose is to provide a speech section detection device that can extract speech sections stably regardless of whether the ambient noise level is high or low, and that can thereby ensure a stable recognition rate.

Configuration. The configuration of the present invention is described below on the basis of embodiments.

FIG. 1 is a speech signal waveform diagram for explaining the operating principle of the present invention. FIG. 1(a) shows, for example, the change in the power of the input signal; point A is the rough beginning of the word detected with the predetermined threshold TH1 shown in the figure. The threshold TH1 is generally set to a value larger than the maximum value of the noise level. FIG. 1(b) shows the signal of FIG. 1(a) delayed by a time Δt, so the point corresponding to point A is A'. The threshold TH2 is a value set in the vicinity of the time at which point A was detected. Based on this value, a more accurate rising point B is determined, and from it the correct speech section. The device therefore only needs to begin capturing the signal and detecting the section from the time at which point A is detected.
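The two-stage principle of FIG. 1 can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the patented circuit: the text does not say how TH2 is derived, so the rule used here (a fraction `th2_ratio` of the distance between the lowest power in the look-back window and TH1) is hypothetical, as are the function and parameter names. The look-back window of `delay` frames plays the role of the delay Δt.

```python
def detect_onset(power, th1, delay, th2_ratio=0.5):
    """Return (a, b): coarse onset index A and refined onset index B.

    power     : list of per-frame power values
    th1       : fixed threshold set above the maximum noise level (TH1)
    delay     : number of frames kept before A (the role of the delay Δt)
    th2_ratio : hypothetical rule for placing TH2; not taken from the patent
    """
    # Stage 1: coarse onset A is the first frame whose power exceeds TH1.
    a = next((i for i, p in enumerate(power) if p > th1), None)
    if a is None:
        return None, None  # no speech found

    # Set TH2 near the time A was detected: take the lowest power in the
    # look-back window as a noise-floor estimate (an assumption).
    start = max(0, a - delay)
    floor = min(power[start:a + 1])
    th2 = floor + th2_ratio * (th1 - floor)

    # Stage 2: walk back from A through the delayed frames while the power
    # stays above TH2; the earliest such frame is the rising point B.
    b = a
    while b > start and power[b - 1] > th2:
        b -= 1
    return a, b


power = [1, 1, 1, 2, 4, 7, 12, 15, 14]
print(detect_onset(power, th1=10, delay=5))
```

Because the refined point B can only lie within the `delay` frames before A, the delay must be at least as long as the slowest expected rise from the noise floor to TH1.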

As described above, the present invention first detects the beginning of a word in the speech preliminarily using a fixed threshold, then detects the exact beginning of the word, and thereby detects the speech section. FIG. 2 shows an electrical block diagram of one embodiment.

In FIG. 2, 1 is the input signal terminal, 2 is a delay device, 3 is a feature extraction unit, 4 is a quantization device, 5 is a speech section detector, and 6 is a preliminary word-beginning detector. The input signal from input terminal 1 is applied to the delay device 2 and to the preliminary word-beginning detector 6. The output of the delay device 2 passes through the feature extraction unit 3, for example a bank of band-pass filters (BPFs), and is converted into a digital signal by the quantization device 4, for example an A/D converter. The speech section detector 5 then detects the speech section and outputs it as a section signal at output 8. In addition, the detection signal 7 from the preliminary word-beginning detector 6 is applied to the quantization device 4 and the speech section detector 5, which begin quantization and section detection on the basis of this signal.

FIG. 3 is a detailed diagram of the preliminary word-beginning detector 6 shown in FIG. 2. The input signal from input terminal 1 is applied to a word-beginning detection parameter extractor 9, which extracts a parameter such as power (energy) or the number of zero crossings. Its output signal is compared in comparator 10 with the threshold TH1 set by the threshold setter 11, and point A of FIG. 1(a) is thereby detected.

Here, the set threshold is a value that corresponds to the chosen parameter, such as power or the number of zero crossings. The speech section detector 5 is the detection unit that detects the exact beginning of the word. It may use any of the various known detection means, such as detecting the rise from frame-to-frame differences or detecting it from the power ratio of each band.
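The preliminary detector of FIG. 3 (parameter extractor 9, comparator 10, threshold setter 11) can be illustrated with the following Python sketch. The function names, the framing of the input into lists of samples, and the example thresholds are assumptions made for illustration; the patent specifies only that a parameter such as power or the number of zero crossings is compared with a threshold suited to that parameter.

```python
def frame_power(frame):
    """Mean squared amplitude of one frame (one possible 'power' parameter)."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossings(frame):
    """Number of sign changes between consecutive samples in one frame."""
    return sum(1 for x, y in zip(frame, frame[1:]) if (x >= 0) != (y >= 0))


def preliminary_detect(frames, extractor, threshold):
    """Return the index of the first frame whose parameter exceeds the
    threshold (point A), or None if no frame does.

    extractor : function mapping a frame to a number (role of extractor 9)
    threshold : value matched to that parameter (role of setter 11)
    """
    for i, frame in enumerate(frames):
        if extractor(frame) > threshold:  # role of comparator 10
            return i
    return None


frames = [
    [0.1, -0.1, 0.1, -0.1],  # low power, many zero crossings
    [0.0, 0.1, 0.0, 0.1],    # low power, no zero crossings
    [2.0, 2.0, -2.0, -2.0],  # high power, one zero crossing
]
print(preliminary_detect(frames, frame_power, 1.0))
print(preliminary_detect(frames, zero_crossings, 2))
```

Passing the extractor as an argument mirrors the remark above that the threshold must match the chosen parameter: a power threshold and a zero-crossing threshold are on different scales and are not interchangeable.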

FIG. 4 is a block diagram for explaining another embodiment of the present invention. In the figure, 12 is a shift register and 13 is a feature extractor; parts that act in the same way as in FIG. 2 bear the same reference numerals as in FIG. 2. In this embodiment, the delay is applied not to the analog signal but to the quantized digital signal, and the delay device is therefore implemented as a shift register. The input signal from input terminal 1 is applied to the quantization device 4 and to the preliminary word-beginning detector 6. The signal quantized by the quantization device 4 is delayed by Δt in the shift register 12, and feature parameters are extracted by the feature extractor 13, for example a digital filter. On the basis of these parameters, the speech section detector 5 detects the exact speech section and outputs it at output 8. Meanwhile, the detection signal 7 from the preliminary word-beginning detector 6 is applied to the quantization device 4, the feature extractor 13, and the speech section detector 5, and quantization, feature extraction, and section detection are performed on the basis of this signal.
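The digital delay of FIG. 4 can be modeled in software as a fixed-length shift register. The sketch below is an illustration only; the class name, the `fill` value used before real samples arrive, and the choice of `collections.deque` are assumptions, not details from the patent. The register length in samples corresponds to the delay Δt multiplied by the sampling rate.

```python
from collections import deque


class ShiftRegisterDelay:
    """Delay a stream of quantized samples by a fixed number of samples."""

    def __init__(self, length, fill=0):
        if length < 1:
            raise ValueError("length must be at least 1")
        # A deque with maxlen discards the oldest cell on every append.
        self._cells = deque([fill] * length, maxlen=length)

    def shift(self, sample):
        """Push one new sample in and return the sample leaving the register."""
        oldest = self._cells[0]
        self._cells.append(sample)
        return oldest


delay = ShiftRegisterDelay(3)
print([delay.shift(s) for s in [1, 2, 3, 4, 5]])
```

The first `length` outputs are the fill value, after which each input sample reappears exactly `length` shifts later.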

Effects. As is clear from the above description, according to the present invention the beginning of a word in the speech is first detected roughly, and the exact beginning of the word and the speech section are detected afterward. This enables section detection free of false detections. Furthermore, because operation of the main body of the device can be limited to the period after preliminary word-beginning detection, the device can be operated efficiently.

[Brief Description of the Drawings]

FIG. 1 is a waveform diagram for explaining the operating principle of the present invention. FIG. 2 is an electrical block diagram for explaining one embodiment of the present invention. FIG. 3 is a detailed electrical circuit diagram of the preliminary word-beginning detector 6 shown in FIG. 2. FIG. 4 is an electrical block diagram for explaining another embodiment of the present invention.

2: delay device, 3: feature extraction unit, 4: quantization device, 5: speech section detector, 6: preliminary word-beginning detector, 10: comparator, 11: threshold setter, 12: shift register, 13: feature extractor.

Claims (4)

[Claims]
(1) A speech section detection device comprising: detection means for preliminarily detecting the beginning of a word in a signal on the basis of a preset threshold; and speech section detection means for detecting the detailed speech section on the basis of the detection signal of the detection means.
(2) The speech section detection device according to claim (1), wherein a delay device is inserted in the stage preceding the speech section detection means to delay the signal by a fixed time.
(3) The speech section detection device according to claim (1), wherein the quantization means and the speech section detection means are operated by the preliminary word-beginning detection signal.
(4) The speech section detection device according to claim (1), wherein a shift register is inserted in the stage preceding the speech section detection means to delay the signal by a fixed time.
JP58208669A 1983-11-07 1983-11-07 Voice section detector Pending JPS60101598A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP58208669A JPS60101598A (en) 1983-11-07 1983-11-07 Voice section detector

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP58208669A JPS60101598A (en) 1983-11-07 1983-11-07 Voice section detector

Publications (1)

Publication Number Publication Date
JPS60101598A true JPS60101598A (en) 1985-06-05

Family

ID=16560089

Family Applications (1)

Application Number Title Priority Date Filing Date
JP58208669A Pending JPS60101598A (en) 1983-11-07 1983-11-07 Voice section detector

Country Status (1)

Country Link
JP (1) JPS60101598A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS63175896A (en) * 1987-01-17 1988-07-20 シャープ株式会社 Non-sound compression voice recorder
JPH01266599A (en) * 1988-04-18 1989-10-24 Sharp Corp No-sound compression speech recorder

Similar Documents

Publication Publication Date Title
JPS5862699A (en) Voice recognition equipment
JPS59105695A (en) Voice pause recognition
JPS6245730B2 (en)
JPS60101598A (en) Voice section detector
EP0770254B1 (en) Transmission system and method for encoding speech with improved pitch detection
JPS61233791A (en) Voice section detection system for voice recognition equipment
JP2989219B2 (en) Voice section detection method
JP2737109B2 (en) Voice section detection method
JP3484559B2 (en) Voice recognition device and voice recognition method
JP2532618B2 (en) Pitch extractor
JPS59105697A (en) Voice recognition equipment
JPS6250837B2 (en)
JPS592918B2 (en) pitch extraction device
JPS63127296A (en) Voice section detection system
JPS61259296A (en) Voice section detection system
JPS62238599A (en) Voice section detecting system
JPS61140999A (en) Voice section detection system
JPS62237498A (en) Voice section detecting method
JPS63306497A (en) Voice section detecting system
JPS63306498A (en) Voice section detecting system
JPS60216399A (en) Voice section detecting circuit for voice recognition equipment
JPS59228299A (en) Voice section detecting system
JPH04251299A (en) Speech section detecting means
JPS6058622B2 (en) Reception timing extraction method
JPH0394300A (en) Voice detector