JPH071437B2 - Voice recognizer - Google Patents

Voice recognizer

Info

Publication number
JPH071437B2
Authority
JP
Japan
Prior art keywords
standard pattern
matching
pattern
voice
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
JP63095697A
Other languages
Japanese (ja)
Other versions
JPH01267699A (en)
Inventor
洋一 元田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Priority to JP63095697A
Publication of JPH01267699A
Publication of JPH071437B2
Anticipated expiration
Expired - Lifetime

Description

DETAILED DESCRIPTION OF THE INVENTION

[Industrial field of application]

The present invention relates to a speech recognition apparatus that performs DP (dynamic programming) matching between uttered speech and standard patterns and performs recognition by finding the standard pattern that gives the minimum dissimilarity, and in particular to a speech recognition apparatus that is not easily affected by environmental noise.

[Conventional technology]

In a conventional speech recognition apparatus, uttered speech is input through a microphone, and the amplitude (including power), spectrum, and so on of the speech signal converted into an electrical signal are examined to detect speech. The speech in the detected section is then recognized.

Usually, the points at which the amplitude level rises above and falls below a certain threshold are taken as the start and end points, or the points near them at which the spectrum changes abruptly are taken as the start and end points, and recognition processing is performed on that speech section. Pause intervals (silence) are observed within words such as the digit "1" (/icni/) and Sapporo (/sapporo/), and between words uttered in succession. For a pause interval within a word, there are methods that include it in the speech section and methods that do not.
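The threshold-based detection described above can be sketched in Python as follows. This is only an illustration under assumptions not found in the patent: the signal is given as one amplitude value per frame, and the names detect_speech_sections and pause_intervals are hypothetical.

```python
def detect_speech_sections(amplitudes, threshold):
    """Return (start, end) frame index pairs where amplitude exceeds threshold."""
    sections = []
    start = None
    for t, a in enumerate(amplitudes):
        if a > threshold and start is None:
            start = t                        # amplitude rose above the threshold
        elif a <= threshold and start is not None:
            sections.append((start, t - 1))  # amplitude fell back below it
            start = None
    if start is not None:                    # signal ended while still above threshold
        sections.append((start, len(amplitudes) - 1))
    return sections


def pause_intervals(sections):
    """Return the gaps between consecutive speech sections as (start, end) pairs."""
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(sections, sections[1:])]
```

A noise burst that exceeds the threshold inside a gap would be reported as an extra section, which is the failure mode the following paragraphs discuss.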

Meanwhile, speech data is entered not only in quiet offices but often in places where various machines generate noise, such as inside factories or outdoors. Speech recognition apparatuses generally use a noise-canceling close-talking microphone to improve noise immunity, but even this is not sufficient. If the noise level exceeds the speech detection threshold, or if the spectrum of the noise itself changes around the true start or end of the utterance, speech detection fails. Moreover, if noise is superimposed on a pause interval within a word or between words and the speech section is detected incorrectly, the pattern becomes apparently longer, matching against the standard pattern becomes difficult, and the recognition result for the whole utterance is wrong. When the noise is not very loud, raising the threshold and excluding pause intervals within words from the speech section makes the apparatus less susceptible to noise to some extent. However, when the amplitude or spectrum of the noise changes greatly in a short time, that is, when the noise is non-stationary, the threshold must be set above the peak value of the noise. This in turn makes it difficult to detect low-amplitude portions and consonant portions near the start, the end, and the pause points of the utterance, so recognition performance drops markedly and this approach is not practical.

To reduce the effect of such speech detection errors, there is a so-called start/end-free (endpoint-free) recognition method, in which the start and end of the utterance are not fixed but are each given a range. Endpoint-free recognition is realized by comparing and matching the speech patterns of the sections formed by every possible combination of start candidate points and end candidate points, and taking the most likely recognition result as the final result. One example is described in detail in the specification of Japanese Patent Application No. 61-31179. Endpoint-free recognition can reduce errors in detecting the start and end of the speech section, but it has no effect on the problem of noise entering a pause interval of the utterance and being included in the speech section, so correct recognition results are often not obtained.
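As a rough Python sketch of endpoint-free recognition, the code below tries every combination of start and end candidates and keeps the section with the smallest dissimilarity. Everything here is an assumption made for illustration: frames are scalar values compared by absolute difference, the DP path advances one standard-pattern frame per step while the input advances by 0, 1, or 2 frames, and the names dp_dissimilarity and endpoint_free_match are hypothetical. It is not the method of the cited application.

```python
import math


def dp_dissimilarity(inp, ref):
    """DP matching of input frames against a standard pattern, normalized by len(ref)."""
    n_in = len(inp)
    prev = [math.inf] * n_in
    prev[0] = abs(inp[0] - ref[0])           # path is anchored at the first input frame
    for j in range(1, len(ref)):
        cur = [math.inf] * n_in
        for i in range(n_in):
            best = min(prev[max(0, i - 2):i + 1])   # input advances by 0, 1, or 2 frames
            if best < math.inf:
                cur[i] = best + abs(inp[i] - ref[j])
        prev = cur
    return prev[n_in - 1] / len(ref)         # path must end at the last input frame


def endpoint_free_match(signal, ref, start_candidates, end_candidates):
    """Try all (start, end) candidate pairs; return (best score, start, end)."""
    best = (math.inf, None, None)
    for s in start_candidates:
        for e in end_candidates:
            if e < s:
                continue
            score = dp_dissimilarity(signal[s:e + 1], ref)
            if score < best[0]:
                best = (score, s, e)
    return best
```

Note that this search only moves the outer boundaries; noise lying inside the chosen section still contributes to the score, which is the limitation stated above.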

[Problems to be Solved by the Invention]

The conventional speech recognition method described above detects speech from the amplitude level, spectral change, and the like, and performs recognition with free start and end points. It still leaves unsolved the insertion errors that occur when extraneous sound enters a pause interval during the utterance.

[Means for Solving the Problems]

The speech recognition apparatus according to the present invention performs DP matching between uttered speech and standard patterns and performs recognition by finding the standard pattern that gives the minimum dissimilarity. It has a DP matching unit that calculates a dissimilarity proportional to the time length of the standard pattern. It further comprises means that, when the DP matching path passes a pause position on the standard pattern side, finds a temporary pause point giving the minimum dissimilarity in the vicinity of the corresponding position on the input pattern side and, using this dissimilarity as a boundary condition, continues DP matching in an endpoint-free manner between the portion of the standard pattern after the pause position and the vicinity of a point ahead of the temporary pause point in the input pattern.

[Action]

In the present invention, matching is performed endpoint-free also around the input pattern position corresponding to the pause position on the standard pattern side, so accurate detection of pause intervals in the input pattern is not required.

[Example]

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

FIG. 1 is a block diagram showing an embodiment of the present invention.

In the figure, 1 is an input unit that receives a speech signal wave s; 2 is a standard pattern memory that receives the output of the input unit 1; 3 is an input pattern memory that receives the output of the input unit 1; and 4 is a DP matching unit that receives the input pattern i from the input pattern memory 3 and the standard pattern j from the standard pattern memory 2 and calculates a dissimilarity proportional to the time length of the standard pattern. Together these constitute the means that, when the DP matching path passes a pause position on the standard pattern side, finds a temporary pause point giving the minimum dissimilarity in the vicinity of the corresponding position on the input pattern side and, using this dissimilarity as a boundary condition, continues DP matching in an endpoint-free manner between the portion of the standard pattern after the pause position and the vicinity of a point ahead of the temporary pause point in the input pattern.

Next, the operation of the embodiment shown in FIG. 1 will be described.

First, the input unit 1 detects, as a speech section, the section in which the amplitude level of the input speech signal wave s is higher than a predetermined threshold, and converts it into a time-series pattern of feature parameters. If there is a pause interval within the word, its position is also detected. At registration, the time-series pattern and the pause positions are stored in the standard pattern memory 2. At recognition, the time-series pattern is temporarily stored in the input pattern memory 3.

Next, the DP matching unit 4 calculates a dissimilarity proportional to the standard pattern time length on the basis of the input pattern i output from the input pattern memory 3 and the standard pattern j output from the standard pattern memory 2. The standard pattern memory 2 also supplies the pause position information q of the standard pattern to the DP matching unit 4.

FIG. 2 illustrates the DP matching process referred to in explaining the operation of FIG. 1, with the input pattern i on the horizontal axis and the standard pattern j on the vertical axis.

When the DP matching path D passes the pause position Q on the standard pattern side, the point P1 (temporary pause point) giving the minimum dissimilarity within the corresponding neighborhood K1 on the input pattern side is found. Using the dissimilarity at P1 as a boundary condition, DP matching is continued endpoint-free between the partial pattern after the pause position Q on the standard pattern side and the neighborhood K2 of the point ahead of the temporary pause point in the input pattern. FIG. 2 shows that, as a result of the endpoint-free matching, DP matching continued from the point P2. The interval from P1 to P2 is thereby treated as the pause interval on the input pattern side.
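The pause handling of FIG. 2 can be sketched in Python as follows. This is a minimal illustration under assumptions the patent does not state: frames are scalar features compared by absolute difference; the path advances exactly one standard-pattern frame per step, so the accumulated dissimilarity has one term per standard frame; K1 is taken to be the whole row reached just before the pause; and K2 is a window of max_skip input frames after P1. The function name dp_match_with_pause and its parameters are hypothetical.

```python
import math


def dp_match_with_pause(inp, ref, pause_positions, max_skip=5):
    """DP matching that may skip input frames at the standard pattern's pauses.

    pause_positions holds the indices j of the first standard-pattern frame
    after each pause (assumed representation). Returns the accumulated
    dissimilarity divided by the standard pattern length.
    """
    n_in = len(inp)
    prev = [math.inf] * n_in
    prev[0] = abs(inp[0] - ref[0])               # fixed start point, for simplicity
    for j in range(1, len(ref)):
        cur = [math.inf] * n_in
        if j in pause_positions:
            # Temporary pause point P1: minimum dissimilarity on the previous row (K1).
            p1 = min(range(n_in), key=lambda i: prev[i])
            if prev[p1] < math.inf:
                # Restart endpoint-free in K2, using the value at P1 as boundary condition.
                for i in range(p1 + 1, min(n_in, p1 + 1 + max_skip)):
                    cur[i] = prev[p1] + abs(inp[i] - ref[j])
        else:
            for i in range(n_in):
                best = min(prev[max(0, i - 2):i + 1])   # input advances 0, 1, or 2 frames
                if best < math.inf:
                    cur[i] = best + abs(inp[i] - ref[j])
        prev = cur
    return prev[n_in - 1] / len(ref)
```

For example, with ref = [1, 2, 3, 7, 8, 9], a pause before index 3, and inp = [1, 2, 3, 50, 50, 7, 8, 9], the two frames of value 50 fall between P1 and P2 and do not contribute to the score, whereas the same call with no pause positions has to absorb at least one of them.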

Even if noise enters the pause interval and speech detection is mistaken, DP matching is performed with the noise component of the input pattern skipped. Although the pause positions and their number differ from one standard pattern to another, repeating the above calculation yields the dissimilarity between each standard pattern and the input pattern from start to end. Finally, the standard pattern with the smallest dissimilarity normalized by the standard pattern length J is output as the result R.
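The final selection step can be illustrated with a short Python sketch. It assumes, hypothetically, that the accumulated dissimilarity and the length J of each standard pattern are already available as dictionaries keyed by pattern name; the function name select_result is not from the patent.

```python
def select_result(accumulated, lengths):
    """Pick the standard pattern with the smallest length-normalized dissimilarity.

    accumulated: {name: accumulated dissimilarity from DP matching}
    lengths:     {name: standard pattern length J}
    """
    normalized = {name: accumulated[name] / lengths[name] for name in accumulated}
    best = min(normalized, key=normalized.get)
    return best, normalized
```

Because the accumulated value grows with the standard pattern length, dividing by J is what makes scores of patterns with different lengths comparable.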

[Effect of the Invention]

As described above, in the present invention matching is performed endpoint-free also around the input pattern position corresponding to the pause position on the standard pattern side. Accurate detection of pause intervals in the input pattern is therefore not required, and the problem of noise superimposed on a pause interval lengthening the input pattern and causing misrecognition is solved, so normal recognition performance can be maintained even in the presence of non-stationary noise. Furthermore, the invention applies not only to pause intervals within words but also to pause intervals between words, so it is also effective in continuous word recognition.

[Brief description of drawings]

FIG. 1 is a block diagram showing an embodiment of the present invention, and FIG. 2 is an explanatory diagram of the DP matching process referred to in explaining the operation of FIG. 1.

1: input unit, 2: standard pattern memory, 3: input pattern memory, 4: DP matching unit.

Claims (1)

[Claims]

1. A speech recognition apparatus that performs DP matching between uttered speech and standard patterns and performs recognition by finding the standard pattern that gives the minimum dissimilarity, characterized by having a DP matching unit that calculates a dissimilarity proportional to the time length of the standard pattern, and by comprising means that, when the DP matching path passes a pause position on the standard pattern side, finds a temporary pause point giving the minimum dissimilarity in the vicinity of the corresponding position on the input pattern side and, using this dissimilarity as a boundary condition, continues DP matching in an endpoint-free manner between the portion of the standard pattern after said pause position and the vicinity of a point ahead of said temporary pause point in the input pattern.
JP63095697A 1988-04-20 1988-04-20 Voice recognizer Expired - Lifetime JPH071437B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP63095697A JPH071437B2 (en) 1988-04-20 1988-04-20 Voice recognizer

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP63095697A JPH071437B2 (en) 1988-04-20 1988-04-20 Voice recognizer

Publications (2)

Publication Number Publication Date
JPH01267699A JPH01267699A (en) 1989-10-25
JPH071437B2 true JPH071437B2 (en) 1995-01-11

Family

ID=14144692

Family Applications (1)

Application Number Title Priority Date Filing Date
JP63095697A Expired - Lifetime JPH071437B2 (en) 1988-04-20 1988-04-20 Voice recognizer

Country Status (1)

Country Link
JP (1) JPH071437B2 (en)

Also Published As

Publication number Publication date
JPH01267699A (en) 1989-10-25

Similar Documents

Publication Publication Date Title
US4829578A (en) Speech detection and recognition apparatus for use with background noise of varying levels
EP0077194B1 (en) Speech recognition system
JP2768274B2 (en) Voice recognition device
JP3069531B2 (en) Voice recognition method
JPH071437B2 (en) Voice recognizer
JPH0449952B2 (en)
JP2666296B2 (en) Voice recognition device
JPH0430040B2 (en)
JP3008593B2 (en) Voice recognition device
JPS5999497A (en) Voice recognition equipment
JP3107905B2 (en) Voice recognition device
JP2882792B2 (en) Standard pattern creation method
JPH05210397A (en) Voice recognizing device
JP2844592B2 (en) Discrete word speech recognition device
JP2901976B2 (en) Pattern matching preliminary selection method
JPH1097269A (en) Device and method for speech detection
JPH0754434B2 (en) Voice recognizer
JPH0651792A (en) Speech recognizing device
JP2000155600A (en) Speech recognition system and input voice level alarming method
JPH0651793A (en) Speech recognizing device
KR19980017116A (en) Driver's voice signal section detection device and method
JPS59170894A (en) Voice section starting system
JPH0343639B2 (en)
JPS61260299A (en) Voice recognition equipment
JP3065691B2 (en) Voice recognition device