JPS59231599A

JPS59231599A - Voice recognition

Info

Publication number: JPS59231599A
Application number: JP58106133A
Authority: JP
Inventors: 角川　允彦; 猛宮川; 新居　康彦
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1983-06-14
Filing date: 1983-06-14
Publication date: 1984-12-26

Abstract

(57) [Summary] This publication contains application data that predates electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION

Field of Industrial Application

The present invention relates to a speech recognition method that recognizes speech based on the results of spectrum analysis of the speech.

Configuration of the Conventional Example and Its Problems

In general, recognizing speech requires two stages: speech analysis, in which spectrum analysis or equivalent processing is applied at fixed intervals (10 to 20 msec) to the electrical speech waveform produced by a microphone or similar transducer, and discrimination processing, in which the analysis results are used to determine the category. When speech recognition is performed under noise, however, recognition performance deteriorates. This is because the noise contaminates the speech signal, which increases the matching distance from the standard speech.

To remove this noise contamination from the speech signal, conventional speech recognition methods smooth the analysis results by passing them through a low-pass filter or by taking a time average. Such processing removes the local spike-like noise that appears in the analysis results when noise is added to the speech signal.

However, the smoothing used in the conventional example not only removes the noise components but also deforms the waveform of the original analysis results. As a result, matching against the standard speech in the discrimination processing fails, and recognition performance under noise does not improve much.
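The deformation described above can be illustrated with a small sketch. It assumes a 3-point time average as a stand-in for the conventional smoothing; the function name `moving_average3`, the sample values, and the choice to leave the two end points unchanged are all hypothetical and are not taken from the patent.

```python
def moving_average3(x):
    """3-point time average. End points are copied unchanged (an assumption)."""
    y = [float(v) for v in x]
    for n in range(1, len(x) - 1):
        y[n] = (x[n - 1] + x[n] + x[n + 1]) / 3.0
    return y

# Hypothetical analysis output: a smooth rise to a short plateau, then a fall.
clean = [0, 2, 4, 6, 6, 4, 2, 0]
# The same sequence with one spike-like noise sample at index 2.
noisy = [0, 2, 14, 6, 6, 4, 2, 0]

smoothed_clean = moving_average3(clean)
smoothed_noisy = moving_average3(noisy)

print(smoothed_clean)  # the plateau at indices 3 and 4 is lowered
print(smoothed_noisy)  # the spike is spread over its neighbours
```

In this sketch the averaging lowers the plateau of the noise-free sequence and spreads the spike across the adjacent samples instead of discarding it, which is the kind of waveform deformation the paragraph above attributes to conventional smoothing.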

Object of the Invention

The present invention eliminates the problems of the conventional example described above. Its object is to provide a speech recognition method that removes superimposed noise components while preserving the original waveform of the analysis results, thereby preventing the deterioration of recognition performance caused by noise.

Structure of the Invention

To achieve the above object, the present invention smooths the speech analysis results by median averaging, thereby removing the noise components superimposed on the speech signal, and uses the smoothed output as the information for category discrimination. This prevents recognition performance from deteriorating under noise.

Description of the Embodiment

An embodiment of the present invention is described below with reference to the drawings.

In FIG. 1, a speech signal converted into an electrical signal by a microphone 9 is input to a filter bank 2 via an amplifier circuit 1. The filter bank 2 consists of a plurality of band-pass filters 20_1, 20_2, ..., 20_n, together with rectifier circuits 21_1, 21_2, ..., 21_n and integrating circuits 22_1, 22_2, ..., 22_n connected to the respective filters. The outputs 23_1, 23_2, ..., 23_n of the filter bank 2 are selected in turn by a multiplexer 3, quantized by an A/D converter 4, and smoothed by median averaging in a smoothing means 5. The resulting smoothed signal 6 is matched against the standard speech in a discrimination means 7, and a result 8 is output.
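The digital part of this chain (A/D converter 4, smoothing means 5, discrimination means 7) can be sketched as follows. This is a minimal illustration under several assumptions that the patent does not state: the filter bank outputs are given as one frame of values between 0 and 1, the converter has 8 bits, the median is taken over neighbouring values of that frame, and matching uses a squared Euclidean distance against two hypothetical templates named `low_peak` and `high_peak`.

```python
def quantize(values, full_scale=1.0, bits=8):
    """Stand-in for A/D converter 4: map [0, full_scale] to integer codes."""
    levels = (1 << bits) - 1
    return [min(levels, max(0, round(v / full_scale * levels))) for v in values]

def median3(x):
    """Stand-in for smoothing means 5: 3-point median, end points unchanged."""
    y = list(x)
    for n in range(1, len(x) - 1):
        y[n] = sorted((x[n - 1], x[n], x[n + 1]))[1]
    return y

def squared_distance(a, b):
    """Assumed matching distance; the patent does not specify one."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def classify(frame, templates):
    """Stand-in for discrimination means 7: pick the nearest template."""
    smoothed = median3(quantize(frame))
    return min(templates, key=lambda name: squared_distance(smoothed, templates[name]))

# Hypothetical standard patterns (already quantized).
templates = {
    "low_peak": [200, 150, 100, 50, 20, 10],
    "high_peak": [10, 20, 50, 100, 150, 200],
}
# Hypothetical frame of filter bank outputs with a noise spike in channel 3.
frame = [0.80, 0.60, 0.40, 0.95, 0.08, 0.04]

codes = quantize(frame)
raw_distance = squared_distance(codes, templates["low_peak"])
smoothed_distance = squared_distance(median3(codes), templates["low_peak"])
label = classify(frame, templates)
print(label, raw_distance, smoothed_distance)
```

With these made-up values the distance to the matching template is smaller after smoothing than before it, which is the effect the text attributes to placing the smoothing means ahead of the discrimination means.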

Now, suppose that a three-point median average is taken. If the signal sequence is $x(n)$ (n: sample position), the median-averaged signal sequence $x_M(n)$ is given as follows.

$$x_M(n) = \operatorname{median}\bigl(x(n-1),\; x(n),\; x(n+1)\bigr)$$

Here, $\operatorname{median}(x, y, z)$ denotes the median (middle value) of x, y, and z. In this scheme, the maximum and minimum of each three consecutive points in the signal sequence are discarded and the middle value becomes the value at that position. Consequently, even when a spike-like waveform is superimposed on the original speech waveform under the influence of noise, the extreme change is absorbed and the continuity of the spectral shape of the original speech is maintained. Conversely, for continuous changes, such as a speech waveform in the absence of noise, the original shape is preserved even after this smoothing is applied.
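The three-point median described above can be written out as a short sketch. The function name `median3` and the sample values are hypothetical, and leaving the first and last samples unchanged is an assumption, since the text does not say how the ends of the sequence are handled.

```python
def median3(x):
    """x_M(n) = median(x(n-1), x(n), x(n+1)); end points are left unchanged."""
    y = list(x)
    for n in range(1, len(x) - 1):
        # Sorting the three neighbours and taking the middle one discards
        # the maximum and the minimum, as the description states.
        y[n] = sorted((x[n - 1], x[n], x[n + 1]))[1]
    return y

# Hypothetical noise-free sequence with a continuous rise and fall.
clean = [0, 2, 4, 6, 6, 4, 2, 0]
# The same sequence with one spike-like noise sample at index 2.
noisy = [0, 2, 14, 6, 6, 4, 2, 0]

print(median3(clean))  # [0, 2, 4, 6, 6, 4, 2, 0]
print(median3(noisy))  # [0, 2, 6, 6, 6, 4, 2, 0]
```

In this example the noise-free sequence passes through unchanged and the spike value of 14 is discarded. The sample at the spike position takes a neighbouring value rather than its original one, so the spike is absorbed without the original sample being recovered exactly.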

FIG. 2 illustrates the concept of median averaging. For the three points a1, a2, a3, the maximum value a3 and the minimum value a1 are discarded and a2 is extracted; likewise, for the three points b1, b2, b3, b3 is extracted. FIGS. 3A and 3B show the states before and after smoothing by median averaging.

Effects of the Invention

With the configuration described above, even when noise is superimposed on the speech signal, the noise components can be removed without impairing the original spectral shape of the speech signal. The distance from the standard pattern therefore changes little, which has the advantage of preventing the deterioration of recognition performance caused by noise.

Although the above embodiment shows an example of smoothing by a three-point median average, the present invention is not limited to a three-point median average.
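As one way to picture a window other than three points, the sketch below generalizes the median to any odd window length. The function name `median_smooth`, the restriction to odd lengths, and the choice to leave samples near the ends unchanged are assumptions for illustration; the patent does not describe a specific longer-window variant.

```python
def median_smooth(x, window=3):
    """Median over an odd-length window centred on each sample.

    Samples closer than window // 2 to either end are left unchanged
    (an assumption; other edge treatments are possible).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    y = list(x)
    for n in range(half, len(x) - half):
        neighbourhood = sorted(x[n - half:n + half + 1])
        y[n] = neighbourhood[half]  # middle element of the sorted window
    return y

# Hypothetical rising sequence with a two-sample noise burst (20, 21).
x = [1, 2, 3, 20, 21, 6, 7, 8, 9]

print(median_smooth(x, window=3))  # [1, 2, 3, 20, 20, 7, 7, 8, 9]
print(median_smooth(x, window=5))  # [1, 2, 3, 6, 7, 8, 8, 8, 9]
```

In this example a three-point window cannot discard a burst that spans two samples, while a five-point window does, at the cost of flattening a little more of the surrounding sequence.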

[Brief Description of the Drawings]

FIG. 1 is a block diagram of the speech analysis section of a speech recognition method according to an embodiment of the present invention; FIG. 2 is a conceptual diagram of median averaging in the same embodiment; and FIGS. 3A and 3B show the states before and after smoothing in the embodiment, without noise and with noise superimposed, respectively.

1 ... amplifier circuit, 2 ... filter bank, 20_1, 20_2, ..., 20_n ... band-pass filters, 21_1, 21_2, ..., 21_n ... rectifier circuits, 22_1, 22_2, ..., 22_n ... integrating circuits, 23_1, 23_2, ..., 23_n ... outputs, 3 ... multiplexer, 4 ... A/D converter, 5 ... smoothing means, 6 ... smoothed signal, 7 ... discrimination means, 9 ... microphone.

Claims

A speech recognition method characterized in that speech recognition is performed using a signal obtained by spectrum-analyzing a speech signal, taking the median average of the analysis output signal, and smoothing the signal.