JPS60140300A

JPS60140300A - Nasals identifier for voice recognition equipment

Info

Publication number: JPS60140300A
Application number: JP24529283A
Authority: JP
Inventors: 晋太木村
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1983-12-28
Filing date: 1983-12-28
Publication date: 1985-07-25

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed Description of the Invention]

Technical Field

The present invention relates to a nasal identification device in a speech recognition device. The nasal identification device according to the present invention is used, for example, in the speech recognition device of an information processing device such as a word processor.

Prior Art

Conventionally, a speech recognition device samples the speech waveform at about 10 kHz and converts it from analog to digital. The data inside a window (called a frame) about 10 msec wide, starting from the beginning of the sampled data, is frequency-analyzed while the window is moved along, and features are extracted. When frequency analysis is performed within such a narrow window of about 10 msec, the time resolution Δt is high, but the frequency resolution is low, because of the mathematical principle that the product of the time resolution Δt and the frequency resolution Δf (Δt × Δf) cannot be made smaller than a certain value.
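The trade-off between Δt and Δf can be made concrete with a little arithmetic. The sketch below assumes the bound has the form Δt × Δf ≥ 1, which is only an illustrative choice (the text says no more than "a certain value"), and the 80 msec segment length is a hypothetical figure, not one taken from the patent.

```python
def frequency_resolution_hz(window_msec: float, constant: float = 1.0) -> float:
    """Smallest frequency spacing allowed by dt * df >= constant.

    constant = 1.0 is an assumption made for illustration; the source
    only states that the product cannot fall below "a certain value".
    """
    return constant * 1000.0 / window_msec


frame_df = frequency_resolution_hz(10.0)    # one 10 msec frame
segment_df = frequency_resolution_hz(80.0)  # hypothetical 80 msec nasal segment
print(frame_df, segment_df)  # prints "100.0 12.5"
```

Under these assumptions a single frame resolves frequencies only to about 100 Hz, while a window several frames long resolves them several times more finely.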

Consequently, when speech recognition is performed with such a narrow window, the target speech waveform can be identified as a nasal, but identifying which of the consonant parts "m", "n", or "ŋ" (the consonants of the na-row and ma-row syllables) that nasal is gives a poor identification rate, because the sufficiently high frequency resolution needed for that identification cannot be obtained.

Object of the Invention

The object of the present invention is to improve the identification rate for deciding whether a nasal to be recognized is "m", "n", or "ŋ". It is based on the idea of performing frequency analysis on the whole of the speech segment identified as a nasal (the nasal segment), obtaining a spectrum with high frequency resolution, and thereby making the differences in frequency structure among the nasals "m", "n", and "ŋ" clear.

Structure of the Invention

The present invention provides a nasal identification device in a speech recognition device, comprising: a speech data buffer that holds an input speech signal; frame-by-frame recognition means that recognizes the speech data in the speech data buffer frame by frame; nasal segment detection means that detects a nasal segment from the recognition result of the frame-by-frame recognition means; nasal segment window operation means that, using the nasal segment information detected by the nasal segment detection means, performs a window operation on the speech data in the speech data buffer to remove the influence of the vowel segments on both sides of the nasal segment; and nasal recognition means that performs frequency analysis, feature extraction, and matching on the speech data subjected to the window operation by the nasal segment window operation means.

Embodiment of the Invention

A nasal identification device as one embodiment of the present invention is shown in FIG. 1. In FIG. 1, microphone 1 is connected to the input terminal of speech input unit 2. Speech input unit 2 samples the speech signal S input from microphone 1 at a fixed period, converts it from analog to digital, and outputs the resulting digital speech data to speech data buffer 3.

Speech data buffer 3 is a memory that stores the digital speech data. Its output is led to the input terminal of frame-by-frame recognition unit 4 and to the multiplication circuit input terminal of nasal segment window operation unit 7.

Frame-by-frame recognition unit 4 is a frame-by-frame recognition device of a known type that recognizes the input speech waveform in units of frames (windows) of, for example, 10 msec. It includes a frequency analysis unit, a feature extraction unit, a dictionary unit, a matching unit, and so on.

The recognition result output by frame-by-frame recognition unit 4 is led to one input terminal of identification result integration unit 5 and to the input terminal of nasal segment detection unit 6.

Nasal segment detection unit 6 detects the nasal segment by finding the start and end points of the run of frames that frame-by-frame recognition unit 4 judged to be nasal. Its detection output is led to the window function calculation circuit input terminal of nasal segment window operation unit 7. Nasal segment window operation unit 7 is a circuit that multiplies the speech data read from speech data buffer 3 by a window function W whose window is the nasal segment detected by nasal segment detection unit 6, thereby extracting the speech data of the nasal segment from the speech data.

Frequency analysis unit 8 and feature extraction unit 9 are connected in sequence to the output side of nasal segment window operation unit 7. The output of feature extraction unit 9 is led through switch 10 to matching unit 11 and nasal dictionary 12. Switch 10 has a recognition-side terminal 10a and a registration-side terminal 10b; matching unit 11 is connected to recognition-side terminal 10a, and nasal dictionary 12 is connected to registration-side terminal 10b.

In nasal dictionary 12, the feature patterns corresponding to the nasals "m", "n", and "ŋ" are registered in advance by switching switch 10 to registration-side terminal 10b and feeding in the feature patterns from feature extraction unit 9. When switch 10 is switched to recognition-side terminal 10a, matching unit 11 compares the pattern to be recognized from feature extraction unit 9 with the registered patterns in nasal dictionary 12, calculates the distance between them, and outputs the category name of the dictionary entry with the smallest distance (that is, one of "m", "n", or "ŋ").

The identification result output by matching unit 11 is led to the other input terminal of identification result integration unit 5.

Identification result integration unit 5 replaces the nasal portion of the recognition result from frame-by-frame recognition unit 4 with the output from matching unit 11 and outputs the result as the identification result.

A specific example of nasal segment detection unit 6 and nasal segment window operation unit 7 in the device of FIG. 1 is shown in FIG. 2.

In FIG. 2, the recognition result from frame-by-frame recognition unit 4 and a clock CLK(1) synchronized with the frame period are led to nasal segment detection unit 6. In nasal segment detection unit 6, the recognition result is fed into shift register 61 in synchronization with clock CLK(1).

Shift register 61 consists of registers 61a and 61b cascaded in two stages. Register 61a holds the recognition result of the current frame (the frame currently of interest), and register 61b holds the recognition result of the previous frame.

The outputs of registers 61a and 61b are led to one input terminal of comparison circuits 62 and 63, respectively. The other input terminal of each of comparison circuits 62 and 63 receives the character code corresponding to the nasal "N" of the recognition result.

The output of comparison circuit 62 is led to one input terminal of AND circuit 67 through NOT circuit 65, and directly to one input terminal of AND circuit 66. The output of comparison circuit 63 is led to the other input terminal of AND circuit 66 through NOT circuit 64, and directly to the other input terminal of AND circuit 67.

The logic circuit made up of NOT circuits 64 and 65 and AND circuits 66 and 67 performs the logical operations that determine the start and end points of the nasal segment.

The outputs of AND circuits 66 and 67 are then led to the set (S) and reset (R) input terminals, respectively, of flip-flop 71 in nasal segment window operation unit 7, and the output Q of flip-flop 71 is led to the data input terminal of window function memory 72. In window function memory 72, counter 73 is connected to the write address input terminal, counter 74 to the read address input terminal, and the AND circuit group 76a, 76b, ..., 76n to the data output terminal.

Counter 73 receives clock CLK(1) at its input terminal and supplies the count of clock CLK(1) to window function memory 72 as the write address.

Counter 74 receives clock CLK(2) at its input terminal through divide-by-M circuit 75. Clock CLK(2) is synchronized with the sampling period of the speech input and has M clock pulses per frame. Counter 74 therefore counts the signal CLK(3), which divide-by-M circuit 75 obtains by dividing CLK(2) by M, the number of data points per frame, and the count value is supplied to window function memory 72 as the read address. Flip-flop 71, window function memory 72, counters 73 and 74, and divide-by-M circuit 75 together constitute the window function calculation circuit.

The AND circuit group 76a, 76b, ..., 76n constitutes the multiplication circuit. Each bit of the digital speech data from speech data buffer 3 is led to the corresponding input terminal, and the outputs are led to the input terminal of frequency analysis unit 8.

The operation of the device of FIGS. 1 and 2 is described below with reference to FIG. 3. FIG. 3 shows the signal waveforms at various points in the device of FIGS. 1 and 2. In the figure, (1) is the output speech waveform S of microphone 1, (2) is the frame-by-frame recognition result for the speech waveform S of (1), (3) is the waveform of the window function W in nasal segment window operation unit 7, (4) is the waveform of clock CLK(1), (5) is the waveform of clock CLK(2), and (6) is the waveform of signal CLK(3), obtained by dividing clock CLK(2) by M.

The speech signal S input from microphone 1 (FIG. 3(1)) is sampled by speech input unit 2 at the sampling timing of clock CLK(2), converted to digital data, and stored in speech data buffer 3. Frame-by-frame recognition unit 4 refers to speech data buffer 3 and obtains a recognition result for each frame as determined by clock CLK(1). FIG. 3(2) shows the speech signal "ANO" being recognized frame by frame.

Nasal segment detection unit 6 determines the nasal segment by detecting the start and end points of the run of frames labeled as the nasal "N" in the recognition result shown in FIG. 3(2). Nasal segment window operation unit 7 applies a window like that shown in FIG. 3(3) to the nasal segment just determined in the speech data held in speech data buffer 3. The operation of nasal segment detection unit 6 and nasal segment window operation unit 7 is described in detail below with reference to FIG. 2.

In FIG. 2, the recognition results of frame-by-frame recognition unit 4 are shifted into shift register 61 one after another in synchronization with clock CLK(1). Register 61a holds the recognition result of the current frame of interest, and register 61b holds the recognition result of the previous frame. The recognition results of the current and previous frames held in registers 61a and 61b are then sent to comparison circuits 62 and 63, where each is compared with the character code of the nasal "N", and the start and end points of the nasal are determined by the following logic.

(i) The start point of the nasal is determined by the current frame being the nasal "N" and the previous frame not being the nasal "N".

(ii) The end point of the nasal is determined by the current frame not being the nasal "N" and the previous frame being the nasal "N".

The logical operations (i) and (ii) above are implemented by comparison circuits 62 and 63, NOT circuits 64 and 65, and AND circuits 66 and 67. Depending on the result, AND circuit 66 outputs the nasal start point detection signal S(s), and AND circuit 67 outputs the nasal end point detection signal S(e), to nasal segment window operation unit 7.
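The start and end point logic can be sketched in software as follows. This is a minimal illustration rather than the patent's circuit: the function name, the list-of-labels input, and the handling of a nasal that runs to the end of the input are assumptions added here.

```python
def detect_nasal_segments(labels, nasal="N"):
    """Return (start, end) frame index pairs for runs of the nasal label.

    `end` is exclusive. `previous` plays the role of register 61b and
    `current` the role of register 61a.
    """
    segments = []
    previous = None
    start = None
    for i, current in enumerate(labels):
        is_start = current == nasal and previous != nasal  # rule (i)
        is_end = current != nasal and previous == nasal    # rule (ii)
        if is_start:
            start = i
        if is_end:
            segments.append((start, i))
            start = None
        previous = current
    if start is not None:
        # Assumption: a nasal still open at the end of input is closed there.
        segments.append((start, len(labels)))
    return segments


print(detect_nasal_segments(list("AAANNNNOOO")))  # prints "[(3, 7)]"
```

Only the current and previous labels are consulted at each step, which is why a two-stage shift register is enough in hardware.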

In nasal segment window operation unit 7, flip-flop 71 is set by the start point detection signal S(s) and reset by the end point detection signal S(e), which generates the window function W shown in FIG. 3(3). The waveform of this window function W is stored in window function memory 72. The write address of window function memory 72 is obtained by counting clock CLK(1) with counter 73.

Meanwhile, the speech waveform data D from speech data buffer 3 is fed to the AND circuit group 76a, 76b, ..., 76n in synchronization with clock CLK(2), multiplied there by the window function read from window function memory 72, and output to frequency analysis unit 8. The read address of the window function memory at this time is the count, in counter 74, of signal CLK(3), which divide-by-M circuit 75 obtains by dividing clock CLK(2) by M, the number of data points per frame. As a result, the waveform of FIG. 3(3) is applied to the part of the speech data that corresponds to the nasal segment, and the influence of the vowel segments on both sides of the nasal segment is removed.
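A software sketch of this windowing step may help. The window is stored once per frame but read once per sample, so the read index advances only every M samples. The function and variable names are hypothetical, and the samples are assumed to be Python integers so that a bitwise AND with an all-ones or all-zeros mask mimics the AND circuit group.

```python
def apply_nasal_window(samples, frame_flags, samples_per_frame):
    """Gate each sample by the 0/1 window value of the frame it falls in.

    frame_flags holds one window value per frame (window function memory 72).
    samples_per_frame is M, the number of samples per frame.
    """
    windowed = []
    for k, sample in enumerate(samples):
        # Divide-by-M circuit 75 plus counter 74: one address step per M samples.
        frame_index = k // samples_per_frame
        inside = frame_index < len(frame_flags) and frame_flags[frame_index]
        # AND circuit group: every bit of the sample is ANDed with the window bit.
        mask = -1 if inside else 0  # -1 is all ones in two's complement
        windowed.append(sample & mask)
    return windowed


print(apply_nasal_window([5, -3, 7, 2, 9, 1], [0, 1, 0], 2))
# prints "[0, 0, 7, 2, 0, 0]"
```

Because the window takes only the values 0 and 1, multiplication reduces to gating, which is presumably why plain AND gates can serve as the multiplication circuit.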

Frequency analysis unit 8 in FIG. 1 performs frequency analysis on the data to which the nasal segment window has been applied, and feature extraction unit 9 performs feature extraction, such as band power calculation and moment calculation, on the spectrum obtained by frequency analysis unit 8. Here the frequency analysis is performed not frame by frame but with the entire segment judged to be nasal as the window, so the frequency resolution obtained is sufficient to distinguish the nasals "n", "m", and "ŋ" with a high identification rate.
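As a rough illustration of band power and moment features computed over a whole segment, the sketch below uses a naive discrete Fourier transform. The patent does not say which frequency analysis method or which bands are used, so the DFT, the equal-width band split, the first-moment definition, and the test tone are all assumptions made for this example.

```python
import cmath
import math


def power_spectrum(samples):
    """Power at each DFT bin from 0 up to the Nyquist bin (naive O(n^2) DFT)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def band_powers(spectrum, n_bands):
    """Sum the power in n_bands contiguous, roughly equal-width groups of bins."""
    edges = [i * len(spectrum) // n_bands for i in range(n_bands + 1)]
    return [sum(spectrum[lo:hi]) for lo, hi in zip(edges, edges[1:])]


def first_moment(spectrum):
    """Power-weighted mean bin index (spectral centroid), 0.0 for silence."""
    total = sum(spectrum)
    if total == 0:
        return 0.0
    return sum(k * p for k, p in enumerate(spectrum)) / total


# Hypothetical 32-sample segment holding a pure tone at bin 4.
segment = [math.cos(2 * math.pi * 4 * t / 32) for t in range(32)]
spectrum = power_spectrum(segment)
features = band_powers(spectrum, 4) + [first_moment(spectrum)]
```

The longer the segment passed to `power_spectrum`, the more bins fall within a given frequency range, which is the gain in resolution the text describes.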

The feature parameters from feature extraction unit 9 are sent to matching unit 11 through switch 10, which is switched to recognition-side terminal 10a. Matching unit 11 calculates the distance between the "m", "n", and "ŋ" patterns registered in advance in nasal dictionary 12 and the feature parameters to be recognized from feature extraction unit 9, and outputs the category name of the dictionary entry with the smallest distance, that is, one of "m", "n", or "ŋ". Identification result integration unit 5 replaces the nasal portion of the recognition result from frame-by-frame recognition unit 4 with the identification result from matching unit 11, integrates the identification results, and outputs them to the subsequent processing stage.
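The matching and integration steps can be sketched as a nearest-template lookup followed by a label substitution. The Euclidean distance, the two-element feature vectors, and the dictionary values below are hypothetical; the text specifies only that a distance is calculated and the closest category is chosen.

```python
import math


def match_nasal(features, dictionary):
    """Return the category whose registered pattern is closest to `features`."""
    return min(dictionary, key=lambda name: math.dist(features, dictionary[name]))


def integrate_results(frame_labels, segment, category, nasal="N"):
    """Replace nasal labels inside `segment` (start, exclusive end) with `category`."""
    start, end = segment
    return [category if start <= i < end and label == nasal else label
            for i, label in enumerate(frame_labels)]


# Hypothetical registered patterns, standing in for nasal dictionary 12.
nasal_dictionary = {"m": [1.0, 0.0], "n": [0.0, 1.0], "ŋ": [1.0, 1.0]}

category = match_nasal([0.9, 0.1], nasal_dictionary)
print(category)  # prints "m"
print(integrate_results(list("AANNO"), (2, 4), category))
# prints "['A', 'A', 'm', 'm', 'O']"
```

Registration corresponds to filling `nasal_dictionary` from feature vectors of known nasals, and recognition corresponds to calling `match_nasal` on a new vector.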

Effects of the Invention

According to the present invention, the differences in frequency structure among the nasals "m", "n", and "ŋ" are clearly distinguished, and the identification rate for deciding whether a nasal to be recognized is "m", "n", or "ŋ" is improved, which improves the performance of the speech recognition device.

[Brief explanation of drawings]

FIG. 1 is a block diagram showing a nasal identification device in a speech recognition device as one embodiment of the present invention. FIG. 2 is a block diagram showing the configuration of the nasal segment detection unit and the nasal segment window operation unit in the device of FIG. 1. FIG. 3 is a waveform diagram showing the signal waveforms at various points in the devices of FIGS. 1 and 2.

2: speech input unit; 3: speech data buffer; 4: frame-by-frame recognition unit; 5: identification result integration unit; 6: nasal segment detection unit; 7: nasal segment window operation unit; 8: frequency analysis unit; 9: feature extraction unit; 11: matching unit; 12: nasal dictionary; 61: shift register; 62, 63: comparison circuits; 71: flip-flop; 72: window function memory; 73, 74: counters; 75: divide-by-M circuit.

Patent applicant: Fujitsu Ltd.
Patent attorneys for the applicant: 青木 朗, 西 甜 オロ 之, 内 １）幸 男, 山口 昭之

Claims

[Claims]

1. A nasal identification device in a speech recognition device, characterized by comprising: a speech data buffer that holds an input speech signal; frame-by-frame recognition means that recognizes the speech data in the speech data buffer frame by frame; nasal segment detection means that detects a nasal segment from the recognition result of the frame-by-frame recognition means; nasal segment window operation means that, using the nasal segment information detected by the nasal segment detection means, performs a window operation on the speech data in the speech data buffer to remove the influence of the vowel segments on both sides of the nasal segment; and nasal recognition means that performs frequency analysis, feature extraction, and matching on the speech data subjected to the window operation by the nasal segment window operation means.