JPH036519B2 - Google Patents

Publication number
JPH036519B2
Authority
JP
Japan
Prior art keywords
detector
airflow
circuit
nasal
oral
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
JP57032424A
Other languages
Japanese (ja)
Other versions
JPS58150995A (en)
Inventor
Toyozo Sugimoto
Takeo Murata
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Institute of Advanced Industrial Science and Technology AIST
Original Assignee
Agency of Industrial Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1982-03-03
Filing date
1982-03-03
Publication date
1991-01-30
Application filed by Agency of Industrial Science and Technology
Priority to JP3242482A
Publication of JPS58150995A
Publication of JPH036519B2
Granted

Description

[Detailed description of the invention]

The present invention relates to a pronunciation feature extraction device that recognizes pronunciation from information other than the speech sound itself.

Speech originates when the expiratory airflow sent out from the lungs passes the vocal folds in the larynx and sets them vibrating, which converts the airflow into voice. The voice is then modulated as the airway leading to the lips and the nasal cavity changes shape. Speech is produced as the combined result of these vocal organs working together.

Conventionally, speech features have been extracted by converting the sound wave into an electrical signal with an acoustic microphone, feeding that signal to a large number of filter circuits, each with a predetermined frequency band, and characterizing the pronunciation from the outputs of the filter circuits.

However, speech is the combined result of the whole vocal apparatus, and it is extremely difficult to perform speech recognition by extracting the pronunciation features of every phoneme from the sound wave alone. Non-stationary consonants in particular carry strong noise energy. Apart from voiceless fricatives such as |s, ʃ|, whose features can be extracted from the sound wave almost reliably, the voiceless fricative |h|, the voiceless plosives |p, t, k|, the voiced plosives |b, d, g|, and the nasals |m, n, ŋ| are very difficult to detect and to separate from one another.

In view of these drawbacks, the present invention provides a pronunciation feature extraction device in which detectors that sense the movement of each part of the vocal apparatus are attached to or placed near those parts, and the detector outputs are processed by a processing device, so that pronunciation features can be extracted more accurately than before.

An embodiment of the present invention is described below with reference to the drawings.

FIG. 1 shows the block configuration of a pronunciation feature extraction device according to one embodiment of the present invention. In the figure, 1 is a vocal fold vibration detector attached near the vocal folds at the larynx to detect their vibration; 2 is a nasal airflow detector placed in front of the nasal cavity to detect nasal airflow; 3 is an oral airflow detector placed in front of the oral cavity to detect oral airflow; and 4 is a palate contact detector fitted to the palate inside the oral cavity to detect contact between the tongue and the palate.

Reference numeral 5 denotes a processing device that extracts pronunciation features from the outputs of the vocal fold vibration detector 1, the nasal airflow detector 2, the oral airflow detector 3, and the palate contact detector 4. Its configuration is described in detail below with reference to FIG. 2.

In FIG. 2, 6 is a threshold circuit that decides the presence or absence of vocal fold vibration by comparing the vocal fold vibration information from the vocal fold vibration detector 1 with a specific value; 7 is a threshold circuit that decides the presence or absence of nasal airflow by comparing the nasal airflow information from the nasal airflow detector 2 with a specific value; 8 is a differentiating circuit that differentiates the oral airflow information from the oral airflow detector 3 to obtain the rate of change (acceleration) of the oral airflow; 9 is a threshold circuit that decides the presence or absence of that rate of change against a specific value; 10 is a threshold circuit that decides the presence or absence of oral airflow by comparing the oral airflow information from the oral airflow detector 3 with a specific value; 11 is a tongue closure detection circuit that, once the palate contact information from the palate contact detector 4 has been converted into a tongue-palate contact signal by the measurement circuit 12, distinguishes the three states described later, namely front-tongue closure, back-tongue closure, and no closure; and 13 is a phoneme classification circuit that classifies phonemes from the presence or absence signals output by the threshold circuits 6, 7, 9, and 10 and the three-state information from the tongue closure detection circuit 11.

The use of the pronunciation feature extraction device configured as above is described concretely below with reference to FIG. 3.

As shown in FIG. 3, an acceleration sensor 1' serves as the vocal fold vibration detector 1. It is attached with medical double-sided tape over the vocal folds at the larynx and detects vocal fold vibration. The detected vibration is output to the threshold circuit 6, which outputs a present (+) signal to the phoneme classification circuit 13 if the vibration value is at or above a specific value, and an absent (−) signal if it is at or below a certain value.

A hot-wire flowmeter sensor 2' serves as the nasal airflow detector 2. It is attached to the tip of a movable arm fixed to the pivot of a headband and positioned in front of the nasal cavity, where it detects nasal airflow. The detected nasal airflow is output to the threshold circuit 7, which outputs a present (+) signal to the phoneme classification circuit 13 if the nasal airflow value is at or above a specific value, and an absent (−) signal if it is at or below a certain value.

A hot-wire flowmeter sensor 3' serves as the oral airflow detector 3. It is fixed on a desk or similar support in front of the oral cavity, where it detects oral airflow. The detected oral airflow is output to the differentiating circuit 8, which obtains its rate of change and outputs that rate to the threshold circuit 9. The threshold circuit 9 outputs a present (+) signal to the phoneme classification circuit 13 if the rate of change is at or above a specific value, and an absent (−) signal if it is at or below a certain value. The oral airflow detected by the hot-wire flowmeter sensor 3' is also output to the threshold circuit 10, which outputs a present (+) signal to the phoneme classification circuit 13 if the oral airflow value is at or above a specific value, and an absent (−) signal if it is at or below a certain value.

A contact sensor 4' such as the one shown in FIG. 4 serves as the palate contact detector 4. The contact sensor 4' has a large number of electrodes 4'a on the part that touches the tongue, is fitted to the upper palate inside the oral cavity by retaining parts 4'b, and detects the state of contact with the tongue through the electrodes 4'a. The detected contact state between the electrodes 4'a and the tongue passes through the measurement circuit 12 and the tongue closure detection circuit 11 in turn. When the contact state matches the pattern in FIG. 5(a), front-tongue closure information is output to the phoneme classification circuit 13; when it matches the pattern in FIG. 5(b), back-tongue closure information is output; and when there is no contact with the tongue, no-closure information is output.

Finally, the phoneme classification circuit 13 can determine the speech sound from an internal memory table such as the one shown in the table below, based on the information input from the threshold circuits 6, 7, 9, and 10 and the tongue closure detection circuit 11.
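
The signal path described above can be sketched in software. The following is only an illustrative Python sketch, not the circuitry of the patent: the names `Closure`, `Features`, `threshold`, `differentiate`, `detect_closure`, and `extract` are hypothetical, the threshold levels are placeholders, a reading exactly equal to the level is treated as present (+) by assumption, and the closure rule (any contact in an assumed front or back electrode zone) stands in for the FIG. 5 patterns, which are not reproduced in this text.

```python
from dataclasses import dataclass
from enum import Enum


class Closure(Enum):
    NONE = "no closure"
    FRONT = "front-tongue closure"
    BACK = "back-tongue closure"


@dataclass(frozen=True)
class Features:
    voicing: bool         # output of threshold circuit 6
    nasal_flow: bool      # output of threshold circuit 7
    oral_flow_rate: bool  # output of threshold circuit 9
    oral_flow: bool       # output of threshold circuit 10
    closure: Closure      # output of tongue closure detection circuit 11


def threshold(value: float, level: float) -> bool:
    """Threshold circuit: True means present (+), False means absent (-).

    Assumption: a value exactly equal to the level counts as present.
    """
    return value >= level


def differentiate(samples: list[float], dt: float) -> list[float]:
    """Differentiating circuit 8 as a backward difference; the first output is 0."""
    return [0.0] + [(b - a) / dt for a, b in zip(samples, samples[1:])]


def detect_closure(front_contacts: int, back_contacts: int) -> Closure:
    """Hypothetical stand-in for circuit 11: counts of touched electrodes per zone."""
    if front_contacts > 0:
        return Closure.FRONT
    if back_contacts > 0:
        return Closure.BACK
    return Closure.NONE


def extract(vibration: float, nasal: float, oral: float, oral_rate: float,
            front_contacts: int, back_contacts: int,
            levels: dict[str, float]) -> Features:
    """Combine the four detector readings into the inputs of circuit 13."""
    return Features(
        voicing=threshold(vibration, levels["voicing"]),
        nasal_flow=threshold(nasal, levels["nasal_flow"]),
        oral_flow_rate=threshold(oral_rate, levels["oral_flow_rate"]),
        oral_flow=threshold(oral, levels["oral_flow"]),
        closure=detect_closure(front_contacts, back_contacts),
    )
```

Keeping each circuit as a separate function mirrors the block numbering of FIG. 2, so any one stage can be swapped out (for example, a smoothed derivative in place of the backward difference) without touching the others.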

[Table]

For example, when the utterance "hana", whose speech waveform is shown in FIG. 6(a), is spoken, the acceleration sensor 1' outputs a waveform like that of FIG. 6(b) to the threshold circuit 6. Judging against its specific threshold, the threshold circuit 6 outputs an absent (−) signal for the "h" segment and a present (+) signal for the "n" segment to the phoneme classification circuit 13.

The hot-wire flowmeter sensor 2' outputs a waveform like that of FIG. 6(c) to the threshold circuit 7. Judging against its specific threshold, the threshold circuit 7 outputs an absent (−) signal for the "h" segment and a present (+) signal for the "n" segment to the phoneme classification circuit 13.

The hot-wire flowmeter sensor 3' outputs a waveform like that of FIG. 6(d) to the differentiating circuit 8 and the threshold circuit 10. The threshold circuit 9 judges the derivative from the differentiating circuit 8 against its specific threshold and outputs an absent (−) signal for both the "h" and "n" segments to the phoneme classification circuit 13. The threshold circuit 10, judging against its own threshold, outputs a present (+) signal for the "h" segment and an absent (−) signal for the "n" segment to the phoneme classification circuit 13.

Meanwhile, the contact sensor 4' detects the state of contact between the electrodes 4'a and the tongue and outputs it through the measurement circuit 12 to the tongue closure detection circuit 11. From the contact pattern, the tongue closure detection circuit 11 outputs "no closure" information for the "h" segment and "front-tongue closure" information for the "n" segment to the phoneme classification circuit 13.

Based on these inputs, the phoneme classification circuit 13 can recognize "h" and "n" from the internal memory table shown in the table.

As described above, the movement of each vocal organ is detected by the vocal fold vibration detector 1, the nasal airflow detector 2, the oral airflow detector 3, and the palate contact detector 4, and the processing device 5 selects and determines a specific phoneme from a table stored in advance on the basis of the information each detector has detected. Speech sounds that were difficult to recognize in the past can thereby be recognized accurately.

As described above, the present invention can identify each plosive and nasal phoneme more accurately than before on the basis of the vocal fold vibration information detected by the vocal fold vibration detector, the airflow information in front of the nasal cavity detected by the nasal airflow detector, the oral airflow information detected by the oral airflow detector, and the tongue-palate contact information detected by the palate contact detector, and its practical effect is considerable.
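
The "hana" walk-through can be written as a table lookup. The sketch below is a minimal Python illustration under stated assumptions: `PARTIAL_TABLE` and `lookup` are hypothetical names, and the table holds only the two rows that the text spells out for "h" and "n". The full memory table of the phoneme classification circuit 13 is not reproduced in this text, so no other rows are filled in.

```python
# Key order (an assumption of this sketch):
# (voicing, nasal airflow, oral airflow rate of change, oral airflow, tongue closure)
# True stands for a present (+) signal and False for an absent (-) signal.
PARTIAL_TABLE = {
    (False, False, False, True, "none"): "h",
    (True, True, False, False, "front"): "n",
}


def lookup(voicing: bool, nasal_flow: bool, oral_flow_rate: bool,
           oral_flow: bool, closure: str):
    """Return the phoneme for a feature pattern, or None if the pattern
    is not among the rows given in the worked example."""
    return PARTIAL_TABLE.get(
        (voicing, nasal_flow, oral_flow_rate, oral_flow, closure)
    )


h_segment = lookup(False, False, False, True, "none")
n_segment = lookup(True, True, False, False, "front")
```

Returning `None` for unlisted patterns keeps the sketch honest about the rows it does not know, rather than guessing a nearest match.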

[Brief description of the drawings]

FIG. 1 is a block diagram of a pronunciation feature extraction device according to one embodiment of the present invention; FIG. 2 is a block diagram of the processing in that device; FIG. 3 shows an example of the device in use; FIG. 4 is a plan view of the contact sensor; FIG. 5 shows contact patterns between the tongue and the palate; and FIG. 6 is a waveform diagram for each detector.

1: vocal fold vibration detector; 2: nasal airflow detector; 3: oral airflow detector; 4: palate contact detector; 5: processing device.

Claims (1)

[Claims]

1. A pronunciation feature extraction device comprising: a vocal fold vibration detector attached to the larynx; a nasal airflow detector placed in front of the nasal cavity; an oral airflow detector placed in front of the oral cavity; a palate contact detector that detects contact between the tongue and the palate; and a processing device that extracts the group of plosives p, t, k, b, d, g and h based on the output of the oral airflow detector, extracts the nasals m, n, ŋ based on the output of the nasal airflow detector, separates p, t, k, h from b, d, g based on the output of the vocal fold vibration detector, separates and identifies p, h, t, k, b, d, g, m, n, ŋ based on the output of the palate contact detector, and further separates p from h by the rate of change of the oral airflow derived from the output of the oral airflow detector.
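
The separation steps recited in the claim can be read as a decision cascade. The Python sketch below is one possible reading, not the claimed implementation. The function name `classify_by_claim` is hypothetical. The mapping from closure state to place of articulation (front closure for t, d, n; back closure for k, g, ŋ; no tongue closure for p, b, m, h) is an assumption based on ordinary articulatory phonetics, which agrees with the "h" and "n" cases in the description. Assigning the high rate of change to p rather than h is likewise inferred from the description, where "h" gives an absent (−) rate signal.

```python
def classify_by_claim(voiced: bool, nasal_flow: bool, oral_flow: bool,
                      oral_flow_rate: bool, closure: str):
    """Classify one segment; closure is "none", "front", or "back".

    Returns a phoneme symbol, or None when neither airflow is present.
    """
    if nasal_flow:
        # Nasal airflow selects the nasal group; closure picks the member.
        return {"none": "m", "front": "n", "back": "ŋ"}[closure]
    if not oral_flow:
        return None
    if voiced:
        # Vocal fold vibration separates b, d, g from p, t, k, h.
        return {"none": "b", "front": "d", "back": "g"}[closure]
    if closure == "front":
        return "t"
    if closure == "back":
        return "k"
    # No tongue closure and unvoiced: the rate of change of the oral
    # airflow separates p (assumed abrupt) from h (assumed gradual).
    return "p" if oral_flow_rate else "h"
```

Checking nasal airflow first is a design choice of this sketch; the claim lists the separations without fixing their order.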
JP3242482A 1982-03-03 1982-03-03 Speech feature extractor Granted JPS58150995A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP3242482A JPS58150995A (en) 1982-03-03 1982-03-03 Speech feature extractor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP3242482A JPS58150995A (en) 1982-03-03 1982-03-03 Speech feature extractor

Publications (2)

Publication Number Publication Date
JPS58150995A (en) 1983-09-07
JPH036519B2 (en) 1991-01-30

Family

ID=12358565

Family Applications (1)

Application Number Title Priority Date Filing Date
JP3242482A Granted JPS58150995A (en) 1982-03-03 1982-03-03 Speech feature extractor

Country Status (1)

Country Link
JP (1) JPS58150995A (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS59107399A (en) * 1982-12-13 1984-06-21 リオン株式会社 Measurement of nasalization level
WO2002025635A2 (en) 2000-09-19 2002-03-28 Logometrix Corporation Palatometer and nasometer apparatus
AU2002236483A1 (en) 2000-11-15 2002-05-27 Logometrix Corporation Method for utilizing oral movement and related events
US9816882B2 (en) * 2013-01-29 2017-11-14 Suzhou Institute Of Nano-Tech And Nano-Bionics (Sinano), Chinese Academy Of Sciences Electronic skin, preparation method and use thereof

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS501846A (en) * 1973-05-14 1975-01-09
JPS5648700A (en) * 1979-09-28 1981-05-01 Matsushita Electric Ind Co Ltd Nasal sound detector

Also Published As

Publication number Publication date
JPS58150995A (en) 1983-09-07
