JPS58223188A - Plosive recognition equipment - Google Patents

Plosive recognition equipment

Info

Publication number
JPS58223188A
Authority
JP
Japan
Prior art keywords
plosive
sound
section
information
expiratory flow
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP10541782A
Other languages
Japanese (ja)
Other versions
JPS6331795B2 (en)
Inventor
杉本 豊三
村田 程夫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Institute of Advanced Industrial Science and Technology AIST
Original Assignee
Agency of Industrial Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Agency of Industrial Science and Technology
Priority to JP10541782A
Publication of JPS58223188A
Publication of JPS6331795B2
Granted

Landscapes

  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)

Abstract

(57) [Abstract] This publication contains application data filed before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to a plosive recognition device that recognizes plosive sounds.

In recent years, research on speech recognition devices has become active, and several devices have been put into practical use and are commercially available. Expectations are also rising for speech as the most natural means of communication between humans and computers, which continue to advance rapidly.

However, no device has yet been obtained that correctly recognizes the utterances of arbitrary speakers. Non-stationary consonants are especially difficult, and reliable detection and classification of plosive consonants is also very hard.

Moreover, conventional methods require a large number of arithmetic operations, including multiplication and division, for example to compute feature vectors from the outputs of many band-pass filters or to compute the frequency bias.

An object of the present invention is to provide a plosive recognition device that reliably detects and classifies the plosive consonants of arbitrary speakers. Another object of the present invention is to provide a plosive recognition device whose plosive recognition unit requires no multiplication or division and whose arithmetic processing is simple.

An embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a block diagram of a plosive recognition device according to an embodiment of the present invention. In the figure, 1 is a speech wave detector, for example a low-noise close-talking microphone. 2 is an expiratory flow velocity detector, for example a hot-wire flowmeter sensor, which is placed in front of the mouth during use to detect the expiratory flow velocity. 3 is a laryngeal vibration detector, for example a vibration pickup, which is attached near the vocal cords on the larynx with medical double-sided tape or the like to detect laryngeal vibration. 4 is a palate contact detector that detects contact between the tongue and the hard palate; an example of its shape is shown in FIG. 2. The palate contact detector 4 carries an array of electrodes 4a and, when fitted to the hard palate inside the mouth, detects the state of contact between the tongue and the hard palate.

FIGS. 3(a) to 3(d) show schematic patterns of contact between the tongue and the hard palate; the hatched areas indicate the regions in contact. FIG. 3(a) is the pattern typically seen when uttering /s, z/ and similar sounds, FIG. 3(b) is the pattern seen when uttering /i, j, f/ and the like, FIG. 3(c) is the pattern seen when uttering /t, d, n/ and the like, and FIG. 3(d) is the pattern typically seen when uttering /r/.

5 is a speech wave intensity detector, for example a detection and smoothing circuit, which extracts the envelope of the speech wave. 6 is a laryngeal vibration intensity detector, for example a detection and smoothing circuit, which extracts the envelope of the laryngeal vibration. 7 is a temporary storage unit for the detected information. It temporarily stores the speech wave intensity information output by the intensity detector 5, the expiratory flow velocity information output by the expiratory flow velocity detector 2, the laryngeal vibration intensity information output by the intensity detector 6, and the palate contact information output by the palate contact detector 4. 8 is a plosive recognition unit that recognizes plosives based on the information in the temporary storage unit 7; it is described in more detail below with reference to FIG. 4.

In FIG. 4, 81 is a sound interval search unit, which searches for intervals containing sound based on the speech wave intensity information (イ) in the temporary storage unit 7. 82 is a plosive test unit which, when a sound interval has been found, tests whether the sound is a plosive based on the expiratory flow velocity information in the temporary storage unit 7. 83 is a contact pattern test unit, which classifies plosives into /t, d/ and /k, p, g, b/ based on the palate contact information (ニ) in the temporary storage unit 7.

84 is a first voicing test unit, which tests /t, d/ based on the laryngeal vibration intensity information (ハ) in the temporary storage unit 7 and distinguishes /t/ from /d/. 85 is a second voicing test unit, which tests /k, p, g, b/ based on the laryngeal vibration intensity information (ハ) and classifies them into the two groups /k, p/ and /g, b/. 86 is a first expiratory flow velocity test unit, which tests /k, p/ based on the expiratory flow velocity information (ロ) and distinguishes /k/ from /p/. 87 is a second expiratory flow velocity test unit, which tests /g, b/ based on the expiratory flow velocity information (ロ) and distinguishes /g/ from /b/.

With the plosive recognition device configured as above, it is possible to recognize whether an uttered phoneme is one of the plosives /t/, /d/, /k/, /p/, /g/, /b/, or whether it is silence or a non-plosive sound.

The operation of the plosive recognition unit 8 is described below, following the flowchart shown in FIG. 5.

(a) First, search for a sound interval. Using the speech wave intensity information in the temporary storage unit 7, an interval is taken to be a sound interval if the speech wave intensity exceeds an experimentally determined threshold and the interval satisfies a condition on its duration, for example 80 msec or more. If these conditions are not met, the input is recognized as silence.
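The interval search in step (a) can be sketched as follows. This is a minimal illustration in Python, not the circuit described in the patent: it assumes the stored intensity is a list of integer samples taken at a fixed frame period, and the names `find_sound_intervals`, `threshold`, and `min_frames` are hypothetical. Only comparisons and subtractions are used, in keeping with the stated aim of avoiding multiplication and division.

```python
from typing import List, Tuple


def find_sound_intervals(
    intensity: List[int], threshold: int, min_frames: int
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices, end exclusive, of every run in which
    the intensity stays above `threshold` for at least `min_frames` frames."""
    intervals = []
    start = None  # index where the current above-threshold run began
    for i, value in enumerate(intensity):
        if value > threshold:
            if start is None:
                start = i
        elif start is not None:
            # The run just ended; keep it only if it lasted long enough.
            if i - start >= min_frames:
                intervals.append((start, i))
            start = None
    # Handle a run that is still open at the end of the buffer.
    if start is not None and len(intensity) - start >= min_frames:
        intervals.append((start, len(intensity)))
    return intervals
```

With a hypothetical 10 msec frame period, the 80 msec example would correspond to `min_frames=8`; an empty result corresponds to the "silence" outcome.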

(Sound interval search unit 81)

(b) Test whether the sound is a plosive. Using the expiratory flow velocity information in the temporary storage unit 7, the sound is recognized as a plosive if, around the start of the sound interval, there is an expiratory flow interval in which the flow velocity satisfies experimentally determined conditions on threshold and duration, and if the rate of change of the flow velocity at the start of that expiratory flow interval exceeds an experimentally determined threshold. Otherwise it is recognized as a non-plosive.
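One way to picture the test in step (b) is the sketch below. It is an assumption-laden illustration rather than the patented circuit: the flow velocity is taken to be a list of integer samples, the rate of change is approximated by the difference between two consecutive samples, and `search_window` (how far around the sound onset to look) as well as all other names and thresholds are hypothetical.

```python
from typing import List, Optional


def find_plosive_airflow_onset(
    flow: List[int],
    sound_start: int,
    flow_threshold: int,
    min_flow_frames: int,
    slope_threshold: int,
    search_window: int,
) -> Optional[int]:
    """Return the frame at which a plosive-like airflow interval starts near
    `sound_start`, or None if no such interval exists (non-plosive)."""
    lo = max(1, sound_start - search_window)
    hi = min(len(flow), sound_start + search_window + 1)
    for start in range(lo, hi):
        # An airflow interval starts where the flow rises above the threshold.
        if flow[start] > flow_threshold and flow[start - 1] <= flow_threshold:
            end = start
            while end < len(flow) and flow[end] > flow_threshold:
                end += 1
            long_enough = end - start >= min_flow_frames
            # First difference stands in for the rate of change at the onset.
            steep = flow[start] - flow[start - 1] > slope_threshold
            if long_enough and steep:
                return start
    return None
```

Returning the onset frame rather than a plain yes/no keeps the start of the expiratory flow interval available for the later timing comparisons in steps (f) and (g).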

(Plosive test unit 82)

(c) Next, classify the plosive into one of two groups, /t, d/ or /k, p, g, b/. The palate contact information in the temporary storage unit 7 is examined just before the start of the sound interval. If a closure contact pattern TDN or R (see FIGS. 3(c) and 3(d)) is present and satisfies an experimentally determined condition on its duration, the plosive is one of /t, d/; if the condition is not satisfied, it is one of /k, p, g, b/.

(Contact pattern test unit 83)

(d) If the plosive is one of /t, d/, distinguish /t/ from /d/. Using the laryngeal vibration intensity information in the temporary storage unit 7, an interval around the start of the sound interval in which the laryngeal vibration intensity satisfies experimentally determined conditions on threshold and duration is found and taken as the voiced interval.

The time difference between the start of the voiced interval and the start of the sound interval is then obtained. If the voiced interval starts earlier than the sound interval by at least an experimentally determined time, for example 20 msec, the sound is judged to be voiced; otherwise it is judged to be voiceless. The voiced plosive /d/ and the voiceless plosive /t/ can thus be distinguished.
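The timing comparison in step (d) reduces to one subtraction and one comparison. The sketch below assumes onset times are already available in milliseconds; the function and parameter names are hypothetical, and the 20 msec default is the example value given in the text, which the patent says is to be determined by experiment.

```python
from typing import Optional


def is_voiced(
    voiced_start_ms: Optional[int], sound_start_ms: int, min_lead_ms: int = 20
) -> bool:
    """Judge a plosive voiced when laryngeal vibration starts at least
    `min_lead_ms` before the sound interval starts."""
    if voiced_start_ms is None:
        # No voiced interval was found near the sound onset.
        return False
    lead_ms = sound_start_ms - voiced_start_ms
    return lead_ms >= min_lead_ms


def classify_t_or_d(voiced_start_ms: Optional[int], sound_start_ms: int) -> str:
    """Step (d): /d/ if voiced, /t/ otherwise."""
    return "d" if is_voiced(voiced_start_ms, sound_start_ms) else "t"
```

For example, vibration starting at 100 msec with sound starting at 130 msec gives a 30 msec lead and is judged /d/, while a 10 msec lead is judged /t/.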

(Voicing test unit 84)

(e) If the plosive is one of /k, p, g, b/, classify it as either voiced /g, b/ or voiceless /k, p/. Voiced and voiceless sounds are distinguished in the same way as in (d) above (voicing test unit 84).

(Voicing test unit 85)

(f) If the plosive is one of /k, p/, distinguish /k/ from /p/. The start of the sound interval is compared with the start of the expiratory flow interval. If the sound interval starts earlier than the expiratory flow interval by at least an experimentally determined time, the plosive is recognized as /k/; otherwise it is recognized as /p/.

(Expiratory flow velocity test unit 86)

(g) If the plosive is one of /g, b/, distinguish /g/ from /b/. The start of the sound interval is compared with the start of the expiratory flow interval. If the sound interval starts earlier than the expiratory flow interval by at least an experimentally determined time, the plosive is recognized as /g/; otherwise it is recognized as /b/.

(Expiratory flow velocity test unit 87)

As described above, in this embodiment a sound interval is found from the speech wave intensity information, the plosive is confirmed from the expiratory flow velocity information, and the plosive is classified into /t, d/ or /k, p, g, b/ from the palate contact information. The voiced plosives /d, g, b/ are then separated from the voiceless plosives /t, k, p/ using the voiced interval found from the laryngeal vibration intensity information. Finally, the labial plosives /p, b/ are separated from the plosives /k, g/ articulated near the back of the tongue, using the time offset (phase difference) between the sound interval and the expiratory flow interval found from the expiratory flow velocity information. In this way it is possible to recognize whether the uttered phoneme is one of the plosives /t/, /d/, /k/, /p/, /g/, /b/, or whether it is silence or a non-plosive sound.
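The whole decision sequence of steps (a) to (g) can be summarized as a small decision tree. The sketch below is illustrative only: it assumes the per-step tests have already been run and their results collected in a hypothetical `Evidence` record, and both timing thresholds are parameters because the text leaves them to be determined by experiment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Evidence:
    sound_found: bool               # step (a): a sound interval exists
    plosive_airflow: bool           # step (b): airflow conditions are met
    closure_pattern_held: bool      # step (c): TDN or R closure, long enough
    voicing_lead_ms: Optional[int]  # sound start minus voiced start; None if unvoiced
    sound_lead_ms: int              # airflow start minus sound start


def classify(e: Evidence, min_voicing_lead_ms: int, min_sound_lead_ms: int) -> str:
    """Return one of 'silence', 'non-plosive', 't', 'd', 'k', 'p', 'g', 'b'."""
    if not e.sound_found:
        return "silence"
    if not e.plosive_airflow:
        return "non-plosive"
    voiced = (
        e.voicing_lead_ms is not None
        and e.voicing_lead_ms >= min_voicing_lead_ms
    )
    if e.closure_pattern_held:
        # Step (d): /t, d/ group, split by voicing.
        return "d" if voiced else "t"
    # Steps (f) and (g): sound leading the airflow indicates /k/ or /g/.
    sound_leads = e.sound_lead_ms >= min_sound_lead_ms
    if voiced:
        return "g" if sound_leads else "b"
    return "k" if sound_leads else "p"
```

Every branch is a comparison on values that were obtained by thresholding or subtraction, which matches the stated goal of a recognition unit that needs no multiplication or division.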

In this embodiment the plosives are first divided into the two groups /t, d/ and /k, p, g, b/ and then classified as voiced or voiceless, but the contact pattern test may instead be performed after the voiced/voiceless classification.

The sound interval search unit 81 and the test units 82 to 87 may also be configured to perform their tests and classifications without checking the duration conditions each time, and to check the duration conditions at the end.

As described above, the present invention temporarily stores speech intensity information, expiratory flow velocity information, laryngeal vibration intensity information, and palate contact information, and recognizes plosives based on this stored information. It can therefore reliably detect and classify the plosive consonants of arbitrary speakers, and it yields a plosive recognition device whose recognition unit needs no multiplication or division and whose arithmetic processing is simple, which is of considerable benefit for speech recognition and related applications.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram showing a plosive recognition device according to an embodiment of the present invention, FIG. 2 is a diagram showing an example of the shape of the palate contact detector, FIGS. 3(a) to 3(d) are diagrams showing examples of contact patterns between the palate contact detector and the tongue, FIG. 4 is a block diagram showing a specific example of the plosive recognition unit, and FIG. 5 is a flowchart explaining the operation of the plosive recognition unit.

1: speech wave detector; 2: expiratory flow velocity detector; 3: laryngeal vibration detector; 4: palate contact detector; 5: intensity detector; 6: intensity detector; 7: temporary storage unit; 8: plosive recognition unit.

Patent applicant: 石坂誠一 (Seiichi Ishizaka), Director-General of the Agency of Industrial Science and Technology

[Drawings: FIG. 2; FIG. 3(a) to (d); FIG. 5]

Claims (1)

[Claims]

A plosive recognition device comprising: means for detecting a speech wave; means for obtaining speech intensity information from the detected speech wave; means for detecting expiratory flow velocity; means for detecting laryngeal vibration; means for obtaining laryngeal vibration intensity information from the detected laryngeal vibration; means for detecting contact information between the tongue and the hard palate; a temporary storage unit that temporarily stores the speech intensity information, expiratory flow velocity information, laryngeal vibration intensity information, and palate contact information; and a plosive recognition unit that recognizes plosives based on the information in the temporary storage unit.
JP10541782A 1982-06-21 1982-06-21 Plosive recognition equipment Granted JPS58223188A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP10541782A JPS58223188A (en) 1982-06-21 1982-06-21 Plosive recognition equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP10541782A JPS58223188A (en) 1982-06-21 1982-06-21 Plosive recognition equipment

Publications (2)

Publication Number Publication Date
JPS58223188A true JPS58223188A (en) 1983-12-24
JPS6331795B2 JPS6331795B2 (en) 1988-06-27

Family

ID=14407024

Family Applications (1)

Application Number Title Priority Date Filing Date
JP10541782A Granted JPS58223188A (en) 1982-06-21 1982-06-21 Plosive recognition equipment

Country Status (1)

Country Link
JP (1) JPS58223188A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS501846A (en) * 1973-05-14 1975-01-09

Also Published As

Publication number Publication date
JPS6331795B2 (en) 1988-06-27

Similar Documents

Publication Publication Date Title
US8566088B2 (en) System and method for automatic speech to text conversion
Yang et al. BaNa: A noise resilient fundamental frequency detection algorithm for speech and music
JPS5972496A (en) Single sound identifier
JPS6247320B2 (en)
CN107610691B (en) English vowel sounding error correction method and device
JPS60200300A (en) Voice head/end detector
Glass et al. Detection and recognition of nasal consonants in American English
JPS58223188A (en) Plosive recognition equipment
Niederjohn et al. Computer recognition of the continuant phonemes in connected English speech
JPS58224393A (en) Fricative recognition equipment
JPS59121099A (en) Voice section detector
JPS58223191A (en) Nasal recognition equipment
JPS58150997A (en) Speech feature extractor
JP6730636B2 (en) Information processing apparatus, control program, and control method
Dikshit et al. Electroglottograph as an additional source of information in isolated word recognition
JPH036519B2 (en)
JPS63217399A (en) Voice section detecting system
Perdigão et al. Pathological Voice Detection using Turbulent Speech Segments.
Das Speaker Verification Using Simple Temporal Features and Pitch Synchronous Cepstral Coefficients
JPS59149399A (en) Consonant sorter
JP2744622B2 (en) Plosive consonant identification method
Zewoudie Discriminative features for GMM and i-vector based speaker diarization
JPH0682275B2 (en) Voice recognizer
JPH025099A (en) Voiced, voiceless, and soundless state display device
Ruinskiy et al. An algorithm for accurate breath detection in speech and song signals