JP2985976B2

JP2985976B2 - Syllable recognition device with tongue movement detection

Info

Publication number: JP2985976B2
Application number: JP3038845A
Authority: JP
Inventors: 明平岩; 勝憲下原; 匡内山; 一彦篠沢
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1991-02-12
Filing date: 1991-02-12
Publication date: 1999-12-06
Anticipated expiration: 2014-12-06
Also published as: JPH04257900A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a tongue movement detection type syllable recognition device that recognizes syllables by detecting the tongue movement accompanying a user's (speaker's) utterance or utterance movement (the mouth movement of speaking).

[0002]

[Prior Art] Syllable recognition devices that recognize syllables uttered by a user have been under development for the purpose of inputting information by voice to computers and similar equipment.

[0003] The following devices, for example, have been proposed as syllable recognition devices of this type.

[0004] (1) A device that performs syllable recognition by analyzing the speech waveform detected by a microphone using DP matching (Sakoe et al., Journal of the Acoustical Society of Japan, 2, 9, pp. 43-49, 1978).

[0005] (2) A device that performs syllable recognition by analyzing the speech waveform detected by a microphone using a hidden Markov model (Rabiner, L. R. et al., Bell Syst. Tech. J., 62, 4, pp. 1075-1105, 1983).

[0006] (3) A device that performs syllable recognition by analyzing the speech waveform detected by a microphone using a neural network (Kawahara et al., IEICE, Speech Technical Report SP88-31, 1988).

These syllable recognition devices, however, have two drawbacks. The syllable recognition rate drops when noise is mixed into the speech waveform detected by the microphone, so a quiet environment is required. At the same time, recognition is impossible unless the user speaks loudly enough for the microphone to detect the voice.

[0007] To overcome these drawbacks, syllable recognition devices that make complementary use of utterance-related information other than the voice have been considered. One such proposal is a syllable recognition device that performs lip reading by computer (Petajan, E., IEEE CVPR'85, pp. 40-47, 1985).

[0008]

[Problems to be Solved by the Invention] However, the syllable recognition device described above, which performs lip reading by computer, does so by detecting the movement of the user's lips with a camera. The camera must therefore be placed directly in front of the user's face, which restricts the places where the device can be used.

[0009] An object of the present invention is to provide a tongue movement detection type syllable recognition device that is not restricted in its place of use and that can be used even in a noisy environment.

[0010]

[Means for Solving the Problems] The tongue movement detection type syllable recognition device of the present invention comprises a tongue movement detection transmission unit installed inside the oral cavity and a syllable recognition unit installed outside the oral cavity.

The tongue movement detection transmission unit includes a light-emitting element that emits near-infrared light, a light-receiving element that receives the reflected near-infrared light and converts it into a light-reception signal, a transmitter that converts the light-reception signal from the light-receiving element into a transmission signal and transmits it to the syllable recognition unit, and a battery that supplies power to the transmitter, the light-emitting element, and the light-receiving element.

The syllable recognition unit includes an antenna that receives the transmission signal, a receiver that demodulates the light-reception signal from the transmission signal received by the antenna, a neural network that outputs an output value for each output unit according to the light-reception signal demodulated by the receiver, and a comparator that compares the output values of the output units output from the neural network and outputs the syllable corresponding to the output unit whose output value is largest.

[0011]

[Operation] In the tongue movement detection type syllable recognition device of the present invention, the light-receiving element detects the reflection of the near-infrared light that the light-emitting element directs at the tongue surface. The resulting light-reception signal reflects the tongue movement that accompanies the user's utterance or utterance movement, and the neural network and the comparator use it to recognize syllables. Because recognition does not require detecting the voice uttered by the user, syllables can be recognized accurately both in noisy environments and in environments where absolute silence is required. In addition, the light-reception signal is converted into a transmission signal and sent from the tongue movement detection transmission unit to the syllable recognition unit, so the syllable recognition unit may be placed anywhere the transmission signal can be received. The place of use is therefore not restricted.

[0012] The syllable recognition unit may further include a microphone that detects the voice uttered during the learning operation and converts it into a detected speech signal, an amplifier that amplifies the detected speech signal from the microphone, and a speech recognition circuit that recognizes a syllable from the amplified detected speech signal and outputs a teacher signal pattern indicating that syllable. The neural network may further have a learning function that learns the relationship between the light-reception signal and syllables from the teacher signal pattern input from the speech recognition circuit and the light-reception signal demodulated by the receiver during the learning operation. With this configuration, the learning operation can be carried out for each user before the recognition operation is performed.

[0013]

[Embodiments] Embodiments of the present invention are described next with reference to the drawings.

[0014] FIGS. 1(A) and 1(B) are block diagrams showing a first embodiment of the tongue movement detection type syllable recognition device of the present invention. FIG. 2 shows the external appearance of the housing 15 of the tongue movement detection transmission unit 10 of FIG. 1(A). FIG. 3 shows how the tongue movement detection transmission unit 10 of FIG. 1(A) is installed in the oral cavity. FIG. 4 shows the configuration of the neural network 26 of FIG. 1(B).

[0015] The tongue movement detection type syllable recognition device of this embodiment comprises a tongue movement detection transmission unit 10 installed inside the oral cavity and a syllable recognition unit 20 installed outside the oral cavity.

[0016] As shown in FIG. 1(A), the tongue movement detection transmission unit 10 comprises a plurality of light-emitting elements 11 that emit near-infrared light L, a plurality of light-receiving elements 12 that receive the reflected near-infrared light L and convert it into light-reception signals α, β, γ, a transmitter 13 that converts the light-reception signals α, β, γ from the light-receiving elements 12 into a transmission signal W and transmits it to the syllable recognition unit 20, and a battery 14 that supplies power to the transmitter 13, each light-emitting element 11, and each light-receiving element 12.

[0017] As shown in FIG. 1(B), the syllable recognition unit 20 comprises the following:

an antenna 21 that receives the transmission signal W;

a receiver 22 that demodulates the light-reception signals α, β, γ from the transmission signal W received by the antenna 21;

a microphone 23 that detects the voice uttered during the learning operation and converts it into a detected speech signal S;

an amplifier 24 that amplifies the detected speech signal S from the microphone 23;

a speech recognition circuit 25 that recognizes a syllable from the detected speech signal S amplified by the amplifier 24 and outputs a teacher signal pattern indicating that syllable;

a neural network 26 that has a learning function for learning the relationship between the light-reception signals α, β, γ and syllables from the teacher signal pattern input from the speech recognition circuit 25 and the light-reception signals α, β, γ demodulated by the receiver 22 during the learning operation, and that outputs the output values y₁ to y_N of the output units 53₁ to 53_N (see FIG. 4) according to the light-reception signals α, β, γ demodulated by the receiver 22; and

a comparator 27 that compares the output values y₁ to y_N of the output units 53₁ to 53_N output from the neural network 26 and outputs the syllable corresponding to the output unit whose output value is largest.
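The behavior of the comparator 27 can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function name `comparator`, the list-based representation of the output values, and the example values are hypothetical and do not appear in the specification.

```python
def comparator(outputs, syllables):
    """Return the syllable assigned to the output unit with the largest value."""
    if len(outputs) != len(syllables):
        raise ValueError("one syllable label is needed per output unit")
    # Index of the largest output value y_k (the first one wins on a tie).
    best = max(range(len(outputs)), key=lambda k: outputs[k])
    return syllables[best]


# Hypothetical output values y_1..y_3 for units assigned to "a", "i", "u".
result = comparator([0.1, 0.2, 0.9], ["a", "i", "u"])
print(result)
```

The tie-breaking rule (first maximum wins) is a choice made for the sketch; the specification does not say how ties are handled.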

[0018] The tongue movement detection transmission unit 10 is housed in the housing 15 shown in FIG. 2. As shown in FIG. 3, it is installed in the oral cavity by inserting the two Y-shaped bridges 16, which are provided near the two ends of the housing 15 as drawn in FIG. 2, between teeth. The housing 15 is placed against the upper jaw with the light-emitting elements 11 and the light-receiving elements 12 facing the tongue surface 19, so that the light-emitting elements 11 direct the near-infrared light L at the tongue surface 19 and the light-receiving elements 12 receive the light reflected from the tongue surface 19.

[0019] As shown in FIG. 4, the neural network 26 comprises the following:

an input layer 31 consisting of a plurality of input units 51₁ to 51₁₅, to which the sample values d₁ to d₅, e₁ to e₅, f₁ to f₅ of the light-reception signals α, β, γ (see FIG. 7) are input as a numerical pattern;

an intermediate layer 32 consisting of a plurality of intermediate units 52₁ to 52_M, connected to the input layer 31 through first links 36 that carry weights 39;

an output layer 33 consisting of a plurality of output units 53₁ to 53_N, connected to the intermediate layer 32 through second links 37 that carry weights 39; and

a weight control unit 34 that updates the weights 39 of the first and second links 36, 37 using the numerical pattern formed by the sample values d₁ to d₅, e₁ to e₅, f₁ to f₅ of the light-reception signals α, β, γ, the output pattern formed by the output values y₁ to y_N of the output units 53₁ to 53_N of the output layer 33, and the teacher signal pattern input from the speech recognition circuit 25.

Each of the output units 53₁ to 53_N corresponds to one monosyllable. For example, output unit 53₁ corresponds to the monosyllable "あ" (a), output unit 53₂ to "い" (i), and output unit 53₃ to "う" (u).
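The three-layer structure described above (15 input units, M intermediate units, N output units) can be sketched in Python as follows. The specification does not state the unit activation function, the use of bias terms, or the values of M and N, so the sigmoid activation, the bias entries, the weight initialization range, and the values `M = 8` and `N = 5` are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(n_in, n_out, rng):
    # One weight row per unit; the last entry of each row is a bias (assumption).
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_forward(weights, inputs):
    # Weighted sum over the links into each unit, followed by the activation.
    return [sigmoid(sum(w * x for w, x in zip(row[:-1], inputs)) + row[-1])
            for row in weights]


def forward(net, pattern):
    hidden = layer_forward(net["first_links"], pattern)   # input layer 31 -> layer 32
    return layer_forward(net["second_links"], hidden)     # layer 32 -> output layer 33


rng = random.Random(0)
M, N = 8, 5  # hypothetical sizes
net = {"first_links": make_layer(15, M, rng), "second_links": make_layer(M, N, rng)}

# Hypothetical numerical pattern d1..d5, e1..e5, f1..f5.
pattern = [0.1 * i for i in range(15)]
y = forward(net, pattern)
print(len(y))
```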

[0020] The operation of the tongue movement detection type syllable recognition device of this embodiment is described next, divided into the learning operation and the recognition operation.

[0021] The learning operation is described first, with reference to the flowchart of FIG. 5, the signal flow diagram of FIG. 6, and the waveform diagrams of FIGS. 7 and 8. For simplicity, the description assumes three light-emitting elements 11 and three light-receiving elements 12.

[0022] The learning operation is carried out in an environment quiet enough for the speech recognition circuit 25 to recognize syllables correctly. In this environment, the user 40 utters a voice consisting of a single monosyllable (step 110). The microphone 23 detects the uttered voice and converts it into the detected speech signal S. The detected speech signal S is amplified by the amplifier 24 and then input to the speech recognition circuit 25. The speech recognition circuit 25 performs speech recognition on the detected speech signal S and thereby identifies which monosyllable was uttered (step 111).

Meanwhile, the tongue movement that occurs while the monosyllable is uttered is detected as follows. The light-emitting elements 11 direct the near-infrared light L at the tongue surface 19, and the light-receiving elements 12 receive the light reflected from the tongue surface 19 and convert it into light-reception signals α, β, γ such as those shown in FIGS. 7(A) to 7(C). The transmitter 13 converts the light-reception signals α, β, γ into the transmission signal W and transmits it to the syllable recognition unit 20 (step 112). The power needed to operate the light-emitting elements 11, the light-receiving elements 12, and the transmitter 13 is supplied to each of them by the battery 14. The transmission signal W is received by the antenna 21 of the syllable recognition unit 20 and demodulated by the receiver 22, which restores the light-reception signals α, β, γ. These are then input to the neural network 26.

[0023] The neural network 26 extracts the relevant portions of the light-reception signals α, β, γ as follows (step 113).

[0024] The detected speech signal S input from the amplifier 24, shown in FIG. 8(A), is full-wave rectified to obtain the envelope waveform E shown in FIG. 8(B). The times t_L1 and t_L2 at which the amplitude of the envelope waveform E equals a predetermined threshold θ₁ are then found, and a gate signal G whose amplitude is "1" from time t_L1 to time t_L2 is generated, as shown in FIG. 8(C). The gate signal G is applied as an observation window to the light-reception signal α input from the receiver 22, as shown in FIG. 8(D), which extracts the relevant portion of α. The other two light-reception signals β and γ are extracted in the same way.
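The gating procedure of the learning operation can be sketched in Python as below. This is a minimal sketch on sampled data, not the circuit of the embodiment: the function names, the moving-average smoothing used to turn the rectified signal into an envelope, and the example values are assumptions. Here t_L1 and t_L2 are taken as the first and last sample indices at which the envelope is at or above θ₁.

```python
def envelope(speech, width=3):
    """Full-wave rectify the signal, then smooth it with a moving average."""
    rectified = [abs(s) for s in speech]
    smoothed = []
    for i in range(len(rectified)):
        lo = max(0, i - width + 1)
        smoothed.append(sum(rectified[lo:i + 1]) / (i + 1 - lo))
    return smoothed


def gate_from_envelope(env, theta1):
    """Gate signal G: 1 from t_L1 to t_L2 (inclusive), 0 elsewhere."""
    above = [i for i, e in enumerate(env) if e >= theta1]
    if not above:
        return [0] * len(env)
    t_l1, t_l2 = above[0], above[-1]
    return [1 if t_l1 <= i <= t_l2 else 0 for i in range(len(env))]


def extract(received, gate):
    """Keep only the samples of a light-reception signal inside the window."""
    return [r for r, g in zip(received, gate) if g == 1]


# Hypothetical sampled speech signal S and light-reception signal alpha.
S = [0, 0, 0.8, -0.9, 0.7, -0.8, 0, 0, 0, 0]
alpha = list(range(1, 11))
G = gate_from_envelope(envelope(S, width=1), theta1=0.5)
print(G)
print(extract(alpha, G))
```

With `width=1` the envelope is simply the rectified signal; a larger width bridges the zero crossings of a real speech waveform at the cost of slightly widening the window.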

[0025] The light-reception signals α, β, γ extracted in this way are sampled at predetermined time intervals, as shown in FIGS. 7(A) to 7(C), and the sample values d₁ to d₅, e₁ to e₅, f₁ to f₅ are input as a numerical pattern to the corresponding input units 51₁ to 51₁₅ of the input layer 31 of the neural network 26 (step 114). Meanwhile, the monosyllable recognized by the speech recognition circuit 25 is input to the weight control unit 34 of the neural network 26 as a teacher signal pattern for learning. For example, when the monosyllable is "う" (u), the teacher signal pattern "00100...0" is input to the weight control unit 34 of the neural network 26.
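The construction of the 15-value numerical pattern and of the teacher signal pattern can be sketched in Python as follows. The function names, the evenly spaced choice of sample positions, and the five-syllable label list are assumptions made for illustration; the specification only states that sampling occurs at predetermined time intervals.

```python
def sample_evenly(segment, count=5):
    """Pick `count` evenly spaced samples from an extracted signal segment."""
    if len(segment) < count:
        raise ValueError("segment is shorter than the number of samples")
    step = (len(segment) - 1) / (count - 1)
    return [segment[round(i * step)] for i in range(count)]


def input_pattern(alpha, beta, gamma):
    """Concatenate d1..d5, e1..e5, f1..f5 into one 15-value pattern."""
    return sample_evenly(alpha) + sample_evenly(beta) + sample_evenly(gamma)


def teacher_pattern(syllable, syllables):
    """Teacher signal: 1 at the position of the recognized syllable, 0 elsewhere."""
    return [1 if s == syllable else 0 for s in syllables]


# Hypothetical extracted segments of nine samples each.
alpha = list(range(9))
beta = [10 + v for v in range(9)]
gamma = [20 + v for v in range(9)]
pattern = input_pattern(alpha, beta, gamma)
teacher = teacher_pattern("u", ["a", "i", "u", "e", "o"])
print(pattern)
print(teacher)
```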

[0026] The neural network 26 repeatedly learns the relationship between the numerical pattern formed by the sample values d₁ to d₅, e₁ to e₅, f₁ to f₅ and the teacher signal pattern of the monosyllable recognized by the speech recognition circuit 25, using the well-known error back-propagation method (D. E. Rumelhart et al., Parallel Distributed Processing, MIT Press, 1986) (step 115). For the teacher signal pattern "00100...0", for example, the weight control unit 34 keeps updating the weights 39 of the first and second links 36, 37 until, among the output values y₁ to y_N of the output units 53₁ to 53_N of the neural network 26, the output value y₃ of the output unit 53₃ corresponding to the monosyllable "う" (u) is the largest and the comparator 27 outputs the monosyllable "う" (step 116).
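The repeated weight update can be sketched with a small, self-contained Python example of error back-propagation. It is a sketch under assumptions, not the procedure of the embodiment: the sigmoid units, bias terms, squared-error deltas, learning rate, layer sizes, and synthetic patterns are all hypothetical. It also cycles through all training patterns in each pass, whereas the text above describes training one monosyllable at a time.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init(n_in, n_hid, n_out, rng):
    # Each row holds the weights into one unit; the last entry is a bias (assumption).
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hid)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hid + 1)] for _ in range(n_out)]
    return w1, w2


def forward(w1, w2, x):
    h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w1]
    y = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0]))) for row in w2]
    return h, y


def train_step(w1, w2, x, target, rate=0.5):
    h, y = forward(w1, w2, x)
    # Output deltas for squared error with sigmoid units.
    d_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, y)]
    # Hidden deltas use the second-link weights before they are updated.
    d_hid = [hj * (1.0 - hj) * sum(d_out[k] * w2[k][j] for k in range(len(w2)))
             for j, hj in enumerate(h)]
    for k, row in enumerate(w2):
        for j, v in enumerate(h + [1.0]):
            row[j] += rate * d_out[k] * v
    for j, row in enumerate(w1):
        for i, v in enumerate(x + [1.0]):
            row[i] += rate * d_hid[j] * v


def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])


def train(w1, w2, examples, max_epochs=5000):
    """Repeat updates until the largest output matches every teacher pattern."""
    for epoch in range(1, max_epochs + 1):
        for x, target in examples:
            train_step(w1, w2, x, target)
        if all(argmax(forward(w1, w2, x)[1]) == argmax(t) for x, t in examples):
            return epoch
    return None


rng = random.Random(1)
w1, w2 = init(15, 6, 3, rng)
examples = []
for k in range(3):
    # Synthetic 15-value pattern and one-hot teacher pattern for syllable k.
    x = [1.0 if 5 * k <= i < 5 * k + 5 else 0.0 for i in range(15)]
    t = [1 if j == k else 0 for j in range(3)]
    examples.append((x, t))
epochs = train(w1, w2, examples)
print("passes needed:", epochs)
```

The stopping rule mirrors the text: training ends once the output unit assigned to each taught syllable has the largest output value, rather than when the error falls below a numeric tolerance.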

[0027] When learning of one monosyllable is complete, the other monosyllables are learned in the same way. The learning operation ends when all monosyllables have been learned (step 117).

[0028] The recognition operation of the tongue movement detection type syllable recognition device of this embodiment is described next, with reference to the flowchart of FIG. 9, the signal flow diagram of FIG. 10, and the waveform diagram of FIG. 11.

[0029] In a noisy environment, the user 40 does not necessarily have to voice the sequence of monosyllables aloud. Performing only the utterance movement (the mouth movement of speaking) is sufficient (step 210). The light-emitting elements 11 direct the near-infrared light L at the tongue surface 19, and the light-receiving elements 12 receive the light reflected from the tongue surface 19. In this way the tongue movement accompanying the utterance or utterance movement of the user 40 is detected and converted into the light-reception signals α, β, γ. The transmitter 13 converts the light-reception signals α, β, γ into the transmission signal W and transmits it to the syllable recognition unit 20 (step 211). The transmission signal W is received by the antenna 21 of the syllable recognition unit 20 and demodulated by the receiver 22, which restores the light-reception signals α, β, γ. These are then input to the neural network 26.

During the recognition operation the user 40 does not necessarily speak aloud, so the neural network 26 cannot use the detected speech signal S to extract the light-reception signals α, β, γ as it does during the learning operation. The embodiment instead relies on the observation that the noise superimposed on the light-reception signals α, β, γ in a noisy environment differs little from that in a quiet environment. As shown in FIG. 11, an observation window of predetermined time width T_W is set starting at the time t_K1 at which the amplitude of the light-reception signals α, β, γ first crosses a predetermined threshold θ₂, and this observation window is applied to the light-reception signals α, β, γ to extract them (step 212).

The windowed light-reception signals α, β, γ are sampled at predetermined time intervals, and the sample values d₁ to d₅, e₁ to e₅, f₁ to f₅ are input as a numerical pattern to the input layer 31 of the neural network 26 (step 213). The output values y₁ to y_N of the output units 53₁ to 53_N of the output layer 33 of the neural network 26 are input to the comparator 27, which compares them and outputs, one after another, the monosyllable corresponding to the output unit whose output value is largest (step 214). These operations are repeated until the entire recognition operation is finished (step 215).
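The window placement used during recognition can be sketched in Python as follows. The text does not say whether t_K1 is determined per signal or jointly, so this sketch assumes a single window that starts at the earliest threshold crossing among the three signals. The function names and sample values are hypothetical.

```python
def first_crossing(signals, theta2):
    """Index t_K1 of the first sample at which any signal reaches theta2."""
    length = min(len(s) for s in signals)
    for t in range(length):
        if any(abs(s[t]) >= theta2 for s in signals):
            return t
    return None


def apply_window(signals, theta2, width):
    """Cut a window of `width` samples (T_W) from every signal, starting at t_K1."""
    start = first_crossing(signals, theta2)
    if start is None:
        return None  # no tongue movement detected
    return [s[start:start + width] for s in signals]


# Hypothetical sampled light-reception signals.
alpha = [0, 0, 0.6, 0.7, 0.2, 0]
beta = [0, 0.1, 0.1, 0.8, 0.3, 0]
gamma = [0, 0, 0, 0, 0, 0]
windows = apply_window([alpha, beta, gamma], theta2=0.5, width=3)
print(windows)
```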

[0030] The tongue movement detection type syllable recognition device of this embodiment can therefore perform the recognition operation without using the voice uttered by the user 40, so it recognizes syllables accurately even in a noisy environment. Furthermore, because the tongue movement detection transmission unit 10 installed in the oral cavity of the user 40 converts the light-reception signals α, β, γ into the transmission signal W and transmits it to the syllable recognition unit 20, the location of the syllable recognition unit 20 is not restricted.

[0031] The extraction of the light-reception signals α, β, γ in the neural network 26 is not limited to the method shown in FIG. 11 and may, for example, be performed as shown in FIG. 12. In this method the light-reception signals α, β, γ are sampled at predetermined time intervals and stored in a memory. The sample values of α, β, γ covering a predetermined time width starting at an arbitrary time t₀ are then read from the memory, which applies a first observation window to the signals. Next, the sample values covering the same time width starting at time t₀ + Δt are read from the memory, which applies a second observation window. Then the sample values covering the same time width starting at time t₀ + 2·Δt are read from the memory, which applies a third observation window. The light-reception signals α, β, γ may be extracted by repeating this operation a predetermined number of times. In this case, however, some windows have no corresponding monosyllable, such as the light-reception signals α, β, γ under the second observation window in FIG. 12. To prevent the neural network 26 from malfunctioning, it is therefore advisable to add an output unit indicating "no syllable" to the output layer 33 of the neural network 26.
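The stored-sample method with shifted observation windows can be sketched in Python as below. The function name, the use of sample indices for t₀ and Δt, and the decision to stop when the memory runs out of samples are assumptions made for illustration.

```python
def sliding_windows(stored, t0, delta, width, count):
    """Read up to `count` windows of `width` samples from memory,
    starting at t0, t0 + delta, t0 + 2*delta, and so on."""
    windows = []
    for k in range(count):
        start = t0 + k * delta
        window = stored[start:start + width]
        if len(window) < width:
            break  # not enough stored samples for a full window
        windows.append(window)
    return windows


# Hypothetical stored samples of one light-reception signal.
stored = list(range(10))
windows = sliding_windows(stored, t0=1, delta=2, width=4, count=5)
print(windows)

# Output labels extended with a "no syllable" entry, as the text recommends.
labels = ["a", "i", "u", "(no syllable)"]
```

Each window would be sampled and passed to the neural network in turn, with windows that fall between syllables expected to activate the "no syllable" output unit.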

[0032] The tongue movement detection type syllable recognition device shown in FIG. 1 performs the recognition operation using only the light-reception signals α, β, γ. When the noise is not very loud and the microphone 23 can detect the voice of the user 40 to some extent, however, the detected speech signal S may be used in a complementary manner for the recognition operation, as shown in FIG. 13. In this case, a plurality of input units that receive the sample values of the detected speech signal S are added to the input layer 31 of the neural network 26 shown in FIG. 4. A learning operation similar to the one described above is performed using the light-reception signals α, β, γ, the detected speech signal S, and the teacher signal pattern, after which a recognition operation similar to the one described above is performed using the light-reception signals α, β, γ and the detected speech signal S.

[0033] In the learning operation described above, learning is performed one monosyllable at a time. Alternatively, the number of input units in the input layer 31, the number of intermediate units in the intermediate layer 32, and the number of output units in the output layer 33 of the neural network 26 may be increased so that learning is performed on groups of several monosyllables.

[0034] As the method of learning monosyllables, the embodiment uses the neural network 26, a multilayer neural network of the kind shown in FIG. 4 that recognizes a spatiotemporal pattern by unfolding it into a spatial pattern, and trains it by error back-propagation. Other neural networks that process spatiotemporal patterns may be used instead, such as the time-delay neural network (TDNN: Phoneme recognition using time-delay neural networks) described in IEICE, Speech Technical Report SP87-100, November 1987.

[0035] FIG. 14 is a block diagram of a syllable recognition unit 70 showing a second embodiment of the tongue movement detection type syllable recognition device of the present invention.

[0036] The tongue movement detection type syllable recognition device of this embodiment differs from the one shown in FIG. 1 in that its neural network 76 has been trained in advance so that it can handle an unspecified number of users. The learning operation described above is therefore unnecessary, and the syllable recognition unit 70 does not need the microphone that detects the user's voice and converts it into the detected speech signal S, the amplifier that amplifies the detected speech signal S, or the speech recognition circuit that outputs the teacher signal pattern during the learning operation. The recognition operation of this embodiment is performed in the same way as before, following the flowchart of FIG. 9.

[0037] In the above description the neural network 26 extracts the light-reception signals α, β, γ, but the receiver 22 may have this function, or a separate extraction device may be provided. Also, although the tongue movement detection transmission unit 10 is installed by inserting the two bridges 16 between teeth as shown in FIG. 3, it may instead be connected to or integrated with a denture, for example.

[0038] Possible applications of the tongue movement detection type syllable recognition device of the present invention include, in addition to voice input devices for computers, voice input devices for word processors, machine translators, telephone number entry on car phones, in-vehicle computers, and aircraft cockpit computers, as well as use as a tool for private conversation by networking syllable recognition units.

[0039]

[Effects of the Invention] As described above, the present invention has the following effects.

[0040] (1) The light emitting element and light receiving element provided in the tongue movement detection transmission unit detect the tongue movement accompanying the user's utterance or utterance movement and convert it into a light reception signal, and this light reception signal is transmitted to the syllable recognition unit, which is provided with a neural network and a comparator. Because syllables are recognized from the light reception signal, they can be recognized accurately both in noisy environments and in environments that demand absolute silence. In addition, the syllable recognition unit may be installed anywhere the transmission signal can be received, so the place of use is not restricted.

[0041] (2) Giving the neural network a learning function allows the learning operation to be performed for each user before the recognition operation, so that recognition can be performed accurately for any user.
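The recognition path summarized in effects (1) and (2), in which a neural network produces one output value per output unit and a comparator selects the syllable whose output unit is largest, can be sketched as follows. This is a sketch under stated assumptions, not the circuit of the specification: the sigmoid activation, the absence of bias terms, and the names `forward` and `comparator` are hypothetical, and the syllable labels are placeholders.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    # Assumption: logistic activation; the actual unit function may differ.
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs: Sequence[float], weights: Sequence[Sequence[float]]) -> List[float]:
    """One fully connected layer: each row of weights feeds one unit."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def forward(x: Sequence[float],
            w_hidden: Sequence[Sequence[float]],
            w_out: Sequence[Sequence[float]]) -> List[float]:
    """Input layer -> intermediate layer -> output layer; returns y_1..y_N."""
    return layer(layer(x, w_hidden), w_out)


def comparator(outputs: Sequence[float], syllables: Sequence[str]) -> str:
    """Return the syllable assigned to the output unit with the largest value."""
    if len(outputs) != len(syllables):
        raise ValueError("one syllable label is needed per output unit")
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return syllables[best]
```

With a network trained per user, only the weight matrices change; the comparator step is the same, which is why the pre-trained variant of the second embodiment can reuse the same recognition flow.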

[Brief description of the drawings]

FIG. 1 is a block diagram showing a first embodiment of the tongue movement detection type syllable recognition device of the present invention, in which (A) is a block diagram of its tongue movement detection transmission unit and (B) is a block diagram of its syllable recognition unit.

FIG. 2 is a diagram showing the external appearance of the storage container 15 of the tongue movement detection transmission unit 10 shown in FIG. 1(A).

FIG. 3 is a diagram showing how the tongue movement detection transmission unit 10 shown in FIG. 1(A) is installed in the oral cavity.

FIG. 4 is a diagram showing the configuration of the neural network 26 shown in FIG. 1(B).

FIG. 5 is a flowchart explaining the learning operation of the tongue movement detection type syllable recognition device shown in FIG. 1.

FIG. 6 is a diagram showing the flow of signals, for explaining the learning operation of the tongue movement detection type syllable recognition device shown in FIG. 1.

FIG. 7 is a set of waveform diagrams showing the light reception signals, in which (A) is a waveform diagram of the light reception signal α, (B) is a waveform diagram of the light reception signal β, and (C) is a waveform diagram of the light reception signal γ.

FIG. 8 is a set of waveform diagrams explaining how each light reception signal is sampled during the learning operation in the neural network 26 shown in FIG. 1(B), in which (A) is a waveform diagram of the detected voice signal S, (B) is a waveform diagram of the envelope waveform E, (C) is a waveform diagram of the gate signal G, and (D) is a waveform diagram of the light reception signal α.

FIG. 9 is a flowchart explaining the recognition operation of the tongue movement detection type syllable recognition device shown in FIG. 1.

FIG. 10 is a diagram showing the flow of signals, for explaining the recognition operation of the tongue movement detection type syllable recognition device shown in FIG. 1.

FIG. 11 is a waveform diagram explaining how each light reception signal is sampled during the recognition operation in the neural network 26 shown in FIG. 1(B).

FIG. 12 is a waveform diagram explaining another method of sampling each light reception signal during the recognition operation in the neural network 26 shown in FIG. 1(B).

FIG. 13 is a diagram showing the flow of signals, for explaining another recognition operation of the tongue movement detection type syllable recognition device shown in FIG. 1.

FIG. 14 is a block diagram of the syllable recognition unit in a second embodiment of the tongue movement detection type syllable recognition device of the present invention.

[Explanation of symbols]

10 Tongue movement detection transmission unit
11 Light emitting element
12 Light receiving element
13 Transmitter
14 Battery
15 Storage container
16 Bridge
19 Tongue surface
20, 70 Syllable recognition unit
21, 71 Antenna
22, 72 Receiver
23 Microphone
24 Amplifier
25 Voice recognition circuit
26, 76 Neural network
27, 77 Comparator
31 Input layer
32 Intermediate layer
33 Output layer
34 Weight control unit
36, 37 Link
39 Weight
51₁ to 51₁₅ Input units
52₁ to 52_M Intermediate units
53₁ to 53_N Output units
α, β, γ Light reception signals
S Detected voice signal
W Transmission signal
d₁ to d₅, e₁ to e₅, f₁ to f₅ Sample values
y₁ to y_N Output values

Continued from the front page

(51) Int. Cl.⁶, identification code, FI: G10L 3/00 571 H04B 7/00 G01S 17/88 Z H04B 7/00 A61B 5/10 310L

(72) Inventor: Kazuhiko Shinozawa, c/o Nippon Telegraph and Telephone Corporation, 1-1-6 Uchisaiwaicho, Chiyoda-ku, Tokyo

(56) References:
JP-A-H02-297599 (JP, A)
JP-A-H02-144036 (JP, A)
JP-A-H02-196537 (JP, A)
JP-A-S57-209036 (JP, A)
JP-A-S62-174787 (JP, A)
JP-A-H02-144035 (JP, A)
JP-A-H02-140146 (JP, A)
JP-A-S58-79297 (JP, A)
JP-B-H03-6519 (JP, B2)
JP-B-S61-27760 (JP, B2)
JP-T-H03-502770 (JP, A)
Proceedings of the 1989 IEICE Autumn National Convention, Vol. 6, "A study of neural network recognition of spatiotemporal patterns of the readiness potential immediately before syllable utterance," p. 6-8 (1989)

(58) Fields searched (Int. Cl.⁶, DB name): G10L 3/00 - 9/20; JICST file (JOIS)

Claims

(57) [Claims]

1. A tongue movement detection type syllable recognition device comprising a tongue movement detection transmission unit installed in the oral cavity and a syllable recognition unit installed outside the oral cavity, wherein the tongue movement detection transmission unit includes: a light emitting element that emits near-infrared light; a light receiving element that receives reflected light of the near-infrared light and converts it into a light reception signal; a transmitter that converts the light reception signal converted by the light receiving element into a transmission signal and transmits it to the syllable recognition unit; and a battery that supplies power to the transmitter, the light emitting element, and the light receiving element; and wherein the syllable recognition unit includes: an antenna that receives the transmission signal; a receiver that demodulates the light reception signal from the transmission signal received by the antenna; a neural network that outputs an output value from each output unit in accordance with the light reception signal demodulated by the receiver; and a comparator that compares the magnitudes of the output values of the output units output from the neural network and outputs the syllable corresponding to the output unit whose output value is largest.

2. The tongue movement detection type syllable recognition device according to claim 1, wherein the syllable recognition unit further includes: a microphone that detects a voice uttered during a learning operation and converts it into a detected voice signal; an amplifier that amplifies the detected voice signal converted by the microphone; and a voice recognition circuit that recognizes a syllable from the detected voice signal amplified by the amplifier and outputs a teacher signal pattern indicating the syllable; and wherein the neural network has a learning function of learning, during the learning operation, the relationship between the light reception signal and the syllable from the teacher signal pattern input from the voice recognition circuit and the light reception signal demodulated by the receiver.