JP2006126558A

JP2006126558A - Voice speaker authentication system

Info

Publication number: JP2006126558A
Application number: JP2004315622A
Authority: JP
Inventors: Sakae Fujimaki; 栄藤巻; Yasukazu Mizushima; 靖和水嶋
Original assignee: Asahi Kasei Corp
Current assignee: Asahi Kasei Corp
Priority date: 2004-10-29
Filing date: 2004-10-29
Publication date: 2006-05-18

Abstract

PROBLEM TO BE SOLVED: To provide a voice speaker authentication system that prevents spoofing by recorded voice or by voice synthesized with speech synthesis technology, and that is hardly affected by ambient noise.
SOLUTION: To release a door lock, a speaker 100 who is a registered user wears a NAM microphone 101 and utters an authentication keyword as a non-audible murmur (NAM sound) or as normal speech. The NAM sound or normal speech is picked up by the NAM microphone 101, amplified by a microphone amplifier 103, digitized by an A-D converter 104, and then input to a voice authentication unit 106. The voice authentication unit 106 calculates the degree of similarity between a voice pattern, which was created from a NAM sound or normal speech uttered in advance for registration and stored in a registered pattern storage unit 107, and the NAM sound or normal speech uttered at authentication time. It then compares the degree of similarity with a preset threshold to judge whether the speaker 100 is authorized to enter or exit the room.
COPYRIGHT: (C)2006,JPO&NCIPI

Description

The present invention relates to a voice speaker authentication system that identifies a speaker by voice. In particular, it relates to a voice speaker authentication system that uses voice collected by a microphone (NAM microphone) that is attached to the skin surface over the sternocleidomastoid muscle, directly below the mastoid process of the skull, at the lower rear of the auricle, and that picks up body-conducted speech traveling through the soft tissue of the body. Such speech includes normal speech and the non-audible murmur (NAM): a vibration sound, conducted through the soft tissue of the body, of breath sound that is articulated by changes in resonance filter characteristics accompanying the movement of the speech organs, involves no regular vibration of the vocal cords, and is inaudible from outside.

Conventionally, magnetic cards, IC cards, passwords, and the like have been used to confirm a person's identity in entrance and exit management and at the ATMs of financial institutions. However, magnetic cards and IC cards can be lost, stolen, or forged, and passwords can be forgotten or stolen.
For this reason, biometric authentication using the physical characteristics of the person, such as fingerprint authentication, iris authentication, and voice authentication, has been proposed in recent years. Of these, voice authentication exploits individual differences in voice. As for the algorithm, as described for example in Non-Patent Document 1, a keyword is first uttered and its voice pattern is registered; at authentication time the same keyword is uttered, its similarity to the registered pattern is calculated, and the result is compared with a preset threshold to determine whether the speaker is the registered person. Voice authentication has the advantages that it can be realized at low cost because it needs no special input device of the kind used for fingerprints or irises, and that it works from a remote location over a telephone. It has so far been introduced by some banks and similar institutions.

For identity verification systems based on voice authentication to become more widespread, the remaining issues include countermeasures against impersonation by voice recorded with a tape recorder or the like or by voice synthesized with speech synthesis technology (hereinafter referred to as "spoofing"), and the reduction of false authentication caused by ambient noise.
The following countermeasures have been proposed for the spoofing problem. In one method, multiple keywords are registered in advance and the system specifies a keyword at random; if no voice is input within a certain time, the input is judged to be synthesized or recorded voice even if its similarity to the registered pattern is high (see, for example, Patent Document 1). In another method, the user utters the keyword several times during authentication; if the similarity among the utterances is close to a perfect match, the input is judged to be unnatural voice such as synthesized or recorded voice (see, for example, Patent Document 2).
Patent Document 1: Japanese Patent Application Laid-Open No. 61-272798
Patent Document 2: Japanese Patent Application Laid-Open No. 2001-265387
Non-Patent Document 1: Yoichi Seto, "Biometric Authentication Technology", Kyoritsu Publishing Co., Ltd., May 2002, pp. 64-68

However, recent advances in speech synthesis technology have made it possible to generate the voice of the person to be authenticated easily and in a short time, and to give the synthesized voice natural fluctuation by adjusting the synthesis parameters. As a result, the above methods can no longer cope adequately with spoofing.
For the ambient noise problem, measures such as using a directional microphone are conceivable, but at present there is no fundamental countermeasure.
The present invention has been made to solve the above problems. Its object is to provide a voice speaker authentication system that prevents spoofing by voice recorded with a tape recorder or the like and by voice synthesized with speech synthesis technology, and that is less susceptible to ambient noise.

A voice speaker authentication system according to claim 1 of the present invention comprises: storage means in which data on body-conducted speech for authentication is stored in advance; body-conducted speech input means for inputting the body-conducted speech of a person to be authenticated from the skin surface; and authentication means for performing authentication by collating data on the body-conducted speech input by the body-conducted speech input means with the data on body-conducted speech stored in the storage means, wherein an external device is controlled according to the authentication result of the authentication means. This configuration prevents spoofing by voice recorded with a tape recorder or the like and by voice synthesized with speech synthesis technology, and makes the system less susceptible to ambient noise.

The voice speaker authentication system according to claim 2 of the present invention is characterized in that, in claim 1, the body-conducted speech input means is a NAM (Non-Audible Murmur) microphone. Using a NAM microphone makes it easy to input a non-audible murmur.
The voice speaker authentication system according to claim 3 of the present invention is characterized in that, in claim 1 or 2, the body-conducted speech is a non-audible murmur. Using the non-audible murmur as the authentication target allows proper authentication even when the person's voice is not in its normal state, for example because of a cold.

The voice speaker authentication system according to claim 4 of the present invention is characterized in that, in claim 2 or 3, the NAM microphone is detachable from the system. Making the NAM microphone detachable allows it to be carried, which improves the convenience of the system.
The voice speaker authentication system according to claim 5 of the present invention is characterized in that, in any one of claims 1 to 4, it further includes wireless communication means for transmitting the data on the body-conducted speech input by the body-conducted speech input means to the authentication means by a wireless communication method. With this configuration, part of the system can be carried, which improves the convenience of the system.

The voice speaker authentication system according to claim 6 of the present invention is characterized in that, in any one of claims 1 to 5, it further includes ID input means for inputting identification information that is assigned in advance to the person to be authenticated and identifies that person, and the authentication means collates, among the data stored in the storage means, the data corresponding to the identification information input by the ID input means with the data on the body-conducted speech input by the body-conducted speech input means. This configuration makes it possible to handle multiple persons to be authenticated.

The voice speaker authentication system according to claim 7 of the present invention is characterized in that, in claim 6, the ID input means receives the identification information by reading it from a contactless ID tag in which it is stored. Using a contactless ID tag such as an RFID (Radio Frequency Identification) tag makes it easy to input the identification information that identifies the person to be authenticated.

The voice speaker authentication system according to claim 8 of the present invention is characterized in that, in any one of claims 1 to 7, the authentication means compares a predetermined threshold with the degree of similarity between the data on the body-conducted speech input by the body-conducted speech input means and the data on body-conducted speech stored in the storage means. With this configuration, the input data on body-conducted speech such as a non-audible murmur can easily be collated with the previously stored data on body-conducted speech such as a non-audible murmur.

The voice speaker authentication system according to claim 9 of the present invention is characterized in that, in any one of claims 1 to 8, the external device controls the locking of a door by an electrical signal corresponding to the authentication result. This configuration makes it possible to control the locking of a door.
The voice speaker authentication system according to claim 10 of the present invention is characterized in that, in any one of claims 1 to 8, the external device controls whether login to a network is permitted, based on data corresponding to the authentication result. This configuration makes it possible to control whether login to the network is permitted.
The voice speaker authentication system according to claim 11 of the present invention is characterized in that, in any one of claims 1 to 8, the external device controls whether access to a database is permitted, based on data corresponding to the authentication result. This configuration makes it possible to control whether access to the database is permitted.

According to the present invention, resistance to spoofing improves dramatically, and a voice authentication system with high security can be constructed. Furthermore, if NAM speech is used for authentication, voice authentication can be performed without a third party overhearing the authentication keyword, which further improves the security of the system.
In addition, using a NAM microphone makes the system less susceptible to environmental noise, so a voice authentication system with higher authentication performance can be realized.

Embodiments of the present invention are described below with reference to the drawings. In the drawings referred to in the following description, parts equivalent to those in other drawings are denoted by the same reference numerals.
First, the non-audible murmur (NAM) is a vibration sound, conducted through the soft tissue of the body, of breath sound that is articulated by changes in resonance filter characteristics accompanying the movement of the speech organs, involves no regular vibration of the vocal cords, and is inaudible from outside.

(Principle)
The flesh-conducted sound collected by the NAM microphone proposed in the pamphlet of International Publication No. WO 2004/021738 passes through several substances with different acoustic impedances on its way, so its attenuation is larger than that of air-conducted sound, particularly in the high-frequency band. In addition, because the location of the sound source changes with the phoneme and prosody, the flesh-conduction path is not uniform and the attenuation characteristics differ accordingly. The present invention aims to prevent spoofing by exploiting the difference between the flesh-conduction and air-conduction characteristics of sound.

First, the inventors built a prototype NAM microphone as shown in FIGS. 2 and 3. FIG. 2 is a side sectional view of the NAM microphone, and FIG. 3 is a front view of the NAM microphone. The NAM microphone 101 shown in both figures includes a contact portion 301a, a frame 301b, an external-noise soundproofing space 301c, a reflecting plate 301d, and a condenser microphone 301e. The condenser microphone 301e has a diaphragm electrode 301f and a lead wire 301g. The contact portion 301a is made of silicone rubber, a biocompatible material whose acoustic impedance is close to that of human soft tissue, so that no acoustic impedance mismatch arises between it and the skin.

FIG. 4(a) shows the spectrum of a signal obtained when the air-conducted sound produced by a speaker uttering a sentence was collected by the NAM microphone 101 placed at the mouth. FIG. 4(b) shows the spectrum of a signal obtained when the flesh-conducted sound produced by the same speaker uttering the same sentence was collected by the NAM microphone 101 attached to the body surface.
As FIG. 4(a) shows, the air-conducted sound contains signal components at 4 kHz and above. In contrast, FIG. 4(b) shows that components at 4 kHz and above are attenuated in the flesh-conducted sound.

FIG. 5(a) shows an octave analysis of the air-conducted sound produced when a speaker uttered the vowel "a", collected by the NAM microphone 101 placed at the mouth. FIG. 5(b) shows an octave analysis of the flesh-conducted sound produced when the same speaker uttered the same vowel "a", collected by the same NAM microphone 101 attached to the body surface. In both figures, the center frequencies are, from the left, 80 Hz, 100 Hz, 125 Hz, 160 Hz, 200 Hz, 250 Hz, 315 Hz, 400 Hz, 500 Hz, 650 Hz, 800 Hz, 1000 Hz, 1250 Hz, 1600 Hz, 2000 Hz, 2500 Hz, 3150 Hz, 4000 Hz, and 5000 Hz.
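The band analysis described for FIGS. 5(a) and 5(b) can be sketched roughly as follows. This is only an illustration under stated assumptions: the source does not say how the bands were computed, so the sketch assumes that each spectral bin is assigned to the nearest listed center frequency on a logarithmic scale, and that the spacing of the listed centers corresponds to roughly one-third-octave bands. The function names `nearest_band`, `band_powers`, and `band_ratio_db` are hypothetical.

```python
import math

# Center frequencies (Hz) exactly as listed in the description of FIGS. 5 to 9.
CENTERS_HZ = [80, 100, 125, 160, 200, 250, 315, 400, 500, 650,
              800, 1000, 1250, 1600, 2000, 2500, 3150, 4000, 5000]


def nearest_band(freq_hz, centers=CENTERS_HZ):
    """Index of the center frequency closest to freq_hz on a log scale."""
    return min(range(len(centers)),
               key=lambda i: abs(math.log(freq_hz / centers[i])))


def band_powers(freqs_hz, powers, centers=CENTERS_HZ):
    """Sum spectral power into bands; bins outside the covered range are skipped.

    Assumption: the outer limits are one sixth of an octave beyond the
    lowest and highest centers (i.e. one-third-octave wide bands).
    """
    low_limit = centers[0] / 2 ** (1 / 6)
    high_limit = centers[-1] * 2 ** (1 / 6)
    totals = [0.0] * len(centers)
    for freq, power in zip(freqs_hz, powers):
        if low_limit <= freq <= high_limit:
            totals[nearest_band(freq, centers)] += power
    return totals


def band_ratio_db(air, body, floor=1e-12):
    """Per-band power ratio (air-conducted over body-conducted) in dB."""
    return [10 * math.log10(max(a, floor) / max(b, floor))
            for a, b in zip(air, body)]
```

Comparing `band_ratio_db` outputs for different vowels is one way to look at the vowel-dependent differences between air-conducted and body-conducted sound that the description reports.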

Similarly, FIGS. 6(a) and 6(b) show the case of the vowel "i", FIGS. 7(a) and 7(b) the vowel "u", FIGS. 8(a) and 8(b) the vowel "e", and FIGS. 9(a) and 9(b) the vowel "o". These figures show that, for every vowel, the frequency pattern of the flesh-conducted sound collected with the NAM microphone 101 attached to the body surface differs from the frequency pattern of the air-conducted sound collected by the NAM microphone 101 placed at the mouth. They also show that the power ratio between air-conducted and flesh-conducted sound in each high-frequency band differs among "a", "i", "u", "e", and "o".

Therefore, simply attenuating the high band of air-conducted sound with a filter or the like does not reproduce the frequency pattern of flesh-conducted sound. In ordinary spoofing, the attacker records or analyzes the air-conducted sound of the targeted person, then plays back or synthesizes air-conducted sound and feeds it to the microphone through a loudspeaker or the like. Consequently, if the registered authentication pattern is created from flesh-conducted sound collected with the NAM microphone attached to the body surface, the pattern does not match when a third party feeds recorded or synthesized voice of the targeted person to the NAM microphone in this way, and spoofing is prevented. Moreover, even if an attacker tries to generate flesh-conducted sound from the target's air-conducted sound and feed it to the NAM microphone, this is very difficult in practice because the third party does not know what the target flesh-conducted sound is.
Furthermore, because NAM is not used in daily life, training is needed before it can be uttered stably. However, if the registered authentication pattern is created from NAM speech, the keyword itself cannot be learned by a third party, and security improves further.

Next, the environmental-noise robustness of the NAM microphone 101 is described. First, as shown in FIG. 10(a), the air-conducted sound and the flesh-conducted sound produced when a speaker uttered "aa" were collected by a normal microphone MC and by the NAM microphone 101 attached to the skin surface directly below the mastoid process, respectively. Next, as shown in FIG. 10(b), the speaker's "aa" recorded with the normal microphone MC was played back from a loudspeaker SP placed at the speaker's position, with the gain adjusted to give the same volume as the speaker's voice in FIG. 10(a). The resulting air-conducted sound was collected by the NAM microphone 101 attached to the skin surface directly below the speaker's mastoid process, next to the loudspeaker SP.
FIG. 11(a) shows the flesh-conducted sound collected by the NAM microphone 101, and FIG. 11(b) shows the air-conducted sound collected by the NAM microphone 101. FIGS. 11(a) and 11(b) show that the air-conducted sound is attenuated by 30 dB or more relative to the flesh-conducted sound.
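As a rough illustration of how a level difference such as the 30 dB figure above can be expressed, the sketch below compares the RMS amplitudes of two sampled signals. It is a minimal example, not the measurement procedure used in the source, which is not described; the function names `rms` and `attenuation_db` are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a non-empty sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def attenuation_db(reference, measured):
    """Level of `measured` below `reference`, in dB (positive means quieter).

    Uses 20*log10 of the amplitude ratio, which equals 10*log10 of the
    power ratio.
    """
    return 20 * math.log10(rms(reference) / rms(measured))
```

For example, a signal whose amplitude is scaled by a factor of 10 ** (-30 / 20), roughly 0.0316, relative to the reference gives an attenuation of 30 dB.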

FIG. 1 shows a configuration example of a door lock release system using voice authentication according to the present invention. The door lock release system shown in the figure consists of a NAM microphone 101, a voice preprocessing unit 102, a voice authentication system 105, a door lock control system 108, and a door 109. The voice preprocessing unit 102 consists of a microphone amplifier 103 and an A-D converter 104, and the voice authentication system 105 consists of a voice authentication unit 106 and a registered pattern storage unit 107.

To release the door lock, the speaker 100, who is a registered user, attaches the NAM microphone 101 to the skin surface over the sternocleidomastoid muscle, directly below the mastoid process of the skull, at the lower rear of the auricle, and utters the authentication keyword as a non-audible murmur (NAM sound) or as normal speech. The NAM sound or normal speech is picked up by the NAM microphone 101 and input to the microphone amplifier 103. After amplification by the microphone amplifier 103, it is digitized by the A-D converter 104 and input to the voice authentication unit 106. The voice authentication unit 106 calculates the similarity between the voice pattern stored in the registered pattern storage unit 107, which was created and registered from a NAM sound or normal speech uttered in advance for registration, and the NAM sound or normal speech uttered at authentication time. It compares this similarity with a preset threshold to judge whether the speaker 100 is authorized to enter and exit the room. In other words, the voice authentication unit 106 performs authentication by collating the input data on the non-audible murmur with the data on the non-audible murmur stored in the registered pattern storage unit 107.

The judgment result obtained by the voice authentication unit 106 is sent to the door lock control system 108. If the result indicates that the speaker is authorized to enter and exit, the door lock control system 108 releases the lock of the door 109 for a fixed period of time.
The speaker 100 to be authenticated may use either NAM utterance or normal utterance, but the same utterance method must be used at registration and at authentication. It is also possible, however, to prepare registered patterns for both NAM utterance and normal utterance at registration time and to choose the utterance method at authentication time according to the surroundings.
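The threshold comparison and the option of registering both utterance methods can be sketched as follows. This is a minimal sketch under stated assumptions: the source does not specify the features or the similarity measure, so cosine similarity over a fixed-length feature vector is used purely as a stand-in, and the names `cosine_similarity`, `verify`, the mode labels "nam" and "normal", and the default threshold of 0.9 are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (stand-in measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify(features, registered, mode, threshold=0.9):
    """Accept only if the utterance matches the pattern registered for `mode`.

    `registered` maps an utterance method ("nam" or "normal") to the
    pattern created at registration time. A method with no registered
    pattern is always rejected, reflecting the requirement that the same
    utterance method be used at registration and at authentication.
    """
    template = registered.get(mode)
    if template is None:
        return False
    return cosine_similarity(features, template) >= threshold
```

With patterns registered for both methods, the caller passes whichever mode the speaker chose at authentication time; a result of `True` would correspond to the door lock being released for a fixed period.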

FIG. 12 shows another configuration example of the door lock release system using voice authentication according to the present invention. In the configuration of FIG. 12, a wireless transmission unit 201 and a wireless reception unit 204 are added to the configuration of FIG. 1. The wireless transmission unit 201 includes a digital modulation unit 202 and a transmitting antenna 203, and the wireless reception unit 204 includes a receiving antenna 205 and a digital demodulation unit 206.

To release the door lock, the speaker 100, who is a registered user, attaches the NAM microphone 101 to the skin surface over the sternocleidomastoid muscle, directly below the mastoid process of the skull, at the lower rear of the auricle, and utters the authentication keyword as a non-audible murmur (NAM sound) or as normal speech. The NAM sound or normal speech is picked up by the NAM microphone 101 and input to the microphone amplifier 103. After amplification by the microphone amplifier 103, it is digitized by the A-D converter 104, digitally modulated by the digital modulation unit 202, and sent via the transmitting antenna 203 and the receiving antenna 205 to the digital demodulation unit 206. The digital demodulation unit 206 extracts the original digital voice data from the received signal and inputs it to the voice authentication unit 106.
The subsequent operation is the same as in the first embodiment and is therefore omitted.

FIG. 13 shows yet another configuration example of the door lock release system using voice authentication according to the present invention. This configuration differs from the first embodiment (FIG. 1) in that a personal ID input unit 1402 is added to the voice authentication system 1401, and the voice authentication unit 1403 receives personal ID information from the personal ID input unit 1402 in addition to the digitized voice data from the A-D converter 104.

To release the door lock, the speaker 100, who is a registered user, enters a personal ID at the personal ID input unit 1402 using an ID card or a numeric keypad, attaches the NAM microphone 101 to the skin surface over the sternocleidomastoid muscle, directly below the mastoid process of the skull, at the lower rear of the auricle, and utters the authentication keyword as a non-audible murmur (NAM sound) or as normal speech. The NAM sound or normal speech is picked up by the NAM microphone 101 and input to the microphone amplifier 103. After amplification by the microphone amplifier 103, it is digitized by the A-D converter 104 and input to the voice authentication unit 1403.

The voice authentication unit 1403 calculates the similarity between the voice pattern that corresponds to the personal ID entered at the personal ID input unit 1402, which was created and registered from a NAM sound or normal speech uttered in advance for registration and stored in the registered pattern storage unit 107, and the NAM sound or normal speech uttered at authentication time. It compares this similarity with a preset threshold to judge whether the speaker 100 is authorized to enter and exit the room.
The subsequent operation is the same as in the first embodiment and is therefore omitted.
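The ID-based variant can be sketched as a lookup followed by a one-to-one comparison. This is a hypothetical illustration: the source does not define the storage layout or the similarity measure, so the sketch assumes a dictionary keyed by personal ID and lets the caller supply the similarity function. The names `authenticate_by_id` and `inverse_distance_similarity` are not from the source.

```python
def inverse_distance_similarity(a, b):
    """Toy similarity in (0, 1]: 1 for identical vectors, smaller as they differ."""
    distance = sum(abs(x - y) for x, y in zip(a, b))
    return 1.0 / (1.0 + distance)


def authenticate_by_id(person_id, features, store, similarity, threshold):
    """Compare the utterance only against the pattern registered for person_id.

    `store` maps a personal ID (for example, read from an ID card or typed
    on a keypad) to the registered pattern. Unknown IDs are rejected
    without any comparison.
    """
    template = store.get(person_id)
    if template is None:
        return False
    return similarity(features, template) >= threshold
```

Because the ID selects a single registered pattern, the comparison does not have to search over all registered users, which is how the arrangement accommodates multiple registered persons.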

FIG. 14 shows a configuration example of a network login management system using voice authentication according to the present invention. As the figure shows, the network login management system consists of a NAM microphone 101, a voice preprocessing unit 102, a voice authentication system 1401, a network login management unit 1501, and a network 1502. The voice preprocessing unit 102 consists of a microphone amplifier 103 and an A-D converter 104. The voice authentication system 1401 consists of a personal ID input unit 1402, a voice authentication unit 1403, and a registered pattern storage unit 107.

To log in to the network 1502, the speaker 100, who is an authentication registrant, enters a personal ID at the personal ID input unit 1402 using an ID card or a numeric keypad. The speaker then attaches the NAM microphone 101 to the skin surface over the sternocleidomastoid muscle, directly below the mastoid process of the skull at the lower rear part of the auricle, and utters the authentication keyword as a non-audible murmur (NAM sound) or as normal sound. The NAM sound or normal sound is picked up by the NAM microphone 101 and input to the microphone amplifier 103. After amplification by the microphone amplifier 103, the NAM sound or normal sound is digitized by the AD converter 104 and input to the voice authentication unit 1403.

The voice authentication unit 1403 uses the voice pattern that corresponds to the personal ID entered at the personal ID input unit 1402. This pattern was created and registered in advance from NAM sound or normal sound uttered for registration, and is stored in the registered pattern storage unit 107. The voice authentication unit 1403 calculates the similarity between this pattern and the NAM sound or normal sound uttered at the time of authentication, and compares the similarity with a preset threshold value to determine whether or not the speaker 100 has authority to enter and exit the room.
The determination result obtained by the voice authentication unit 1403 is sent to the network login management unit 1501. In accordance with this result, the network login management unit 1501 permits login to the network 1502 when the speaker is determined to have login authority.
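
The hand-off from the voice authentication unit to the login management unit can be sketched as follows. This is only an illustration of the separation of roles: the AuthResult record, the NetworkLoginManager class, and the in-memory set of sessions are hypothetical names and structures that do not appear in the text.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass(frozen=True)
class AuthResult:
    """Hypothetical message carrying the determination result."""
    personal_id: str
    accepted: bool


@dataclass
class NetworkLoginManager:
    """Hypothetical stand-in for the network login management unit 1501."""
    active_sessions: Set[str] = field(default_factory=set)

    def handle(self, result: AuthResult) -> bool:
        # Login is permitted only when the authentication side accepted
        # the speaker; the manager never inspects the audio itself.
        if result.accepted:
            self.active_sessions.add(result.personal_id)
            return True
        return False
```

Keeping the decision and the resource control in separate components means the same kind of result could drive a different controlled resource without changing the authentication side.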

FIG. 15 is a diagram showing a configuration example of a database access management system using voice authentication according to the present invention. Referring to the figure, the database access management system comprises a NAM microphone 101, a voice preprocessing unit 102, a voice authentication system 1401, a database access management unit 1601, and a database 1602. The voice preprocessing unit 102 comprises a microphone amplifier 103 and an AD converter 104, and the voice authentication system 1401 comprises a personal ID input unit 1402, a voice authentication unit 1403, and a registered pattern storage unit 107.

To access the database 1602, the speaker 100, who is an authentication registrant, enters a personal ID at the personal ID input unit 1402 using an ID card or a numeric keypad. The speaker then attaches the NAM microphone 101 to the skin surface over the sternocleidomastoid muscle, directly below the mastoid process of the skull at the lower rear part of the auricle, and utters the authentication keyword as a non-audible murmur (NAM sound) or as normal sound. The NAM sound or normal sound is picked up by the NAM microphone 101 and input to the microphone amplifier 103. After amplification by the microphone amplifier 103, the NAM sound or normal sound is digitized by the AD converter 104 and input to the voice authentication unit 1403. The voice authentication unit 1403 uses the voice pattern that corresponds to the personal ID entered at the personal ID input unit 1402. This pattern was created and registered in advance from NAM sound or normal sound uttered for registration, and is stored in the registered pattern storage unit 107. The voice authentication unit 1403 calculates the similarity between this pattern and the NAM sound or normal sound uttered at the time of authentication, and compares the similarity with a preset threshold value to determine whether or not the speaker 100 has authority to enter and exit the room. The determination result obtained by the voice authentication unit 1403 is sent to the database access management unit 1601. In accordance with this result, the database access management unit 1601 permits access to the database 1602 when the speaker is determined to have access authority.

FIG. 16 is a diagram showing still another configuration example of the door lock unlocking system using voice authentication according to the present invention. Referring to the figure, the door lock unlocking system comprises a NAM microphone 1701 with an RFID tag, a microphone input terminal 1702, a voice preprocessing unit 102, a voice authentication system 1703, a door lock control system 108, and a door 109. The voice preprocessing unit 102 comprises a microphone amplifier 103 and an AD converter 104. The voice authentication system 1703 comprises an RFID tag reader/writer 1704, a voice authentication unit 1705, and a registered pattern storage unit 107.

Each speaker 100 who is an authentication registrant has an individual NAM microphone 1701 with an RFID tag, and personal ID information is written to the RFID tag in advance.
A configuration example of the NAM microphone 1701 with an RFID tag is described with reference to FIGS. 17 and 18. FIG. 17 is a side sectional view of the NAM microphone 1701 with an RFID tag, and FIG. 18 is a front view of the NAM microphone 1701 with an RFID tag. The configuration of the NAM microphone 1701 shown in FIGS. 17 and 18 differs from that of the NAM microphone 101 shown in FIGS. 2 and 3 in that an RFID tag 1801, to which personal ID information is written in advance, and a plug 1802 that fits the microphone input terminal 1702 are added.

Because the identification information is input by reading it from a non-contact ID tag such as an RFID tag, the identification information for identifying the person to be authenticated can be input easily.
Because the plug 1802 is added, the NAM microphone 1701 is detachable from the system. With this configuration, the NAM microphone can be carried by its owner, which improves the convenience of the system.

To unlock the door, the speaker 100, who is an authentication registrant, first brings his or her own NAM microphone 1701 with an RFID tag close to the RFID tag reader/writer 1704. The RFID tag reader/writer 1704 reads the personal ID from the RFID tag 1801 of the NAM microphone 1701 with an RFID tag.
Next, the speaker 100 inserts the plug 1802 of the NAM microphone 1701 with an RFID tag into the microphone input terminal 1702, attaches the NAM microphone 1701 to the skin surface over the sternocleidomastoid muscle, directly below the mastoid process of the skull at the lower rear part of the auricle, and utters the authentication keyword as a non-audible murmur (NAM sound) or as normal sound.

The NAM sound or normal sound is picked up by the NAM microphone 1701 with an RFID tag and input to the microphone amplifier 103. After amplification by the microphone amplifier 103, the NAM sound or normal sound is digitized by the AD converter 104 and input to the voice authentication unit 1705. The voice authentication unit 1705 uses the voice pattern that corresponds to the personal ID read by the RFID tag reader/writer 1704. This pattern was created and registered in advance from NAM sound or normal sound uttered for registration, and is stored in the registered pattern storage unit 107. The voice authentication unit 1705 calculates the similarity between this pattern and the NAM sound or normal sound uttered at the time of authentication, and compares the similarity with a preset threshold value to determine whether or not the speaker 100 has authority to enter and exit the room.
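
The sequence in this embodiment (tag read, plug insertion, utterance) can be sketched as a small session object. Everything here is a hypothetical illustration: the TaggedMicrophone and UnlockSession names, the rule that authentication is refused until both earlier steps have happened, and the toy verifier that stands in for the real similarity comparison are assumptions, not details taken from the text.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class TaggedMicrophone:
    """Hypothetical stand-in for the NAM microphone 1701 with an RFID tag."""
    tag_personal_id: str  # personal ID written to the RFID tag 1801 in advance


def toy_verifier(personal_id: str, features: Sequence[float]) -> bool:
    """Placeholder for the voice authentication unit 1705 (not a real matcher)."""
    return personal_id == "user-001" and sum(features) > 1.0


class UnlockSession:
    """Tracks one unlock attempt: tag read, then plug-in, then utterance."""

    def __init__(self, verifier: Callable[[str, Sequence[float]], bool]):
        self._verifier = verifier
        self._personal_id: Optional[str] = None
        self._plugged_in = False

    def read_tag(self, mic: TaggedMicrophone) -> None:
        # Models the reader/writer 1704 reading the ID from the tag 1801.
        self._personal_id = mic.tag_personal_id

    def plug_in(self) -> None:
        # Models the plug 1802 being inserted into the input terminal 1702.
        self._plugged_in = True

    def authenticate(self, features: Sequence[float]) -> bool:
        # Without an ID and a connected microphone there is nothing to verify.
        if self._personal_id is None or not self._plugged_in:
            return False
        return self._verifier(self._personal_id, features)
```

In this sketch the speaker never types an ID: the ID arrives as a side effect of presenting the microphone, which mirrors the convenience argument made for this embodiment.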

Subsequent operations are the same as those in the first embodiment, and their description is omitted.
When each individual owns and uses a NAM microphone 1701 with an RFID tag, which combines an RFID tag with a NAM microphone that is small and inexpensive to manufacture, the personal ID can be input without any special operation, and a system that is easy to use and highly secure can be realized.
In any of the first to fifth embodiments, adding a plug that fits the microphone input terminal makes the NAM microphone detachable from each system, so that the NAM microphone can be carried by its owner and the convenience of the system is improved.

The present invention makes it possible to realize a voice authentication system with high security and high authentication performance. The system can therefore be used widely for personal authentication, not only in entrance and exit management and ATMs but also in other security systems and financial systems.

FIG. 1 is a diagram showing the configuration of a door lock unlocking system according to an embodiment of the present invention.
FIG. 2 is a side sectional view showing a configuration example of a NAM microphone.
FIG. 3 is a front view showing a configuration example of a NAM microphone.
FIG. 4 is a diagram showing the spectrum of a signal picked up by the NAM microphone.
FIG. 5 is an octave analysis diagram of the air-conducted sound and flesh-conducted sound of the utterance "a".
FIG. 6 is an octave analysis diagram of the air-conducted sound and flesh-conducted sound of the utterance "i".
FIG. 7 is an octave analysis diagram of the air-conducted sound and flesh-conducted sound of the utterance "u".
FIG. 8 is an octave analysis diagram of the air-conducted sound and flesh-conducted sound of the utterance "e".
FIG. 9 is an octave analysis diagram of the air-conducted sound and flesh-conducted sound of the utterance "o".
FIG. 10 is a diagram of the setup used to evaluate the environmental noise resistance of the NAM microphone.
FIG. 11 is a waveform diagram of the flesh-conducted sound and air-conducted sound picked up by the NAM microphone.
FIG. 12 is a diagram showing the configuration of another door lock unlocking system according to the present invention.
FIG. 13 is a diagram showing the configuration of another door lock unlocking system according to the present invention.
FIG. 14 is a diagram showing the configuration of a network login management system according to the present invention.
FIG. 15 is a diagram showing the configuration of a database access management system according to the present invention.
FIG. 16 is a diagram showing the configuration of another door lock unlocking system according to the present invention.
FIG. 17 is a side sectional view showing a configuration example of a NAM microphone with an RFID tag.
FIG. 18 is a front view showing a configuration example of a NAM microphone with an RFID tag.

Explanation of symbols

100 Speaker (person uttering)
101 NAM microphone
102 Voice preprocessing unit
103 Microphone amplifier
104 AD converter
105 Voice authentication system
106 Voice authentication unit
107 Registered pattern storage unit
108 Door lock control system
109 Door
201 Wireless transmission unit
202 Digital modulation unit
203 Transmitting antenna
204 Wireless reception unit
205 Receiving antenna
206 Digital demodulation unit
301a Contact portion
301b Frame
301c External-noise soundproofing space
301d Reflector
301e Condenser microphone
301f Diaphragm electrode
301g Lead wire
1401 Voice authentication system
1402 Personal ID input unit
1403 Voice authentication unit
1501 Network login management unit
1502 Network
1601 Database access management unit
1602 Database
1701 NAM microphone with RFID tag
1702 Microphone input terminal
1703 Voice authentication system
1704 RFID tag reader/writer
1705 Voice authentication unit
1801 RFID tag
1802 Plug
MC Normal microphone
SP Loudspeaker

Claims

1. A voice speaker authentication system comprising: body conduction voice input means for inputting, from the skin surface, the body conduction voice of a person to be authenticated; storage means in which data relating to body conduction voice for authentication is stored in advance; and authentication means for performing authentication by collating the data relating to the body conduction voice input by the body conduction voice input means with the data relating to the body conduction voice stored in the storage means, wherein an external device is controlled according to the authentication result of the authentication means.

2. The voice speaker authentication system according to claim 1, wherein the body conduction voice input means is a NAM (Non-Audible Murmur) microphone.

3. The voice speaker authentication system according to claim 1, wherein the body conduction voice is a non-audible murmur.

4. The voice speaker authentication system according to claim 2, wherein the NAM microphone is configured to be detachable from its own system.

5. The voice speaker authentication system according to claim 1, further comprising wireless communication means for transmitting the data relating to the body conduction voice input by the body conduction voice input means to the authentication means by a wireless communication method.

6. The voice speaker authentication system according to any one of claims 1 to 5, further comprising ID input means for inputting identification information that is given in advance to the person to be authenticated and identifies that person, wherein the authentication means collates, among the data stored in the storage means, the data corresponding to the identification information input by the ID input means with the data relating to the body conduction voice input by the body conduction voice input means.

7. The voice speaker authentication system according to claim 6, wherein the ID input means inputs the identification information by reading the identification information stored in a non-contact ID tag.

8. The voice speaker authentication system according to any one of claims 1 to 7, wherein the authentication means compares a predetermined threshold value with the similarity between the data relating to the body conduction voice input by the body conduction voice input means and the data relating to the body conduction voice stored in the storage means.

9. The voice speaker authentication system according to any one of claims 1 to 8, wherein the external device controls the locking of a door by an electrical signal corresponding to the authentication result.

10. The voice speaker authentication system according to any one of claims 1 to 8, wherein the external device controls whether or not login to a network is permitted, based on data corresponding to the authentication result.

11. The voice speaker authentication system according to any one of claims 1 to 8, wherein the external device controls whether or not access to a database is permitted, based on data corresponding to the authentication result.