
JP3886483B2 - Acoustic sensor

Info

Publication number: JP3886483B2
Application number: JP2003354292A
Authority: JP
Inventors: 繁安藤; 尚哉宮野; 松本　　俊行; 宗生原田
Original assignee: Tokyo Electron Ltd
Current assignee: Tokyo Electron Ltd
Priority date: 2003-10-14
Filing date: 2003-10-14
Publication date: 2007-02-28
Anticipated expiration: 2017-05-26
Also published as: JP2004117368A

Description

The present invention relates to an acoustic sensor for extracting features of a sound signal in speech recognition processing, acoustic signal processing, and the like, and more particularly to an acoustic sensor for detecting the intensity of a sound signal in each frequency band.

In a conventional speech recognition system, the vibration of a microphone that receives a speech signal is converted into an electrical signal and amplified by an amplifier, the analog signal is digitized by an A/D converter to obtain a digital speech signal, and a fast Fourier transform is applied to this digital speech signal by software on a computer to extract speech features. Such a speech recognition system has been disclosed (see, for example, Non-Patent Document 1).

To extract the features of a speech signal efficiently, the acoustic spectrum must be calculated within a time span over which the speech signal can be regarded as stationary. A speech signal is generally considered stationary over a span of 10 to 20 msec. Accordingly, with a period of 10 to 20 msec, signal processing such as a fast Fourier transform is executed by software on a computer on the digital speech signal contained in each such span.
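The conventional frame-by-frame analysis described above can be sketched roughly as follows. This is only an illustration: the 8000 Hz sampling rate, the 1000 Hz test tone, and the function names `frame_signal` and `dft_magnitudes` are assumptions made for the example, and a naive DFT stands in for the fast Fourier transform (it yields the same magnitudes, only more slowly).

```python
import cmath
import math

def frame_signal(samples, sample_rate_hz, frame_ms=20.0):
    """Split samples into consecutive, non-overlapping frames of frame_ms each."""
    n = int(sample_rate_hz * frame_ms / 1000.0)
    return [samples[i:i + n] for i in range(0, len(samples) - n + 1, n)]

def dft_magnitudes(frame):
    """Naive O(N^2) DFT magnitudes for bins 0..N/2 (an FFT computes the same values)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        mags.append(abs(acc))
    return mags

# Hypothetical input: 100 ms of a 1000 Hz tone sampled at 8000 Hz.
sample_rate_hz = 8000
tone_hz = 1000.0
samples = [math.sin(2 * math.pi * tone_hz * t / sample_rate_hz) for t in range(800)]

frames = frame_signal(samples, sample_rate_hz, frame_ms=20.0)
spectrum = dft_magnitudes(frames[0])
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
peak_hz = peak_bin * sample_rate_hz / len(frames[0])
print(len(frames), "frames; strongest component in frame 0 near", peak_hz, "Hz")
```

Every frame requires its own transform, which is the per-frame software cost that the text attributes to the conventional approach.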

As described above, in the conventional speech recognition method, a speech signal containing the entire instantaneous band is converted into an electrical signal by a microphone. To analyze the spectrum of that electrical signal, A/D conversion is applied to digitize each frequency, and the resulting digital speech data is compared with data of specific speech waveforms to extract speech features.

Incidentally, the auditory mechanism and the psychophysical properties of sound have been described in detail (see, for example, Non-Patent Document 2). That document shows that the scale of pitch perceived by humans does not correspond linearly to frequency as a physical quantity, but corresponds linearly to a scale called the mel scale. The mel scale expresses a psychological attribute (psychological scale) representing pitch as it appears in a musical scale. It directly quantifies the frequency intervals, called pitch, that sound equally spaced to humans, and defines the pitch of a 1000 Hz, 40 phon tone as 1000 mel. A 500 mel acoustic signal is heard as a tone of 0.5 times that pitch, and a 2000 mel acoustic signal as a tone of twice that pitch. The mel scale can be approximated from the frequency f [Hz] as a physical quantity by equation (1) below. FIG. 7 shows the relationship between pitch [mel] and frequency [Hz] under this approximation.

$$\mathrm{mel} = \frac{1000}{\log 2}\,\log\!\left(\frac{f}{1000} + 1\right) \qquad (1)$$
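As a minimal sketch, equation (1) and its algebraic inverse can be written as below. The function names `hz_to_mel` and `mel_to_hz` are chosen for this illustration; because the equation uses a ratio of logarithms, the base of the logarithm does not affect the result.

```python
import math

def hz_to_mel(f_hz):
    """Equation (1): pitch in mel for a frequency given in Hz."""
    return (1000.0 / math.log(2.0)) * math.log(f_hz / 1000.0 + 1.0)

def mel_to_hz(mel):
    """Inverse of equation (1), obtained by solving it for f."""
    return 1000.0 * (2.0 ** (mel / 1000.0) - 1.0)

for f_hz in (250.0, 1000.0, 3000.0):
    print(f"{f_hz:7.1f} Hz -> {hz_to_mel(f_hz):7.1f} mel")
```

By construction, 1000 Hz maps to 1000 mel, which matches the definition given in the text.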

To extract speech features efficiently, the frequency band of the acoustic spectrum is commonly converted to the mel scale. Like the spectrum analysis itself, this conversion of the acoustic spectrum to the mel scale is usually performed by software on a computer.

Converting the frequency band of the acoustic spectrum to the Bark scale is another common technique for extracting speech features efficiently. The Bark scale corresponds to the psychological loudness of sound perceived by humans, and indicates the frequency bandwidth that humans can distinguish for sounds above a certain loudness (the critical bandwidth). Sounds within one critical bandwidth are heard alike even if their frequencies differ. For example, when loud noise occurs within the critical bandwidth, human hearing cannot distinguish a signal tone from the noise even though the tone differs in frequency from the noise. The Bark scale is the scale that indicates such frequency bands.

In the field of speech signal processing, a critical bandwidth that is easy to handle on a computer is required, and the frequency axis of the acoustic spectrum is expressed on the Bark scale, which defines one critical band as 1 Bark. FIG. 8 shows the numerical relationship between the critical bandwidth and the Bark scale. The critical bandwidth and the Bark scale can be approximated from the frequency f [kHz] as a physical quantity by equations (2) and (3) below.

Critical bandwidth:
$$\mathrm{CB}\,[\mathrm{Hz}] = 25 + 75\left(1 + 1.4 f^{2}\right)^{0.69} \qquad (2)$$

Bark scale:
$$B\,[\mathrm{Bark}] = 13\tan^{-1}(0.76 f) + 3.5\tan^{-1}\!\left(\frac{f}{7.5}\right) \qquad (3)$$
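The two approximations can be evaluated numerically as in the sketch below, which transcribes equations (2) and (3) as they are written in this document, with f in kHz. The function names and the sample frequencies are assumptions for the illustration.

```python
import math

def critical_bandwidth_hz(f_khz):
    """Equation (2): critical bandwidth in Hz for a frequency given in kHz."""
    return 25.0 + 75.0 * (1.0 + 1.4 * f_khz ** 2) ** 0.69

def bark(f_khz):
    """Equation (3) as written in this document: Bark value for a frequency in kHz."""
    return 13.0 * math.atan(0.76 * f_khz) + 3.5 * math.atan(f_khz / 7.5)

for f_khz in (0.5, 1.0, 3.0, 6.0):
    print(f"{f_khz:4.1f} kHz: B = {bark(f_khz):5.2f} Bark, "
          f"CB = {critical_bandwidth_hz(f_khz):7.1f} Hz")
```

Both quantities grow monotonically with frequency, so a Bark value identifies a unique frequency and critical bandwidth.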

It is also known to use engineering functional models of the peripheral auditory system in the field of speech recognition, as described in detail in Non-Patent Document 2. In such a model, frequency spectrum analysis by a bank of band-pass filters serves as preprocessing. For example, in the preprocessing of Seneff's model, one of the representative engineering functional models, frequency spectrum analysis is performed by a bank of critical-bandwidth filters having 40 independent channels in the frequency range of 130 to 6400 Hz. In this step, the frequency band of the acoustic spectrum is converted to the Bark scale. In this model, the model output for an input sound stimulus is obtained by computer simulation and has been shown to agree well with physiological data. Using such an engineering functional model can therefore improve the recognition rate for speech in noise in automatic speech recognition.

Like the spectrum analysis, the frequency spectrum analysis by such a bank of critical-bandwidth filters and the conversion of the acoustic spectrum to the mel scale are usually performed by software on a computer.
Non-Patent Document 1: IEEE Signal Processing Magazine, Vol. 13, No. 5, pp. 45-57 (1996)
Non-Patent Document 2: Shun-ichi Amari (supervising editor), Seiichi Nakagawa, Kiyohiro Shikano and Yoichi Tohkura, "Neuroscience & Technology Series: Speech, Hearing and Neural Network Models" (Ohmsha, 1992)

The conventional technique of applying fast Fourier transform processing to a digital acoustic signal by software on a computer and analyzing the spectrum of that signal has the problem that the amount of computation is enormous and the computational load is heavy. The same holds when the series of processes of computing the spectrum of the acoustic signal by fast Fourier transform and converting it to the mel scale is performed by software on a computer. It holds likewise when the series of processes of analyzing the frequency spectrum of the acoustic signal with a bank of critical-bandwidth filters and converting it to the Bark scale is performed by software on a computer.

In addition, the conventional method causes no problem for speech whose acoustic spectrum does not change over time, such as vowels, but the following problem arises for sounds that combine consonants and vowels. Examples are sounds such as "ka, ki, ku, ke, ko, sa, ta", in which a consonant appears first and the intensity of the vowel grows over time, and sounds with complex consonant-vowel combinations as in English. Conventionally, speech is recorded instantaneously, divided at fixed time intervals, and analyzed by integrating the acoustic spectrum of the entire band, so it is difficult to determine the point at which the sound changed from consonant to vowel, and this lowers the discrimination rate of speech recognition. To address this problem, a larger number of speech patterns are stored in the computer in advance and the input is matched against one of them, but this increases the computational load still further.

The present invention has been made in view of these circumstances, and its object is to provide an acoustic sensor capable of performing acoustic signal detection and frequency spectrum analysis quickly and accurately on a single piece of hardware.

Another object of the present invention is to provide an acoustic sensor capable of performing acoustic signal detection, frequency spectrum analysis, and frequency scale conversion (conversion to the mel scale or the Bark scale) quickly and accurately on a single piece of hardware.

The acoustic sensor according to claim 1 comprises a wave receiving portion that receives a sound wave propagating in a medium, a resonance portion having a plurality of rod-shaped resonators each having a length such that it resonates at a different specific frequency, a holding portion that holds the resonance portion, and a vibration intensity detection portion that detects the vibration intensity of each resonator at its specific frequency, wherein the distance between adjacent resonators is varied so that the bandwidth of the resonance frequency of each resonator is set to a predetermined value.

The acoustic sensor according to claim 2 is the acoustic sensor of claim 1, wherein the resonance frequencies of the plurality of resonators are set so as to be distributed on the mel scale.

The acoustic sensor according to claim 3 is the acoustic sensor of claim 1, wherein the resonance frequencies of the plurality of resonators are set so as to be distributed on the Bark scale.

The acoustic sensor according to claim 4 is the acoustic sensor of claim 1, wherein the resonance frequencies of the plurality of resonators are set so as to be distributed on the Bark scale, and the bandwidth corresponding to each resonance frequency is the critical bandwidth.

The acoustic sensor according to claim 5 comprises a wave receiving portion that receives a sound wave propagating in a medium, a resonance portion having a plurality of rod-shaped resonators each having a length such that it resonates at a different specific frequency, a holding portion that holds the resonance portion, and a vibration intensity detection portion that detects the vibration intensity of each resonator at its specific frequency, wherein the distances between adjacent resonators differ, the resonance frequencies of the plurality of resonators are set so as to be distributed on the Bark scale, and the bandwidth corresponding to each resonance frequency is the critical bandwidth.

The acoustic sensor according to claim 6 is the acoustic sensor of claim 2, which is a music input microphone for recognizing musical pieces.

The acoustic sensor according to claim 7 is the acoustic sensor of any one of claims 1 to 5, which is a speech input microphone for recognizing speech.

The acoustic sensor according to claim 8 is the acoustic sensor of any one of claims 1 to 7, wherein the acoustic sensor is formed on a semiconductor substrate.

The acoustic sensor of the present invention has a plurality of resonators of different lengths, each resonating at a specific frequency. A sound wave propagating through a medium is transmitted to these resonators, and the vibration of each resonator is detected. The detected vibration amplitude is converted into an electrical signal, which is input to integrating means and integrated over a period of arbitrary length. The integration result is output for each specific frequency at every such period.

In addition, in the acoustic sensor of the present invention, the resonance frequencies of the resonators are distributed linearly on the mel scale rather than on a mathematically linear scale. The correspondence between actual vibration frequency and the mel scale is determined from equation (1) and FIG. 7, so the design specifications of each resonator can be determined easily. By detecting the vibration of each resonator built to the mel scale specification and then performing the same processing as in the first acoustic sensor described above, a physical quantity corresponding to the spectrum of the acoustic signal can be detected on the mel scale.
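One way to picture resonance frequencies "distributed linearly on the mel scale" is to pick equally spaced mel values and map them back to Hz by inverting equation (1). The sketch below does this under assumptions: the 100 Hz to 7000 Hz range and the function names are hypothetical, and the count of six merely echoes the six resonators drawn in FIG. 1.

```python
import math

def hz_to_mel(f_hz):
    """Equation (1), written with a base-2 logarithm."""
    return 1000.0 * math.log2(f_hz / 1000.0 + 1.0)

def mel_to_hz(mel):
    """Inverse of equation (1)."""
    return 1000.0 * (2.0 ** (mel / 1000.0) - 1.0)

def mel_spaced_frequencies(f_low_hz, f_high_hz, count):
    """Return `count` (>= 2) frequencies in Hz, equally spaced in mel."""
    m_low, m_high = hz_to_mel(f_low_hz), hz_to_mel(f_high_hz)
    step = (m_high - m_low) / (count - 1)
    return [mel_to_hz(m_low + i * step) for i in range(count)]

# Hypothetical design range; not taken from the patent.
freqs = mel_spaced_frequencies(100.0, 7000.0, 6)
for f in freqs:
    print(f"{f:8.1f} Hz  ({hz_to_mel(f):7.1f} mel)")
```

The resulting frequencies are equally spaced in mel but increasingly far apart in Hz toward the high end.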

In addition, in the acoustic sensor of the present invention, the resonance frequencies of the resonators are distributed linearly on the Bark scale rather than on a mathematically linear scale, and the bandwidth of each resonance frequency is made equal to the critical bandwidth. The correspondence between actual vibration frequency and the Bark scale, and the cutoff frequencies that define the critical bandwidth, are determined from equations (2) and (3) and FIG. 8, so the design specifications of each resonator can be determined easily. By detecting the vibration of each resonator built to the Bark scale specification and then performing the same processing as in the first acoustic sensor described above, a physical quantity corresponding to the spectrum of the acoustic signal can be detected on the Bark scale with the critical bandwidth.
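A Bark-spaced design table could be derived along the following lines. Equation (3) has no simple closed-form inverse, so this sketch inverts it by bisection, which is a choice made for this illustration rather than a method stated in the patent. The Bark range of 2 to 18, the count of five, and all function names are hypothetical.

```python
import math

def bark(f_khz):
    """Equation (3) as written in this document (f in kHz)."""
    return 13.0 * math.atan(0.76 * f_khz) + 3.5 * math.atan(f_khz / 7.5)

def critical_bandwidth_hz(f_khz):
    """Equation (2) (f in kHz, result in Hz)."""
    return 25.0 + 75.0 * (1.0 + 1.4 * f_khz ** 2) ** 0.69

def khz_for_bark(target_bark, lo=0.0, hi=20.0, iterations=60):
    """Invert equation (3) by bisection; bark() increases monotonically on [lo, hi]."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if bark(mid) < target_bark:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bark_spaced_design(first_bark, last_bark, count):
    """Return (Bark value, centre frequency in Hz, critical bandwidth in Hz) tuples."""
    step = (last_bark - first_bark) / (count - 1)
    design = []
    for i in range(count):
        b = first_bark + i * step
        f_khz = khz_for_bark(b)
        design.append((b, f_khz * 1000.0, critical_bandwidth_hz(f_khz)))
    return design

# Hypothetical range and count, chosen only for the example.
design = bark_spaced_design(2.0, 18.0, 5)
for b, f_hz, cb_hz in design:
    print(f"{b:5.1f} Bark -> f = {f_hz:8.1f} Hz, target bandwidth = {cb_hz:7.1f} Hz")
```

Each row gives a target resonance frequency together with the bandwidth that the resonator spacing would then need to realize.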

Because the acoustic sensor of the present invention can detect sound intensity at each desired frequency, the acoustic spectrum can be obtained in real time without analysis processing. Compared with the conventional approach, in which an acoustic signal of the entire band is input and electrically filtered into frequency bands, the present invention decomposes the acoustic signal mechanically by frequency, so electrical filtering becomes unnecessary and processing is faster. Moreover, no acoustic data is lost anywhere even when the signal is divided at fixed time intervals. Because acoustic data for each frequency is obtained at fixed time intervals, the change in intensity at each frequency can be followed over time. For example, temporal transitions between vowels and consonants can be identified more accurately, which raises the discrimination rate of speech recognition.

As described above, in the acoustic sensor of the present invention the sound wave is mechanically decomposed into frequency bands before it is converted into an electrical signal, so the electrical filtering by software used conventionally becomes unnecessary and processing is faster. The sensor can also be fabricated easily on a semiconductor substrate, occupies less area than a conventional system, and can be produced at lower cost. Furthermore, because sound intensity can be detected at each desired frequency, the acoustic spectrum can be obtained in real time without analysis processing. Because acoustic data for each frequency is obtained at fixed time intervals, the change in intensity at each frequency can be followed over time, temporal changes in speech can be identified more accurately, and the discrimination rate of speech recognition can be raised.

In addition, the acoustic sensor of the present invention has either a set of resonators whose resonance frequencies are distributed on the mel scale, or a set of resonators whose resonance frequencies are distributed on the Bark scale and which have the critical bandwidth. Speech can therefore be recognized in a manner closer to human hearing, and speech features can be extracted efficiently during speech recognition.

The present invention is described in detail below with reference to the drawings showing its embodiments.

(First embodiment)
FIG. 1 shows an embodiment of the acoustic sensor of the present invention. The acoustic sensor comprises a sensor body 2 formed on a semiconductor silicon substrate 1, electrodes 3, and detection circuits 4 serving as peripheral circuits. The sensor body 2 is formed entirely of semiconductor silicon and consists of a resonance portion 21 having a plurality of (six in the example of FIG. 1) rod-shaped portions of different lengths, a plate-shaped holding portion 22 that holds the resonance portion 21 at the fixed end of the resonance, a short rod-shaped propagation portion 23 standing on one end of the holding portion 22, and a plate-shaped wave receiving portion 24 that is connected to the propagation portion 23 and receives sound waves propagated through the air.

The resonance portion 21 is a set of cantilevers, and each rod-shaped portion is a resonator 25 whose length is adjusted so that it resonates at a specific frequency. Each of these resonators 25 responds selectively by vibrating at the resonance frequency f given by equation (4) below.

$$f = \frac{C\,H\,E^{1/2}}{L^{2}\,\rho^{1/2}} \qquad (4)$$
where
C: constant determined experimentally
H: thickness of each resonator
L: length of each resonator
E: Young's modulus of the material (semiconductor silicon)
ρ: density of the material (semiconductor silicon)

As equation (4) shows, the resonance frequency f of a resonator 25 can be set to a desired value by changing its thickness H or its length L. In the example shown in FIG. 1, the thickness H of all resonators 25 is constant and the length L increases successively from left to right, so that each resonator 25 has its own resonance frequency. Specifically, from left to right the resonators cover high to low frequencies within a range of about 15 to 20 kHz of the audible band.

The bandwidth of the resonance frequency of each resonator 25 depends on its interaction with the adjacent resonators 25 as vibration energy is transmitted along the resonance portion 21. That is, the bandwidth is determined by structural design values such as the rate of change of resonance frequency between adjacent resonators 25 and the distance to the adjacent resonator 25, and by factors such as the viscosity of the gas between adjacent resonators 25. In this example, the bandwidth of the resonance frequency of each resonator 25 is controlled by varying the distance between adjacent resonators 25.

FIG. 5 is a graph showing how the bandwidth (vertical axis) of a single-crystal silicon resonator 25 with a resonance frequency of 3 kHz changes as the distance D (horizontal axis) to the adjacent resonator 25 is varied. FIG. 6 shows the relationship among the length L, thickness H, width W, and distance D of the resonator 25. The design values of this resonator 25 are length L = 1706 μm, thickness H = 10 μm, and width W = 80 μm, and the gas between adjacent resonators 25 is air. FIG. 5 shows that a desired bandwidth can be set by adjusting the distance D to the adjacent resonator 25. Taking this into account, in this example the distance D between adjacent resonators 25 is determined so that the bandwidth of each resonator 25 equals the critical bandwidth shown in FIG. 8.
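Since equation (4) gives f proportional to H / L², a resonator length for another target frequency can be estimated by scaling from the 3 kHz design quoted above (L = 1706 μm, H = 10 μm). The sketch below assumes that the experimental constant C, the thickness H, and the material are the same for every resonator, which avoids needing numerical values for C, E, and ρ. The function name and the target frequencies are hypothetical.

```python
import math

# Reference design quoted in the text: 3 kHz resonator, L = 1706 um, H = 10 um.
F_REF_HZ = 3000.0
L_REF_UM = 1706.0

def length_for_frequency_um(f_hz, f_ref_hz=F_REF_HZ, l_ref_um=L_REF_UM):
    """Scale the reference length using f ~ 1/L^2 from equation (4).

    Assumes C, H, E and rho are identical to the reference resonator.
    """
    return l_ref_um * math.sqrt(f_ref_hz / f_hz)

# Hypothetical target frequencies, chosen only for the example.
for f_hz in (750.0, 3000.0, 12000.0):
    print(f"{f_hz:8.1f} Hz -> L = {length_for_frequency_um(f_hz):7.1f} um")
```

Under these assumptions, quadrupling the target frequency halves the required length.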

The sensor body 2 configured as described above is fabricated on the semiconductor silicon substrate 1 using semiconductor integrated circuit manufacturing technology or micromachining technology. In this configuration, when a sound wave reaches the wave receiving portion 24, the plate-shaped wave receiving portion 24 vibrates. The vibration representing the sound wave propagates through the propagation portion 23 to the holding portion 22 and travels from left to right in FIG. 1, resonating in turn each rod-shaped resonator 25 of the resonance portion 21 held by the holding portion 22 at its specific frequency.

An appropriate bias voltage V_bias is applied to the sensor body 2, and a capacitor is formed by the tip of each resonator 25 of the resonance portion 21 and the electrode 3 formed on the semiconductor silicon substrate 1 at the position facing that tip. The tip of the resonator 25 is a movable electrode whose position moves up and down as the resonator 25 vibrates, whereas the electrode 3 formed on the semiconductor silicon substrate 1 is a fixed electrode whose position does not move. When a resonator 25 vibrates at its specific frequency, the distance between the two electrodes varies, so the capacitance of the capacitor changes.

A detection circuit 4 is connected to each electrode 3. It converts this capacitance change into a voltage signal, integrates the converted voltage signal over a predetermined time, and outputs the result. FIG. 2 shows the configuration of the detection circuit 4. The detection circuit 4 comprises operational amplifiers 41 and 42 that amplify at an amplification ratio determined by the impedance ratio between the capacitance C_s of the capacitor and a reference capacitance C_f, an integrating circuit 43 that integrates, for a predetermined time, the output signal of the operational amplifier 42 that is higher than a reference voltage V_ref, and a sample-and-hold circuit 44 that takes the output signal from the integrating circuit 43, holds it temporarily, and outputs it. The detection circuit 4 with this configuration is formed, for example, by a silicon CMOS process.

Clock pulses φ₀, φ₁, and φ₂ are supplied to the operational amplifier 41, the integrating circuit 43, and the sample-and-hold circuit 44, respectively, and each of these operates in synchronization with its clock pulse. These clock pulses may be supplied externally, or a counter circuit may be formed on the same semiconductor silicon substrate to supply them.

Next, the operation is described. When a sound wave that has propagated through the air reaches the wave receiving portion 24 of the sensor body 2, the plate-shaped wave receiving portion 24 vibrates and the vibration propagates through the sensor body 2. The sound wave travels from left to right in FIG. 1, resonating in turn each cantilever resonator 25, whose lengths increase successively. Each resonator 25 has its own resonance frequency. It resonates when a sound wave of that frequency propagates to it, and its tip vibrates up and down. This vibration changes the capacitance of the capacitor formed between the tip and the electrode 3. As the sound wave propagates, its energy is successively converted into vibration energy of the resonators 25, so the energy of the sound wave is gradually attenuated by these resonances. By the time the sound wave reaches the longest resonator 25 (right end in FIG. 1), almost none of its energy remains and no reflected wave arises. There is therefore no risk of a reflected wave affecting the capacitance change, and an accurate capacitance change matching the spectrum of the propagated sound wave can be detected.

The resulting capacitance change is passed to the detection circuit 4. FIG. 3 is a timing chart for the detection circuit 4, showing the clock pulses φ₀, φ₁, and φ₂ supplied to the operational amplifier 41, the integrating circuit 43, and the sample-and-hold circuit 44, respectively. In this example, the clock pulse control is active (on) at the low level.

First, in the detection circuit 4, the amplification ratio of the operational amplifier 41 is determined by the impedance ratio between the capacitance C_s of the capacitor and the reference capacitance C_f. For example, when the value of 1/ωC_s relative to 1/ωC_f (ω = 2πf, f: frequency) is 1/2, the resulting voltage signal is doubled. However, because the operational amplifier 41 is an inverting amplifier whose + input terminal is grounded, the operational amplifier 42 in the next stage inverts the voltage phase at a gain of 1. The resulting amplified voltage signal is input to the integrating circuit 43. The integrating circuit 43 integrates the amplified voltage signal that is higher than the reference voltage V_ref within a predetermined time set by the clock pulse φ₁, and the integrated signal is input to the sample-and-hold circuit 44. The sample-and-hold circuit 44 repeatedly samples and holds the integrated signal according to the clock pulse φ₂ and outputs the integrated signal to the outside.
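The signal chain described above (capacitive gain, integration of the signal above V_ref over a clocked window, then sample-and-hold) can be mimicked numerically as follows. This is a behavioural sketch, not a circuit model. It assumes that the integrator accumulates the excess over V_ref (the text does not say whether the excess or the full value is integrated), it folds the two inverting stages into a single positive gain, and all names, capacitances, sample rate, and window length are hypothetical.

```python
import math

def amplifier_gain(c_s, c_f):
    """Magnitude of (1/(w*C_f)) / (1/(w*C_s)) = C_s / C_f; w cancels out."""
    return c_s / c_f

def integrate_above_reference(voltages, v_ref, dt):
    """Accumulate only the part of the signal above v_ref (assumed behaviour)."""
    return sum((v - v_ref) * dt for v in voltages if v > v_ref)

def detect(voltages, v_ref, dt, window):
    """Return one held value per integration window of `window` samples."""
    held = []
    for start in range(0, len(voltages) - window + 1, window):
        chunk = voltages[start:start + window]
        held.append(integrate_above_reference(chunk, v_ref, dt))
    return held

# Hypothetical values: 3 kHz vibration whose amplitude doubles after 10 ms.
sample_rate_hz = 48000
dt = 1.0 / sample_rate_hz
window = 480  # 10 ms integration period
gain = amplifier_gain(c_s=2.0e-12, c_f=1.0e-12)
raw = [(0.1 if t < window else 0.2)
       * math.sin(2 * math.pi * 3000.0 * t / sample_rate_hz)
       for t in range(2 * window)]
amplified = [gain * v for v in raw]  # polarity as restored by the second stage
held = detect(amplified, v_ref=0.0, dt=dt, window=window)
print("gain:", gain, "held outputs:", held)
```

The second held value is about twice the first, reflecting the doubled vibration amplitude in the second integration period.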

The processing described above is performed in parallel in each of the detection circuits 4, one for each of the resonators 25 of different lengths. The periods of the clock pulses φ₀, φ₁ and φ₂ shown in FIG. 3 are only an example; the period of each clock pulse may of course be set arbitrarily.

As described above, in the present invention, examining the output signal of the detection circuit 4 corresponding to the resonator 25 that resonates at a specific frequency reveals how the intensity of the sound at that frequency changes over time, sampled at an arbitrary period. Likewise, examining the output signals of the detection circuits 4 corresponding to a plurality of resonators 25 reveals how the sound intensity in each of a plurality of frequency bands changes over time, sampled at an arbitrary period.

FIG. 4 shows the relationship among the detection circuits 4 corresponding to specific frequencies. For example, when n resonators are provided so as to respond selectively to n resonance frequencies f₁, f₂, f₃, f₄, …, f_n, output signals V₁, V₂, V₃, V₄, …, V_n of the detection circuits, each reflecting the resonance intensity, are obtained for the respective resonance frequencies. For example, when the acoustic sensor of the present invention is used as a voice input microphone for speech recognition, the intensity at each resonance frequency in the audible band is determined from the corresponding resonance intensity, and the speech is recognized based on the resulting analysis pattern.

When only the intensity at arbitrarily selected frequencies of the sound wave is required, it suffices to obtain only the output signals of the detection circuits corresponding to the required resonance frequencies. For example, to obtain the intensities at the frequencies f₁ and f₃ in FIG. 4, either the outputs of the other detection circuits 4-2, 4-4, …, 4-n are cut off, or those detection circuits are omitted in advance, so that the required output signals V₁ and V₃ are obtained and the unneeded output signals V₂, V₄, …, V_n are not. A suitable application of such an acoustic sensor is an abnormal sound input microphone for detecting abnormal sounds at one or more specific frequencies.
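The channel selection described above can be sketched in software terms as follows. This is an illustrative assumption only: the specification cuts off or omits detection circuits in hardware, whereas the sketch filters a dictionary of outputs, and the names `select_outputs`, `abnormal_frequencies` and the decision `threshold` are hypothetical.

```python
def select_outputs(outputs, wanted):
    """Keep only the channels of interest.

    outputs : dict mapping resonance frequency [Hz] -> detection circuit output V_i
    wanted  : collection of resonance frequencies whose outputs are required
    """
    return {f: v for f, v in outputs.items() if f in wanted}


def abnormal_frequencies(outputs, monitored, threshold):
    """Return the monitored frequencies whose output exceeds a threshold.

    The threshold is a hypothetical decision rule; the specification only
    states that the outputs at the selected frequencies are obtained.
    """
    selected = select_outputs(outputs, monitored)
    return sorted(f for f, v in selected.items() if v > threshold)


# Example: monitor only the first and third channels (f_1 and f_3).
outputs = {100: 0.1, 200: 0.9, 300: 0.8, 400: 0.05}
flagged = abnormal_frequencies(outputs, monitored={100, 300}, threshold=0.5)
```

Here the strong output at 200 Hz is ignored because that channel is not monitored, which mirrors cutting off the unneeded detection circuits.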

(Second Embodiment)
Next, a second embodiment will be described in which the resonance frequencies of the resonators are distributed linearly on the mel scale, a psychological attribute representing the pitch of a sound as expressed by a musical scale. The configuration of the acoustic sensor of the second embodiment is the same as that of the first embodiment described above, but the resonance frequencies of the resonators 25 are distributed linearly on the mel scale rather than on a mathematically linear scale. That is, when the resonance frequencies of the n resonators 25 are f₁, f₂, f₃, …, f_n, they are set not as
f₁ [Hz] = αf₂ [Hz] = … = α^(n-1) f_n [Hz]
but as
f₁ [mel] = αf₂ [mel] = … = α^(n-1) f_n [mel]
where α is a coefficient that can be set arbitrarily.

The resonance frequency of each resonator 25 is determined by equation (4) above, and the correspondence between actual vibration frequency and the mel scale is determined from equation (1) above and FIG. 7, as described earlier. An arbitrary resonance frequency on the mel scale can therefore easily be assigned to each resonator 25. In this example, the thickness H of all the resonators 25 is constant and the length L is varied to obtain resonance frequencies that are equally spaced on the mel scale.
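As a rough illustration of assigning resonance frequencies at equal intervals on the mel scale, the sketch below uses a widely used mel conversion formula, which is an assumption and may differ from equation (1) of this specification. The length ratios rely only on the general scaling of a uniform cantilever, f ∝ H/L², so that L ∝ 1/√f at constant thickness H; the exact constants of equation (4) are not reproduced. All function names are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    """Common mel approximation (assumed; not necessarily equation (1))."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_spaced_frequencies(f_low, f_high, n):
    """Return n frequencies [Hz] equally spaced on the mel scale (n >= 2)."""
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (m_high - m_low) / (n - 1)
    return [mel_to_hz(m_low + i * step) for i in range(n)]


def relative_lengths(freqs):
    """Length of each resonator relative to the first one, assuming constant
    thickness and the cantilever scaling f proportional to H / L**2."""
    return [math.sqrt(freqs[0] / f) for f in freqs]


# Example: ten resonators between 100 Hz and 8 kHz (illustrative range).
freqs = mel_spaced_frequencies(100.0, 8000.0, 10)
lengths = relative_lengths(freqs)
```

Because the spacing is uniform in mel rather than in hertz, the frequencies are packed more densely at the low end, and the higher-frequency resonators come out progressively shorter.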

The other configurations and operations are the same as in the first embodiment described above, and their description is omitted.

In the second embodiment, because the resonance frequencies of the resonators 25 are distributed on the mel scale, octaves, semitones and the like as heard by the human ear can be selectively recognized in real time, and a microphone with frequency characteristics matched to human hearing can be manufactured. Because temporal changes in pitched sounds such as octaves and semitones can be discriminated more accurately, the sensor is effective for speech recognition and abnormal sound detection. It can also form a voice input microphone with excellent discrimination of intoned speech, such as recitation and waka poetry, and of sounds with a musical scale, such as music.

(Third Embodiment)
Next, a third embodiment will be described in which the resonance frequencies of the resonators are distributed linearly on the Bark scale, a psychological attribute representing the loudness of a sound. The configuration of the acoustic sensor of the third embodiment is the same as that of the first embodiment described above, but the resonance frequencies of the resonators 25 are distributed on the Bark scale rather than on a mathematically linear scale, and the bandwidth of the resonance frequency of each resonator 25 is set to the critical bandwidth.

The resonance frequency of each resonator 25 is chosen based on the correspondence between the Bark scale and actual frequency shown in FIG. 8. The resonance frequency of each resonator 25 is determined by equation (4) above; in this example, the thickness H of all the resonators 25 is constant and the length L is varied, so that an arbitrary resonance frequency on the Bark scale is assigned to each resonator 25.
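To make the idea of Bark-scale resonance frequencies with critical bandwidths concrete, the sketch below tabulates the commonly published critical-band edges and center frequencies. These values are an assumption standing in for FIG. 8, which is not reproduced here, and the names are hypothetical. The quality factor Q = center frequency / bandwidth indicates how sharp each resonator would need to be; the mapping from bandwidth to the distance between adjacent resonators (FIG. 5) is outside the scope of the sketch.

```python
# Commonly published critical-band edges [Hz] for Bark bands 1..24
# (assumed to correspond to FIG. 8; not taken from this specification).
BARK_EDGES_HZ = [
    20, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480, 1720,
    2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700, 9500, 12000, 15500,
]

# Commonly published center frequencies [Hz] of the same 24 bands.
BARK_CENTERS_HZ = [
    50, 150, 250, 350, 450, 570, 700, 840, 1000, 1170, 1370, 1600, 1850,
    2150, 2500, 2900, 3400, 4000, 4800, 5800, 7000, 8500, 10500, 13500,
]


def bark_bands():
    """Return (bark_number, center_hz, bandwidth_hz, q_factor) for each band."""
    bands = []
    edges = zip(BARK_EDGES_HZ, BARK_EDGES_HZ[1:])
    for z, ((lo, hi), center) in enumerate(zip(edges, BARK_CENTERS_HZ), start=1):
        bandwidth = hi - lo          # critical bandwidth of band z
        q = center / bandwidth       # sharpness a resonator would need
        bands.append((z, center, bandwidth, q))
    return bands


for z, center, bandwidth, q in bark_bands():
    print(f"Bark {z:2d}: center {center:5d} Hz, bandwidth {bandwidth:4d} Hz, Q {q:.2f}")
```

In this table the bandwidth stays near 100 Hz for the lowest bands and then grows roughly in proportion to frequency, so the required Q rises through the low bands and levels off in the mid and high bands.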

In the third embodiment, because the resonance frequencies of the resonators 25 are distributed on the Bark scale, the sensor can be given frequency characteristics and bandwidths matched to human hearing. This makes it easier to pick out acoustic signals hidden in noise and improves the recognition rate of speech recognition in noisy conditions. A sensor closer to human hearing can also be provided.

FIG. 1 is a diagram showing an embodiment of the acoustic sensor of the present invention.
FIG. 2 is a diagram showing the configuration of the detection circuit in the acoustic sensor of the present invention.
FIG. 3 is a timing chart of the detection circuit in the acoustic sensor of the present invention.
FIG. 4 is a diagram showing the relationship among the detection circuits corresponding to specific frequencies.
FIG. 5 is a graph showing the relationship between the distance between resonators and the bandwidth.
FIG. 6 is a diagram showing the relationship among the length, thickness, width and spacing of the resonators in the acoustic sensor of the present invention.
FIG. 7 is a graph showing the relationship between actual frequency and mel scale value.
FIG. 8 is a table showing the numerical relationship between critical bandwidth and the Bark scale.

Explanation of symbols

1 Semiconductor silicon substrate
2 Sensor body
3 Electrode
4 Detection circuit
21 Resonance portion
22 Holding portion
23 Propagation portion
24 Wave receiving portion
25 Resonator
41, 42 Operational amplifier
43 Integrating circuit
44 Sample-and-hold circuit

Claims

An acoustic sensor comprising: a receiving portion that receives a sound wave propagating in a medium; a resonance portion having a plurality of rod-like resonators, each having a length that resonates at a different specific frequency; a holding portion that holds the resonance portion; and a vibration intensity detecting portion that detects the vibration intensity of each of the resonators for each specific frequency, wherein the bandwidth of the resonance frequency of each resonator is set to a predetermined value by varying the distance between two adjacent resonators.

The acoustic sensor according to claim 1, wherein resonance frequencies of the plurality of resonators are set so as to be distributed on a mel scale.

The acoustic sensor according to claim 1, wherein resonance frequencies of the plurality of resonators are set so as to be distributed on a Bark scale.

The acoustic sensor according to claim 1, wherein resonance frequencies of the plurality of resonators are set so as to be distributed on a Bark scale, and a bandwidth corresponding to each resonance frequency is a critical bandwidth.

An acoustic sensor comprising: a receiving portion that receives a sound wave propagating in a medium; a resonance portion having a plurality of rod-like resonators, each having a length that resonates at a different specific frequency; a holding portion that holds the resonance portion; and a vibration intensity detecting portion that detects the vibration intensity of each of the resonators for each specific frequency, wherein the distances between adjacent resonators differ, the resonance frequencies of the plurality of resonators are set so as to be distributed on a Bark scale, and the bandwidth corresponding to each resonance frequency is a critical bandwidth.

The acoustic sensor according to claim 2, wherein the acoustic sensor is a music input microphone for recognizing music.

The acoustic sensor according to claim 1, wherein the acoustic sensor is a voice input microphone for recognizing voice.

The acoustic sensor according to claim 1, wherein the acoustic sensor is configured on a semiconductor substrate.