JPH1070790A

JPH1070790A - Speaking speed detecting method, speaking speed converting means, and hearing aid with speaking speed converting function

Info

Publication number: JPH1070790A
Application number: JP9126625A
Authority: JP
Inventors: Katsufumi Kondo; 克文近藤; Koji Tanitaka; 幸司谷高
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 1996-05-22
Filing date: 1997-05-16
Publication date: 1998-03-10
Anticipated expiration: 2017-05-16
Also published as: JP3961616B2

Abstract

PROBLEM TO BE SOLVED: To provide a hearing aid with a speaking speed converting function that can convert a speech signal to a suitable speaking speed and output it, whatever the speaking speed of the input speech signal. SOLUTION: When a speech signal is input, the length Lv of the vowel of the first syllable is measured. For example, when 'ohayou' ('good morning') is input, 'o' arrives first, so when this 'o' is detected its length is measured. The speaking speed of the input speech signal is detected from the length of the 'o', and a conversion rate is calculated from the detected length and a target speaking speed value. The 'hayou' that follows the 'o' is then converted at this conversion rate, so that speaking speed conversion matched to the input speech signal is performed almost in real time and the speech signal can be output at the target speaking speed at all times.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a hearing aid with a speaking speed conversion function that compensates for the wearer's reduced hearing ability by stretching the speaking speed (utterance speed) of an input speech signal before outputting it.

[0002]

[Prior art] Hearing aids have long been used as assistive devices worn by people with reduced hearing, such as the elderly. Age-related hearing loss includes not only deterioration of the sound-conduction system, such as a rise in the minimum audible signal level and reduced hearing in the high-frequency range, but also deterioration of the central auditory system, such as a drop in the critical speed for speech identification (the maximum speaking speed at which speech sounds can be identified).

[0003] For this reason, hearing aids for the elderly have been proposed that, in addition to amplifying part or all of the frequency band, perform speaking speed conversion: the speech signal is stretched in time so that its output speed is lower than its input speed.

[0004]

[Problem to be solved by the invention] However, a hearing aid that simply slows down the input speech signal slows it further even when the other party is already speaking slowly. The speaking speed can then become too low, and the speech can actually become harder to understand, even for an elderly wearer.

[0005] One remedy is to vary the conversion rate according to the speaking speed of the other party, but it is nearly impossible for an elderly user to do this manually, and it is also impossible to predict a speaker's speed in advance and fix the conversion rate beforehand.

[0006] The object of the present invention is to provide: a speaking speed detection method that measures the speaking speed at the head of a speech signal; a speaking speed conversion method that uses the detected speed to convert the speaking speed of the subsequent speech signal to a target value, making real-time conversion possible; and a hearing aid with a speaking speed conversion function that can convert a speech signal of any speaking speed (utterance speed) to a suitable speaking speed and output it.

[0007]

[Means for solving the problem] The invention of claim 1 of this application is characterized by detecting the length of the first vowel in an input speech signal and detecting the speaking speed of the input speech signal on the basis of the detected vowel length.

[0008] The invention of claim 2 of this application is characterized by setting a target speaking speed in advance, receiving a speech signal and detecting its speaking speed, and converting the input speech signal from the detected speaking speed to the target speaking speed.

[0009] The invention of claim 3 of this application is characterized by detecting the length of the first vowel in an input speech signal, detecting the speaking speed of the input speech signal on the basis of the detected vowel length, and, on the basis of the detected speaking speed, converting the speaking speed of the speech signal input after the first vowel to a preset target speaking speed.

[0010] The invention of claim 4 of this application is characterized by comprising: input means for inputting an acoustic signal that includes a speech signal; speaking speed detecting means for detecting the speaking speed of the speech signal input from the input means; target speaking speed storage means for storing a target speaking speed, which is the target value of the speaking speed conversion; speaking speed setting means for setting the target speaking speed in the target speaking speed storage means; conversion ratio calculating means for calculating a conversion ratio for converting from the speaking speed detected by the speaking speed detecting means to the target speaking speed; and speaking speed converting means for converting, at the conversion ratio calculated by the conversion ratio calculating means, the speaking speed of the speech signal input after the first vowel.

[0011] The invention of claim 5 of this application is characterized by comprising: input means for inputting an acoustic signal that includes a speech signal; speech signal detecting means for monitoring the acoustic signal input from the input means and detecting the start of a speech signal; vowel length detecting means for detecting the length of the first vowel of the speech signal whose start has been detected; speaking speed detecting means for detecting, on the basis of the detected vowel length, the speaking speed of the speech signal whose start has been detected; target speaking speed storage means for storing a target speaking speed, which is the target value of the speaking speed conversion; conversion ratio calculating means for calculating a conversion ratio for converting from the speaking speed detected by the speaking speed detecting means to the target speaking speed; and speaking speed converting means for converting, at the conversion ratio calculated by the conversion ratio calculating means, the speaking speed of the speech signal input after the first vowel.

[0012] At a normal speaking speed, one mora (one syllable) lasts roughly 140 to 150 ms. Because the consonant and vowel portions overlap, the consonant portion is hard to delimit exactly, but it is known that the consonant portion occupies roughly 20 to 40 ms of this and the vowel portion 100 to 130 ms. It is also known that in ordinary conversation and announcements the speaking speed does not change greatly within about one word.

[0013] The inventions of claims 1, 3, and 5 build on these premises: they detect the length of the first vowel of the speech signal and work back from the time share given above to obtain the speaking speed (utterance speed). This allows the speaking speed of a speech signal to be detected in real time (within about 200 ms) once the signal is input.

[0014] To make speech easier for the elderly to understand, it is preferable to stretch one mora to about 200 ms (5 morae per second). The inventions of claims 2, 4, and 5 take this as the target speaking speed and convert the input speech signal so as to compensate for the difference between the speaking speed detected by the above detection method and this target. As a result, a speech signal at a speed the elderly can easily follow is output whatever speed the speaker talks at.
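The back-calculation described in [0012] to [0014] can be sketched as follows. This is only an illustration under stated assumptions, not the patent's implementation: the function names are hypothetical, the vowel share of a mora is assumed to be the midpoint of the quoted ranges (115 ms out of 145 ms), and the conversion ratio is assumed to be the target mora length divided by the estimated mora length.

```python
# Hypothetical sketch: estimate speaking speed from the first vowel's length.
# Assumed constants: midpoints of the ranges quoted in the text.
NORMAL_MORA_MS = 145.0    # one mora at normal speed (140 to 150 ms)
NORMAL_VOWEL_MS = 115.0   # vowel portion of that mora (100 to 130 ms)
TARGET_MORA_MS = 200.0    # target speed: 5 morae per second


def estimate_mora_ms(vowel_ms: float) -> float:
    """Scale the measured vowel length up to a whole-mora length."""
    return vowel_ms * NORMAL_MORA_MS / NORMAL_VOWEL_MS


def speaking_speed(vowel_ms: float) -> float:
    """Estimated speaking speed in morae per second."""
    return 1000.0 / estimate_mora_ms(vowel_ms)


def conversion_ratio(vowel_ms: float,
                     target_mora_ms: float = TARGET_MORA_MS) -> float:
    """Time-stretch factor mapping the estimated mora length to the target."""
    return target_mora_ms / estimate_mora_ms(vowel_ms)
```

Under these assumptions, a first vowel of 115 ms implies a 145 ms mora and a stretch factor of 200/145, while a speaker who is already at the target speed yields a factor of 1, so no further slowing is applied.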

[0015]

[Embodiments of the invention] FIG. 1 is a block diagram showing the configuration of a hearing aid with a speaking speed conversion function (hereinafter simply "hearing aid") according to an embodiment of the present invention. A microphone 10 picks up an audio signal and feeds it to an amplifier 11. The audio signal is an audible-frequency signal made up of speech signals (human voice in conversation or announcements), noise, and so on. The microphone 10 may be located anywhere, for example on the hearing aid body or on the earpiece. The amplifier 11 amplifies the audio signal and feeds it to a filter 12. The filter 12 is an anti-aliasing filter, namely a low-pass filter that cuts frequencies at or above 1/2 of the sampling frequency. The audio signal that has passed through the filter 12 is converted into a digital signal (waveform data) by an A/D converter 13, and this digital waveform data is input to a DSP 14. A signal processing RAM 15 and a parameter RAM 16 are connected to the DSP 14. The signal processing RAM 15 is a large-capacity DRAM that stores speech signals stretched by speaking speed conversion and speech signals that are to be output with a delay. The parameter RAM 16 stores the parameters that control the operation of the DSP 14 and is a battery-backed SRAM. A target speaking speed data storage area 16a is set up in the parameter RAM 16, which also stores parameters described later, such as the number of stretched syllables (Nvmax), the level threshold (Pth), the length threshold (Lpth), and the limit wave count (Nd). A setting device 21 is connected to the parameter RAM 16 and is used to set the target speaking speed data and the number of stretched syllables.

[0016] The DSP 14 analyzes the input waveform data to determine whether a speech signal is currently being input. If so, it applies appropriate processing such as stretching, writes the result to the signal processing RAM 15, and outputs the written signal to a D/A converter 17 in synchronization with the read clock (sampling clock). If no speech signal is being input, it outputs the input signal to the D/A converter 17 unchanged. When outputting a signal to the D/A converter 17, the DSP 14 also applies equalization that raises the gain of the high-frequency components of the signal.

[0017] The D/A converter 17 converts the digital waveform data back into an analog audio signal and feeds it to a low-pass filter 18, which removes the discontinuity noise introduced by the analog conversion. An amplifier 19 amplifies this audio signal to a level the user can hear and outputs it to a receiver 20. The receiver 20 converts the analog signal from the amplifier 19 into air vibrations and emits them into the wearer's ear canal.

[0018] The A/D converter 13, the DSP 14, and the D/A converter 17 are supplied with clocks from a clock circuit (not shown).

[0019] The function of this hearing aid is described with reference to FIGS. 2 and 3. Target speaking speed data, which gives the target value for the speaking speed of the speech signal output from the receiver 20, is stored in the target speaking speed data storage area 16a of the parameter RAM 16. This data is entered from the setting device 21; the setting may be made at the factory or by the user (wearer). Whatever the speaking speed of the input speech signal, it is converted so that the output has this target speaking speed. To perform this input-dependent conversion in real time, the first syllable of the input speech signal is not converted; instead, the length of its vowel is measured. The speaking speed of the speech that follows is estimated from this vowel length. A conversion ratio (the stretch ratio applied to vowel waveform data) is calculated from the estimated input speaking speed and the target value, and the vowels of the following syllables are stretched accordingly. As a result, however fast the speaker talks, the wearer receives the speech signal at a constant speaking speed that is easiest to understand.

[0020] However, if every input speech signal is converted and stretched, then when a long passage is input at once the delay of the output speech relative to the input becomes too large. The signal processing RAM 15 may run out of capacity, the lag in responses may grow large enough to disrupt conversation, and with television or films the picture and sound drift apart. On the other hand, a word or sentence can often be understood well enough without hearing all of it, provided that part of it, especially the beginning, is heard accurately. Noting that most words consist of a relatively small, particular number of syllables, such as three or four, this embodiment stretches the first four syllables (three, counting from the second syllable) and outputs the syllables after that unchanged or compressed, which keeps the delay between input and output speech to a minimum. Because in Japanese every syllable contains a vowel, this embodiment counts vowels in place of counting syllables. Silent intervals are also partly deleted as needed.

[0021] FIG. 2 shows the speaking speed conversion when the four-syllable word "ohayou" is input. When the first syllable "o" is input, it is output as-is without stretching, and the length of its vowel portion is measured. The stretch ratio is determined from this length. The next syllable, "ha", consists of the phonemes H (consonant) and A (vowel); the consonant H is output unchanged (that is, stored in memory, the signal processing RAM 15), and the vowel A is stretched according to the stretch ratio and stored in memory. The waveform data stored in memory is read out sequentially by a readout program and output as a speech signal. The stretching program and the readout program run in parallel. When the next syllable "yo" is input, the consonant Y is likewise stored unchanged and the vowel O is stretched and stored. The final "u" is a vowel only, so the whole of it is stretched and stored.
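The per-syllable policy described for "ohayou" can be summarized in a short sketch. The names `vowel_action` and `plan` are hypothetical, and the sketch assumes that the stretched-syllable count Nvmax is counted from the head of the utterance (so the default of 4 includes the unstretched first syllable); the text does not state how Nvmax is counted.

```python
# Hypothetical sketch of the per-syllable policy described for Figs. 2 and 3.
def vowel_action(nv: int, nvmax: int = 4) -> str:
    """Return how the vowel of the nv-th syllable (1-based) is handled."""
    if nv < 1:
        raise ValueError("syllable index is 1-based")
    if nv == 1:
        return "measure"    # output unchanged; its vowel length sets the ratio
    if nv <= nvmax:
        return "stretch"    # vowel stretched by the conversion ratio
    return "compress"       # later syllables: output as-is or compressed


def plan(syllables):
    """Pair each syllable with the action applied to its vowel."""
    return [(s, vowel_action(i)) for i, s in enumerate(syllables, start=1)]


ohayou = plan(["o", "ha", "yo", "u"])
```

Consonants are not represented here because they pass through unchanged in every case.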

[0022] This hearing aid treats a signal as speech and performs the above processing when a signal whose input level exceeds the input level threshold Pth continues for at least the duration threshold Lpth. Pulse-like noise such as that shown in the figure is therefore not processed as speech, because its duration is short. Outside the periods in which converted speech is being read out of memory, the input audio signal (background sound such as level noise and pulse noise) is output unchanged.

[0023] Although this figure shows only the stretching along the time axis, in practice the output level is amplified by several tens of dB relative to the input level. The gain is not uniform across all frequency bands: amplification is limited to audible frequencies, and the upper part of the audible range is amplified with a particularly large gain. This equalization is also performed by the DSP 14.

[0024] FIG. 3 shows the speaking speed conversion when the nine-syllable sentence "ohayou gozaimasu" is input. Even with nine syllables, the first four syllables "ohayou" are processed as described above and the stretched speech signal is output from the receiver 20. From the fifth syllable onward the signal is compressed before output. The DSP 14 counts syllables while it performs stretching; once a fifth consecutive syllable is input, any subsequent consecutive syllable (vowel) that is longer than a certain length is compressed. From the fifth syllable onward, the wave count (number of periods) Nw of each vowel is counted, and when Nw exceeds the limit count Nd, the rest of the vowel is compressed. Compression works by reading two waves (two periods of waveform data), computing their average wave, and storing only that one wave in memory (the signal processing RAM 15), which halves the duration.
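The two-wave averaging described above can be sketched as follows. This is an illustration only: `compress_vowel` is a hypothetical name, the vowel is assumed to be already split into pitch periods of equal length, and an unpaired final period is assumed to be kept unchanged, none of which the text specifies.

```python
# Hypothetical sketch: halve the part of a vowel that exceeds Nd periods
# by replacing each pair of periods with their sample-wise average.
def compress_vowel(periods, nd):
    """periods: list of equal-length sample lists, one per pitch period.

    The first nd periods are kept as they are. After that, every two
    periods are averaged into one, which halves their duration.
    """
    kept = [list(p) for p in periods[:nd]]
    rest = periods[nd:]
    for i in range(0, len(rest) - 1, 2):
        a, b = rest[i], rest[i + 1]
        kept.append([(x + y) / 2 for x, y in zip(a, b)])
    if len(rest) % 2:
        # Assumption: an unpaired final period is stored unchanged.
        kept.append(list(rest[-1]))
    return kept
```

For example, five periods with `nd=1` become three: the first period unchanged, followed by two averaged periods.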

[0025] In this embodiment the fifth and later syllables are compressed, but they may instead be output unchanged. Likewise, the compression method adopted here halves the part of a syllable (vowel) that exceeds a certain wave count, but the method is not limited to this: the whole syllable (vowel) may be compressed, or the uncompressed limit may be defined in units of time rather than in wave counts.

[0026] FIGS. 4 to 8 are flowcharts showing the operation of the DSP. FIG. 4 shows the data capture process, FIGS. 5 to 7 the speaking speed conversion process, and FIG. 8 the readout process. These three processes run in parallel. Before any of them start, an initialization is assumed to have been performed that clears the signal processing RAM 15, presets the flags, and so on.

[0027] The data capture process is described with reference to the flowchart of FIG. 4. It runs once per frame, where a frame consists of several samples of waveform data. First, waveform data D is captured from the A/D converter 13 into a real-time buffer (s1), and its level is evaluated (s3). If the level of D is higher than the input level threshold Pth, the process goes to s6 and the steps that follow; if it is Pth or lower, it goes to s4 and the steps that follow. The real-time buffer and the various flags reside inside the DSP 14.

[0028] Immediately after the hearing aid is put on, with no speech signal present, D ≤ Pth; the initial settings leave Fns set and Fs reset, so the process follows s3 → s4 → s5 and returns without doing anything. The speaking speed conversion flag Fs indicates that conversion (mainly stretching) is in progress; while it is set, the input waveform data is not output as-is. The no-signal flag Fns indicates whether the input waveform data exceeds Pth (it is set while no such signal is present). When this flag stays reset for at least a fixed time (Lpth), that is, when a signal exceeding Pth continues for at least Lpth, the input is judged to be a speech signal.

[0029] When some signal exceeding Pth is input, the process goes to s6 and checks whether the speaking speed conversion flag Fs is set. At first it is not, so the process goes to s7 and checks whether the no-signal flag Fns is set. The first time through, Fns is set, so the process goes from s7 to s8. At s8 Fns is reset, and the timer counter T is reset to 0 (s9) in order to count how long the flag stays reset (the time spent above the threshold level). When the process reaches s6 → s7 two or more times in succession, Fns has already been reset, so it goes from s7 to s10, where 1 is added to T. If T then equals the threshold Lpth (s11), the current input is taken to be a speech signal: Fs is set to begin conversion (s12) and the conversion ratio calculation process (FIG. 5) is started (s13). If T < Lpth at s11, the process simply returns. In this way, when the waveform data D input at s2 stays above the level threshold Pth for at least the fixed time Lpth, a speech signal is deemed to have been input and Fs is set. While Fs is set, the speaking speed conversion process described later (the conversion ratio calculation and the conversion itself) is executed.

[0030] When the level D of the input waveform data falls to Pth or below, the process goes from s3 to s4, which checks whether Fns is set. If the level first exceeded Pth and then dropped, Fns was reset at s8, so the process goes from s4 to s14. At s14 the no-signal flag Fns is set, and the timer counter T is reset (s15) in order to count the silent time. If Fns is already set, the process goes from s4 to s5 and checks whether the speaking speed conversion flag Fs is reset. If Fs is reset, the process returns as described above. If conversion has started and the input level has then dropped, Fs is still set, so the process goes from s5 to s16, where 1 is added to T. If T then reaches the silence-time threshold Tns (s17), speech input is judged to have ended: Fs is reset, and the conversion process running in parallel is instructed to discard the silent-portion data written to the signal processing RAM 15 after the end of the speech waveform data (s19). If T < Tns at s17 after the increment, the process simply returns.
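The flag logic of FIG. 4 as described in [0027] to [0030] can be sketched as a small state machine. This is an illustration under assumptions: the class name `VoiceGate` and its return values are hypothetical, thresholds are counted in frames, and the branch taken when the level is high while Fs is already set is not described in this passage, so the sketch assumes it simply clears Fns.

```python
# Hypothetical sketch of the data capture flags (Fns, Fs) and timer T.
class VoiceGate:
    def __init__(self, pth, lpth, tns):
        self.pth, self.lpth, self.tns = pth, lpth, tns
        self.fns = True     # no-signal flag, set by initialization
        self.fs = False     # conversion flag, reset by initialization
        self.t = 0          # timer counter T, in frames

    def step(self, level):
        """Process one frame level; return 'start', 'stop', or None."""
        if level > self.pth:                 # s3: above threshold
            if self.fs:                      # s6: already converting
                self.fns = False             # assumption, not in the text
                return None
            if self.fns:                     # s7: first frame above Pth
                self.fns = False             # s8
                self.t = 0                   # s9
                return None
            self.t += 1                      # s10
            if self.t == self.lpth:          # s11: long enough to be speech
                self.fs = True               # s12
                return "start"               # s13: start ratio calculation
            return None
        if not self.fns:                     # s4: level has just dropped
            self.fns = True                  # s14
            self.t = 0                       # s15
            return None
        if not self.fs:                      # s5: idle, nothing to do
            return None
        self.t += 1                          # s16
        if self.t >= self.tns:               # s17: silence long enough
            self.fs = False                  # speech input has ended
            return "stop"                    # s19: discard trailing silence
        return None
```

Feeding a single loud frame followed by silence never produces "start", which matches the statement in [0022] that short pulse noise is not treated as speech.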

【００３１】このように、入力された波形データがＰｔ
ｈを下回ったままＴｎｓを経過したとき、音声信号の入
力が終了したとして話速変換フラグＦｓをリセットす
る。なお、波形データのレベルＤがＰｔｈを下回ったと
き即座にＦｓをリセットしないのは、音声信号中にも短
時間の無音部（無声区間）が存在するからであり、この
As described above, when Tns has elapsed while the level of the input waveform data remains below Pth, the speech speed conversion flag Fs is reset on the assumption that input of the voice signal has ended. Fs is not reset immediately when the level D of the waveform data falls below Pth because a voice signal contains short silent parts (voiceless sections), and these voiceless sections must also be captured as part of the voice signal. Silent parts contained in a voice signal include the geminate consonant "っ" (sokuon) and the intervals between words. The speech speed conversion processing operation is described below. The flowchart of FIG. 5 shows the conversion ratio calculation processing operation, which is executed first when Fs is set and the speech speed conversion processing operation starts. When this operation starts, waveform data for the length Lpth has been accumulated in the real-time buffer, so an appropriate section of it is cut out (s21) and the fundamental frequency fz of each part is determined from the intervals between zero-cross points (s22). A vowel part is extracted on the basis of fz (s23). Because the fundamental frequency of a vowel part is lower than that of a consonant part, the two can be separated and extracted. The vowel part counter Nv is then set to 1 (s24). The counter Nv counts vowel parts instead of syllables, and in the following processing its count value is treated as the number of syllables. Thereafter, while the waveform data input to the real-time buffer is monitored, the temporal length Lv of the vowel is counted until the vowel part ends (s25, s26). When the vowel part of this syllable ends (s26), the length of the syllable is estimated from the length of this vowel part, and the utterance speed (speech speed) of the voice signal is calculated from it (s27). The speech speed conversion ratio is calculated by comparing the calculated speech speed with the target speech speed data stored in the parameter RAM 16 (s28). After that, the conversion processing operation (FIG. 6) for executing the speech speed conversion processing is started (s29).
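Steps s22, s27 and s28 can be illustrated with a minimal Python sketch. The description does not give the formulas, so everything numeric here is an assumption made for illustration: the function names, the factor `vowel_to_syllable` used to estimate syllable length from vowel length, and the definition of the ratio as detected rate divided by target rate (a time-stretch factor) are all hypothetical.

```python
def zero_cross_f0(samples, sample_rate):
    """Estimate a fundamental frequency from the spacing of upward zero crossings (cf. s22)."""
    ups = [i for i in range(1, len(samples))
           if samples[i - 1] < 0 <= samples[i]]
    if len(ups) < 2:
        return None  # not enough crossings to measure a period
    mean_period = (ups[-1] - ups[0]) / (len(ups) - 1)  # in samples
    return sample_rate / mean_period


def conversion_ratio(vowel_len_s, target_rate, vowel_to_syllable=1.5):
    """Derive a stretch factor from the first vowel's length (cf. s27, s28).

    vowel_to_syllable is a hypothetical constant; the source only says the
    syllable length is estimated from the vowel length.
    """
    syllable_len_s = vowel_len_s * vowel_to_syllable   # assumed estimate
    detected_rate = 1.0 / syllable_len_s               # syllables per second
    return detected_rate / target_rate                 # > 1 means "slow down"
```

Under these assumptions, a ratio above 1 means the input is faster than the target and the following syllables should be lengthened by that factor.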

[0032] The real-time buffer is assumed to be capable of accumulating and storing waveform data for a certain length of time, and processed data is cleared or overwritten as appropriate in each processing operation.

[0033] FIG. 6 is a flowchart showing the conversion processing operation that actually executes the speech speed conversion processing. After a voice signal is input, this operation is executed starting from the data of the second syllable. Once this processing starts, the input waveform data can no longer be read directly from the real-time buffer and output, and must instead be read from the signal processing RAM 15; the memory read flag Fm is therefore set (s30). The data stored in the real-time buffer is then read (s31), and it is determined whether this data is vowel part data (s32), consonant part data (s33), silent part data (s34), or an instruction to discard silent part data (s35). In this operation, data is read from the real-time buffer not only in s31 but also in each processing operation as needed. When data is read from the real-time buffer, the operation waits, if necessary, until data is input from the A/D converter 13.

[0034] Syllables other than those of the a-row (あ行: a, i, u, e, o) begin with a consonant, so when the data is judged to be a consonant part, the process proceeds from s33 to s36 and the consonant part data is written as is to the signal processing RAM 15 (s36). Consonants are not expanded even during speech speed conversion because they are aperiodic sounds and become unnatural when processed.

[0035] On the other hand, if the read data is vowel part data, 1 is added to the vowel part counter Nv (s40), and it is determined whether Nv has exceeded Nvmax (s41). As described with reference to FIG. 3, Nvmax = 4 in this embodiment. Therefore, when Nv reaches 5, the determination in s41 is affirmative and the process proceeds to s45. When Nv is less than or equal to Nvmax, this vowel part is expanded (s42, s43).

[0036] The expansion processing is described with reference to the flowchart of FIG. 7(A). Here, a plurality of waves of the vowel part is treated as one block; for example, three vowel waves form one block. The fundamental frequency of the vowel waveform in this block is calculated (s60). This calculation uses zero crossings and is almost the same as the operation in s22. Two adjacent waveforms in the block are then selected and cut out (s61), and their average waveform is calculated (s62). This average waveform is inserted between the two cut-out waveforms (s63), so that the block now consists of four waves. The four-wave block is written to the signal processing RAM 15 (s64). In this example, three waves are expanded to four, so the expansion rate is 133%. The blocks need not all contain the same number of waves; to obtain an expansion rate of 130%, the wave counts of the blocks may be set to a repeating pattern of 4, 3, 3.
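The block-wise expansion of s61 to s63 can be sketched in Python as follows. This is only an illustration under simplifying assumptions: each wave (one pitch period) is a list of samples, the two averaged waves are assumed to have equal length, the first two waves of each block are the pair that is averaged, and the names `expand_block` and `expand_vowel` are hypothetical.

```python
def expand_block(block):
    """Insert the average of the first two waves between them (cf. s61 to s63)."""
    a, b = block[0], block[1]
    if len(a) != len(b):
        # Assumption of this sketch; real pitch periods would need length matching.
        raise ValueError("this sketch assumes equal-length waves")
    avg = [(x + y) / 2 for x, y in zip(a, b)]
    return [a, avg, b] + list(block[2:])


def expand_vowel(waves, pattern=(3,)):
    """Split waves into blocks whose sizes cycle through `pattern`, expanding each block."""
    out, i, k = [], 0, 0
    while i < len(waves):
        size = pattern[k % len(pattern)]
        block = waves[i:i + size]
        # A trailing block with a single wave cannot be averaged and is passed through.
        out.extend(expand_block(block) if len(block) >= 2 else block)
        i += size
        k += 1
    return out
```

With `pattern=(3,)` every 3 waves become 4 (about 133%). With `pattern=(4, 3, 3)` every 10 waves become 5 + 4 + 4 = 13, which is the 130% case mentioned in the text.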

[0037] Returning to FIG. 6, if the detected vowel part (syllable) is the fifth or later, the process proceeds from s41 to s45. In s45, the wave counter Nw, which measures the length of this vowel part in number of waves, is cleared. The waveform data input to the real-time buffer is then read, and Nw is incremented each time one waveform is input (s46). Until Nw exceeds the limit wave number Nd, the waveforms are written to the RAM as is (s47 → s48); once Nw exceeds Nd, subsequent waveforms are compressed (s49) and then written to the signal processing RAM 15.

[0038] FIG. 7(B) is a flowchart showing the compression processing operation. This operation is executed after waiting for two waveforms to be input to the real-time buffer. First, these two waveforms are cut out (s65) and their average waveform is calculated (s66). The calculated average waveform is then written to the signal processing RAM 15 in place of the two waveforms (s67). As a result, the vowel part waveform after Nd is compressed to 1/2.
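The pass-through up to Nd followed by pairwise compression (s47 to s49 and s65 to s67) might look like the following Python sketch. It assumes equal-length waves represented as lists of samples; the function names and the handling of an unpaired trailing wave are hypothetical, since the description does not say what happens to a leftover wave.

```python
def compress_pair(a, b):
    """Replace two waves with their average wave (cf. s65 to s67)."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal-length waves")
    return [(x + y) / 2 for x, y in zip(a, b)]


def limit_long_vowel(waves, nd):
    """Keep the first nd waves as is, then halve the remainder pair by pair."""
    out = list(waves[:nd])            # Nw <= Nd: written unchanged (s47 -> s48)
    rest = waves[nd:]
    for i in range(0, len(rest) - 1, 2):
        out.append(compress_pair(rest[i], rest[i + 1]))   # s49
    if len(rest) % 2:
        out.append(rest[-1])          # assumption: an unpaired last wave is kept
    return out
```

For a vowel of 10 waves and `nd=4`, the output has 4 + 3 = 7 waves, so only the part of the vowel beyond the limit is shortened.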

[0039] If the waveform data read in s31 belongs to a silent part, the value of the vowel (syllable) counter Nv is checked (s55). If Nv has not exceeded Nvmax, the silent part is also expanded to match the expansion rate of the vowel parts and written to the signal processing RAM 15 (s56). If the syllable count Nv has exceeded the expansion limit Nvmax, the data is written to the signal processing RAM 15 as is, without expansion (s36). On the other hand, if the read data is an instruction to discard a silent part, the group of silent data stored at the end of the signal processing RAM 15 is discarded and erased (s57). This silent data had been stored as a voiceless section of the voice signal (a geminate consonant or the like), but it has turned out to be the silence following the end of the voice signal and is therefore unnecessary. Because input of a discard instruction means that processing of the voice signal has finished, the operation ends here and returns.
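The silent-part handling of s55 to s57 can be sketched in Python as below. The description does not say how a silent part is lengthened, so nearest-neighbour sample repetition is used here purely as a stand-in; the list of `(kind, samples)` tuples standing in for the signal processing RAM 15 and all function names are likewise hypothetical.

```python
def stretch_silence(samples, ratio):
    """Lengthen a silent run by nearest-neighbour repetition (one possible method)."""
    n = len(samples)
    new_len = round(n * ratio)
    return [samples[min(n - 1, int(i / ratio))] for i in range(new_len)]


def handle_silence(ram, samples, nv, nvmax, ratio):
    """s55: expand silence only while the syllable count is within Nvmax."""
    if nv <= nvmax:
        ram.append(("silence", stretch_silence(samples, ratio)))  # s56
    else:
        ram.append(("silence", list(samples)))                    # s36


def discard_trailing_silence(ram):
    """s57: drop the silent entries stored at the end of the RAM."""
    while ram and ram[-1][0] == "silence":
        ram.pop()
```

Tagging each stored run with its kind is one simple way to let the discard step find the trailing silence without rescanning sample values.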

[0040] FIG. 8 is a flowchart showing the read processing operation. Like the data capture processing, this operation is started when the hearing aid begins operating and runs continuously. It is also executed at every sampling timing, in the same way as the data capture processing operation.

[0041] First, it is determined whether Fm is set (s70). If Fm is not set, the process proceeds to s74, where the latest data stored in the real-time buffer is read and output to the D/A converter 17. If Fm is set, it is determined whether the signal processing RAM 15 holds data to be read (s71); if it does, the data at the position indicated by the time pointer is read and output to the D/A converter 17 (s72). The time pointer is advanced by this read, but it may also be changed by data writes performed by the conversion processing (including the expansion and compression processing). If the signal processing RAM 15 holds no data to be read, Fm is reset (s73), and the latest data is then read from the real-time buffer and output to the D/A converter 17 (s74). From then on, data is output from the real-time buffer to the D/A converter 17 until the conversion processing operation starts and Fm is set again. In the signal processing RAM 15, data that has already been read is erased as appropriate.
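The source selection performed at each sampling timing (s70 to s74) can be summarized in a short Python sketch. The class and attribute names are hypothetical, a `deque` stands in for the signal processing RAM 15 together with its time pointer, and a single `latest` value stands in for the newest sample in the real-time buffer.

```python
from collections import deque


class Reader:
    """Chooses between converted data and live data on every sample tick."""

    def __init__(self):
        self.fm = False        # memory read flag Fm
        self.ram = deque()     # stand-in for the signal processing RAM 15
        self.latest = 0        # newest sample in the real-time buffer

    def read_sample(self):
        if self.fm:                         # s70: is Fm set?
            if self.ram:                    # s71: converted data available?
                return self.ram.popleft()   # s72: read at the time pointer
            self.fm = False                 # s73: nothing left, reset Fm
        return self.latest                  # s74: pass live data through
```

Popping from the left of the queue models both the advance of the time pointer and the erasure of data that has already been read.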

[0042] In the above embodiment, the first vowel (syllable) is output without speech speed conversion, but it may instead be converted at some conversion ratio, for example the conversion ratio determined for the immediately preceding voice signal.

[0043] In the above embodiment, the utterance speed of a voice signal is detected from the length of its first vowel, but the method of detecting the utterance speed is not limited to this. For example, when a voice signal is input after a silent part, the utterance speed of the current voice signal may be estimated from the vowel length of the voice signal input immediately before that silent part, or from the distance between vowels of the voice signal input immediately before that silent part. With these methods, the immediately preceding single syllable may be used, or the vowel lengths or inter-vowel distances over the whole of the immediately preceding sentence may be obtained and the current utterance speed estimated from their average value or their curve of change.
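As a rough illustration of these alternatives, the following Python sketch derives a representative vowel length from the preceding utterance, either from its last syllable or as the average over all of its vowels. The function name and parameters are hypothetical, and the "curve of change" variant mentioned in the text is not modeled here.

```python
from statistics import mean


def estimate_vowel_length(prev_vowel_lengths, use_whole_sentence=True):
    """Pick a vowel length (seconds) from the preceding utterance.

    prev_vowel_lengths: vowel durations measured before the silent part.
    """
    if not prev_vowel_lengths:
        return None                      # nothing to base an estimate on
    if use_whole_sentence:
        return mean(prev_vowel_lengths)  # average over the whole sentence
    return prev_vowel_lengths[-1]        # immediately preceding syllable only
```

The returned length could then be used in place of the measured first-vowel length when the conversion ratio is calculated.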

[0044] This embodiment also includes the following inventions, which are not recited in the claims.

[0045] By not converting the speech speed after a predetermined syllable, a decrease in comprehension can be prevented and the output delay can be kept to a minimum.

[0046] By increasing the gain of a signal detected as a voice signal, intelligibility can be increased.

[0047]

[Effects of the Invention]

As described above, according to the present invention, the utterance speed of a voice signal is detected from the first vowel of that voice signal, so the utterance speed can be detected with high accuracy in almost real time.

[0048] Further, according to the present invention, the utterance speed of the input voice signal is detected and converted to the target speech speed, so that whatever the utterance speed of the input voice signal, it can be converted in real time to the speed desired by the user (the target speech speed).

[Brief description of the drawings]

FIG. 1 is a block diagram of a hearing aid with a speech speed conversion function according to an embodiment of the present invention.

FIG. 2 is a diagram illustrating the speech speed conversion function of the hearing aid.

FIG. 3 is a diagram illustrating the speech speed conversion function of the hearing aid.

FIG. 4 is a flowchart showing the operation of the DSP of the hearing aid.

FIG. 5 is a flowchart showing the operation of the DSP of the hearing aid.

FIG. 6 is a flowchart showing the operation of the DSP of the hearing aid.

FIG. 7 is a flowchart showing the operation of the DSP of the hearing aid.

FIG. 8 is a flowchart showing the operation of the DSP of the hearing aid.

[Explanation of symbols]

10: microphone, 11: microphone amplifier, 12: filter, 13: A/D converter, 14: DSP, 15: voice signal RAM, 16: parameter RAM, 16a: target speech speed data storage area, 17: D/A converter, 18: low-pass filter, 19: power amplifier, 20: receiver, 21: setting device

Claims

[Claims]

1. A speech speed detection method characterized by detecting the length of the first vowel in an input voice signal and detecting the utterance speed of the input voice signal based on the detected vowel length.

2. A speech speed conversion method characterized in that a target speech speed is set in advance, a voice signal is input, the utterance speed of the voice signal is detected, and the input voice signal is converted from the detected utterance speed to the target speech speed.

3. A speech speed conversion method comprising: detecting the length of the first vowel in an input voice signal; detecting the utterance speed of the input voice signal based on the detected vowel length; and converting, based on that utterance speed, the speech speed of the voice signal input after the first vowel to a preset target speech speed.

4. A hearing aid with a speech speed conversion function, comprising: input means for inputting an acoustic signal including a voice signal; speech speed detection means for detecting the utterance speed of the voice signal input from the input means; target speech speed storage means for storing a target speech speed that is the target value of the speech speed conversion; speech speed setting means for setting the target speech speed in the target speech speed storage means; conversion ratio calculation means for calculating a conversion ratio for converting the utterance speed of the voice signal detected by the speech speed detection means to the target speech speed; and speech speed conversion means for converting the speech speed of the voice signal input after the first vowel at the conversion ratio calculated by the conversion ratio calculation means.

5. A hearing aid with a speech speed conversion function, comprising: input means for inputting an acoustic signal including a voice signal; voice signal detection means for monitoring the acoustic signal input from the input means and detecting the start of a voice signal; vowel length detection means for detecting the length of the first vowel of the voice signal whose start has been detected; speech speed detection means for detecting the utterance speed of that voice signal based on the detected vowel length; target speech speed storage means for storing a target speech speed that is the target value of the speech speed conversion; conversion ratio calculation means for calculating a conversion ratio for converting the utterance speed of the voice signal detected by the speech speed detection means to the target speech speed; and speech speed conversion means for converting the speech speed of the voice signal input after the first vowel at the conversion ratio calculated by the conversion ratio calculation means.