JPH10333695A

JPH10333695A - Voice converting device

Info

Publication number: JPH10333695A
Application number: JP9146179A
Authority: JP
Inventors: Kiyo Hara (原 紀代); Kenji Matsui (松井 謙二)
Original assignee: Technology Research Association of Medical and Welfare Apparatus
Current assignee: Technology Research Association of Medical and Welfare Apparatus
Priority date: 1997-06-04
Filing date: 1997-06-04
Publication date: 1998-12-18

Abstract

PROBLEM TO BE SOLVED: To improve the clarity and ease of listening of esophageal speech by detecting the zero crossings of the input voice, calculating the power of the voice, and performing amplitude emphasis on each section detected by the zero-crossing detection means. SOLUTION: Voice input from a voice input terminal 1 (such as a microphone) is A/D-converted by an A/D converter 2 and stored in a waveform storage section 3. A zero-crossing detection section 4 detects the zero crossings of the stored voice waveform and determines the processing section. A power calculation section 5 obtains the power (sum of squared amplitudes) of the waveform within the processing section. A power comparison section 6 compares the obtained power value with a preset threshold. When the power value is larger than the threshold, an amplitude emphasis section 7 performs amplitude emphasis processing. When the power value is smaller than the threshold, no amplitude emphasis is performed and the voice is output as is, via a D/A converter 8, to a voice output terminal 9 (such as a speaker). In short, the input voice is A/D-converted, its zero crossings are detected, and amplitude emphasis is applied section by section.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a voice conversion device that processes input voice and outputs it with improved clarity and ease of listening.

[0002]

[Prior art] Patients who have undergone pharyngectomy for laryngeal cancer lose their voice, but can speak again by training in esophageal speech, a method of phonation in which the esophagus is vibrated in place of the vocal cords. However, voice produced by esophageal speech has the following problems.
・Because the volume of exhaled air is insufficient, the voice is not loud and becomes hoarse.
・The fundamental frequency is low and unstable.
・There is much noise, such as stoma noise.

[0003] Analog voice amplifiers and similar devices are sold to address the loudness problem, but because they amplify the noise along with the voice, they cannot be called sufficiently useful.

[0004]

[Problem to be solved by the invention] An object of the present invention is to alleviate the problems of esophageal speech described in the prior art section above.

[0005]

[Means for solving the problem] To solve the above problems, the present invention is a voice conversion device comprising: voice input means for inputting voice; means for A/D-converting the input voice; voice storage means for storing the input voice; means for detecting zero crossings of the input voice; means for calculating the power of the voice; and means for performing amplitude emphasis on each section detected by the zero-crossing detection means.

[0006] The invention is also a voice conversion device comprising: voice input means for inputting voice; means for A/D-converting the input voice; voice storage means for storing the input voice; means for detecting the fundamental frequency of the input voice; means for calculating the power of the voice; and means for performing amplitude emphasis on each fundamental period detected by the fundamental frequency detection means.

[0007] The invention is also a voice conversion device comprising: voice input means for inputting voice; means for A/D-converting the input voice; voice storage means for storing the input voice; means for detecting zero crossings of the input voice; amplification means for multiplying the amplitude by a constant factor; means for calculating the power of the voice; and means for performing amplitude emphasis on each section detected by the zero-crossing detection means.

[0008] The invention is also a voice conversion device comprising: voice input means for inputting voice; means for A/D-converting the input voice; voice storage means for storing the input voice; means for detecting zero crossings of the input voice; means for determining sound/silence; and means for performing amplitude emphasis on each section detected by the zero-crossing detection means.

[0009] The invention is also a voice conversion device comprising: voice input means for inputting voice; means for A/D-converting the input voice; voice storage means for storing the input voice; means for detecting zero crossings of the input voice; means for determining voiced/unvoiced; and means for performing amplitude emphasis on each section detected by the zero-crossing detection means.

[0010] The invention is also a voice conversion device comprising: voice input means for inputting voice; means for A/D-converting the input voice; voice storage means for storing the input voice; means for detecting zero crossings of the input voice; means for calculating the power of the voice; means for performing amplitude emphasis on each section detected by the zero-crossing detection means; and means for analyzing the voice, wherein the parameters used for amplitude emphasis are determined by the analysis means.

[0011] (Operation) With the above configuration, the input voice is A/D-converted, zero crossings are detected, and amplitude emphasis processing is performed on each resulting section, thereby improving the intelligibility and ease of listening of esophageal speech.

[0012]

BEST MODE FOR CARRYING OUT THE INVENTION

(Embodiment 1) FIG. 1 is a block diagram of one embodiment of the present invention as set forth in claim 1. Voice input from a voice input terminal 1 (such as a microphone) is A/D-converted by an A/D converter 2 and stored in a waveform storage section 3. A zero-crossing detection section 4 detects the zero crossings of the stored voice waveform and determines the processing sections. A power calculation section 5 obtains the power (sum of squared amplitudes) of the waveform within each processing section. A power comparison section 6 compares the obtained power value with a preset threshold; when the power value is larger than the threshold, an amplitude emphasis section 7 performs amplitude emphasis processing. When the power value is smaller than the threshold, amplitude emphasis is not performed and the signal is output as is, via a D/A converter 8, to a voice output terminal 9 (such as a speaker).

[0013] Each process is now described in detail. FIG. 2 is a schematic diagram for explaining the present invention. It shows the input voice waveform, with amplitude on the vertical axis and time on the horizontal axis. The input voice is digitized by the A/D converter 2 and stored in the waveform storage section 3 as a sequence x(1), x(2), ..., x(i), .... The zero-crossing detection section 4 detects points at which the signs of consecutive samples x(i-1) and x(i) differ (zero crossings). After the first zero crossing t0 has been detected, the next zero crossing t1 is detected once a fixed interval has elapsed (2 ms in this embodiment, i.e., 20 samples at a sampling frequency of 10 kHz). t2 is detected in the same way.
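The boundary search described above can be sketched as follows. This is a minimal illustration, not the implementation of the zero-crossing detection section 4: the function name `find_section_boundaries`, the parameter `min_gap`, and the treatment of a sample equal to zero as positive are assumptions made for the example.

```python
def find_section_boundaries(x, min_gap=20):
    """Return sample indices of zero crossings spaced at least min_gap apart.

    x       -- sequence of digitized samples
    min_gap -- minimum number of samples between accepted crossings
               (20 samples = 2 ms at 10 kHz, as in the embodiment)
    """
    boundaries = []
    for i in range(1, len(x)):
        # A zero crossing is a sign change between consecutive samples.
        # Assumption: a sample equal to zero counts as positive.
        crossed = (x[i - 1] < 0) != (x[i] < 0)
        if not crossed:
            continue
        # After a boundary is accepted, skip crossings until min_gap has elapsed.
        if boundaries and i - boundaries[-1] < min_gap:
            continue
        boundaries.append(i)
    return boundaries
```

Consecutive entries of the returned list delimit the processing sections: the first two entries play the role of t0 and t1, the next pair t1 and t2, and so on.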

[0014] The sections are denoted T0 [t0-t1] and T1 [t1-t2]. The power calculation section 5 obtains the power of the waveform in each section by the following equation.
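The equation referred to here is not reproduced in this text. Since the power calculation section 5 is described in Embodiment 1 as obtaining the sum of squared amplitudes within the processing section, Equation (1) presumably takes the following form. The summation limits are an assumption; the text does not state whether the sample at t1 belongs to T0 or to T1.

```latex
P(T_0) = \sum_{i=t_0}^{t_1 - 1} x(i)^2
```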

[0015] When the obtained P(T0) is smaller than the preset threshold Pmin, the section is judged to be noise and amplitude emphasis is not performed. When P(T0) > Pmin, the amplitude emphasis section 7 performs amplitude emphasis by the following equations.

[0016]
Xmax(T0): maximum value of the waveform in section T0
Xmax: maximum value for amplitude emphasis (preset)
r: proportion of amplitude emphasis (preset)
α(T0): amplitude emphasis coefficient for section T0
y(i): waveform after amplitude emphasis processing

α(T0) = (Xmax - Xmax(T0)) * r / Xmax(T0) + 1    Equation (2)
y(i) = α(T0) * x(i)    Equation (3)
(Note: * denotes multiplication.)

In this embodiment, A/D sampling is performed at 16 bits and 10 kHz, with Pmin = 50, Xmax = 8192, and r = 0.2.
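The power gate and Equations (2) and (3) can be combined into a short sketch. The function names are hypothetical, the section peak Xmax(T0) is taken as the largest absolute sample value, a section whose power equals Pmin is left unchanged, and samples outside the detected sections pass through untouched; the text does not settle any of these points, so they are assumptions.

```python
def section_power(x, start, end):
    """Sum of squared amplitudes over the samples x[start:end]."""
    return sum(s * s for s in x[start:end])


def emphasis_coefficient(peak, x_max=8192, r=0.2):
    """Equation (2): alpha = (Xmax - Xmax(T)) * r / Xmax(T) + 1."""
    return (x_max - peak) * r / peak + 1.0


def emphasize(x, boundaries, p_min=50, x_max=8192, r=0.2):
    """Apply Equation (3) to every section whose power exceeds p_min."""
    y = list(x)
    for start, end in zip(boundaries, boundaries[1:]):
        if section_power(x, start, end) <= p_min:
            continue  # judged to be noise: section is output unchanged
        # Assumption: the section maximum is the largest absolute value.
        peak = max(abs(s) for s in x[start:end])
        alpha = emphasis_coefficient(peak, x_max, r)
        for i in range(start, end):
            y[i] = alpha * x[i]
    return y
```

With the values of this embodiment, a section whose peak is 1000 receives a coefficient of (8192 - 1000) * 0.2 / 1000 + 1 = 2.4384, while a section whose peak already equals Xmax receives exactly 1. Setting r = 1 reduces Equation (2) to Xmax / Xmax(T0), which scales the section peak to Xmax; setting r = 0 leaves every section unchanged.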

[0017] In FIG. 2, the input voice waveform is shown by a solid line and the amplitude-emphasized waveform by a dotted line. FIG. 3(a) shows the waveform of the word "ueru" ("to plant") uttered by an esophageal speaker, and FIG. 3(b) shows the waveform after the amplitude emphasis of the present invention. As FIG. 3 shows, the "ru" portion, which was almost inaudible in the input voice, is emphasized, and clarity is greatly improved in the processed voice.

[0018] To confirm the effect of the present invention, an evaluation test was conducted using Scheffé's method of paired comparisons. (Scheffé's paired comparison is described in detail in Nikkagiren (JUSE), Sensory Test Handbook, pp. 356-384.) Five subjects unaccustomed to hearing esophageal speech were presented with the original voice and the voice amplitude-emphasized by the present invention in succession, and rated on a five-point scale which was clearer and easier to listen to. Four sentences were evaluated. Analysis of variance showed the effectiveness of the method of the present invention at a significance level of 1%.

[0019] (Embodiment 2) FIG. 4 is a block diagram of one embodiment of the present invention as set forth in claim 2. Components with the same functions as in Embodiment 1 are given the same reference numerals and their description is omitted. In place of the zero-crossing detection section of Embodiment 1, this embodiment has a fundamental frequency (pitch) detection section 10. Whereas Embodiment 1 applies amplitude emphasis to sections delimited by zero crossings, this embodiment applies it to each pitch period. Various methods of fundamental frequency extraction are already widely known.
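The text leaves the choice of pitch extraction method open. As one illustration only, the sketch below estimates the pitch period by picking the lag with the largest autocorrelation; this is a commonly used approach and is not stated to be the method of the fundamental frequency detection section 10. The function names and the lag search range are assumptions.

```python
import math


def estimate_pitch_period(frame, min_lag, max_lag):
    """Return the lag in samples with the highest autocorrelation.

    The search covers min_lag..max_lag inclusive; max_lag must be
    smaller than len(frame).
    """
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        score = sum(frame[i] * frame[i - lag] for i in range(lag, len(frame)))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def sine_frame(period, length):
    """Synthetic tone with the given period in samples (illustration only)."""
    return [math.sin(2 * math.pi * i / period) for i in range(length)]
```

For a frame whose estimated period is p samples, each run of p samples would then serve as one processing section in place of the zero-crossing sections of Embodiment 1. For example, `estimate_pitch_period(sine_frame(100, 400), 50, 200)` searches lags of 5 ms to 20 ms at 10 kHz on a synthetic 100 Hz tone.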

[0020] As with Embodiment 1, this embodiment can also improve the intelligibility and ease of listening of esophageal speech.

[0021] (Embodiment 3) FIG. 5 is a block diagram of one embodiment of the present invention as set forth in claim 3. Components with the same functions as in Embodiment 1 are given the same reference numerals and their description is omitted. In addition to the configuration of Embodiment 1, an amplification section 11 is provided between the zero-crossing detection section 4 and the power calculation section 5. The amplification section 11 amplifies the input voice waveform by the following equation.

[0022]
x(i): input waveform
y(i): output waveform
α: gain factor (> 1)

y(i) = α * x(i)    Equation (4)

In this embodiment, α = 1.5.

[0023] Because the volume of exhaled air is insufficient, esophageal speech often lacks loudness. With this method, the voice is first amplified and amplitude emphasis is then applied on top, so the intelligibility and ease of listening of esophageal speech can be greatly improved.

[0024] (Embodiment 4) FIG. 6 is a block diagram of one embodiment of the present invention as set forth in claim 4. Components with the same functions as in Embodiment 1 are given the same reference numerals and their description is omitted. In place of the power calculation section and power comparison section of Embodiment 1, a sound/silence determination section 12 is provided. In Embodiment 1, power alone decided whether amplitude emphasis was performed; however, esophageal speech contains many noises, such as breath-intake sounds and airway sounds, at high levels, so a decision based on power alone may end up emphasizing noise as well. To avoid this problem, the present invention performs amplitude emphasis according to the result of a sound/silence (noise) determination. This embodiment uses the following method for the sound/silence determination.

[0025] (1) The power of the input voice is obtained for each fixed period, and when the power is at or below threshold 1, the period is judged to be silent (as in Embodiment 1).

[0026] (2) When the power is at or above threshold 1 and at or below threshold 2, LPC cepstrum coefficients are obtained and their distance from templates obtained in advance as noise is calculated; if the period is judged to be noise, amplitude emphasis is not performed.

[0027] (3) In all other cases, amplitude emphasis is performed. Esophageal speech is often accompanied by noise, but according to the present invention, its clarity and ease of listening can be greatly improved without emphasizing the noise.
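The three-step decision of this embodiment can be sketched as follows, assuming the LPC cepstrum coefficients of the period have already been computed elsewhere. The Euclidean distance measure, the parameter `max_noise_distance`, and all function and parameter names are assumptions; the text specifies only that a distance to noise templates is calculated.

```python
import math


def cepstral_distance(c, template):
    """Euclidean distance between two cepstrum coefficient vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c, template)))


def should_emphasize(power, cepstrum, noise_templates,
                     threshold1, threshold2, max_noise_distance):
    """Return True when the period should receive amplitude emphasis."""
    # (1) Power at or below threshold 1: judged silent.
    if power <= threshold1:
        return False
    # (2) Power up to threshold 2: compare against stored noise templates.
    if power <= threshold2:
        nearest = min(
            (cepstral_distance(cepstrum, t) for t in noise_templates),
            default=math.inf,
        )
        if nearest <= max_noise_distance:
            return False  # close to a noise template: judged to be noise
    # (3) None of the above: emphasize.
    return True
```

Only periods of intermediate power reach the template comparison, so the cost of the cepstral analysis is avoided for clearly silent and clearly loud periods.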

[0028] Although the cepstrum distance is used here as the method of sound/silence (noise) determination, this does not limit the present invention in any way.

[0029] (Embodiment 5) FIG. 7 is a block diagram of one embodiment of the present invention as set forth in claim 5. Components with the same functions as in Embodiment 1 are given the same reference numerals and their description is omitted. In place of the power calculation section and power comparison section of Embodiment 1, a voiced/unvoiced determination section 13 is provided. In Embodiment 1, power decided whether amplitude emphasis was performed, but over-emphasizing unvoiced consonants can make the voice harder to listen to. To avoid this problem, the present invention makes a voiced/unvoiced determination and performs amplitude emphasis according to its result. This embodiment uses the following method for the voiced/unvoiced determination.

[0030] (1) The power of the input voice is obtained for each fixed period, and when the power is at or below threshold 1, the period is judged to be silent (as in Embodiment 1).

[0031] (2) When the power is at or above threshold 1, the first-order LPC cepstrum coefficient is obtained; when its value is at or below threshold 3, the period is judged to be unvoiced and amplitude emphasis is not performed.

[0032] (3) In all other cases, amplitude emphasis is performed. In this way, the present invention avoids emphasizing consonant portions unnecessarily and can greatly improve the clarity and ease of listening of esophageal speech.
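A minimal sketch of this decision is given below. It computes LPC predictor coefficients by the autocorrelation method with the Levinson-Durbin recursion and uses the fact that, for a predictor written as x(n) ≈ Σ a(k) x(n-k), the first LPC cepstrum coefficient equals a(1). The LPC order, the function names, and the use of the zero-lag autocorrelation as the period power are assumptions not stated in the text.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation values r[0..max_lag] of the sample sequence x."""
    return [sum(x[i] * x[i - k] for i in range(k, len(x)))
            for k in range(max_lag + 1)]


def lpc_coefficients(r, order):
    """Levinson-Durbin recursion; returns predictor coefficients a[1..order]."""
    a = [0.0] * (order + 1)
    err = r[0]  # prediction error energy; positive for a non-zero signal
    for m in range(1, order + 1):
        acc = r[m] - sum(a[k] * r[m - k] for k in range(1, m))
        k_m = acc / err  # reflection coefficient of stage m
        new_a = a[:]
        new_a[m] = k_m
        for k in range(1, m):
            new_a[k] = a[k] - k_m * a[m - k]
        a = new_a
        err *= 1.0 - k_m * k_m
    return a[1:]


def should_emphasize_voiced(x, threshold1, threshold3, order=2):
    """Steps (1) to (3): True when the period is judged to be voiced."""
    r = autocorrelation(x, order)
    if r[0] <= threshold1:   # (1) r[0] is the sum of squares: judged silent
        return False
    c1 = lpc_coefficients(r, order)[0]  # first cepstrum coefficient = a(1)
    if c1 <= threshold3:     # (2) judged unvoiced
        return False
    return True              # (3) emphasize
```

A period dominated by low frequencies, as voiced sounds are, has strongly correlated neighboring samples and hence a large positive first coefficient, whereas a noise-like unvoiced consonant yields a small or negative one.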

[0033] Although the first-order cepstrum coefficient is used here as the method of voiced/unvoiced determination, this does not limit the present invention in any way.

[0034] (Embodiment 6) FIG. 8 is a block diagram of one embodiment of the present invention as set forth in claim 6. Components with the same functions as in Embodiment 1 are given the same reference numerals and their description is omitted. In addition to Embodiment 1, a voice analysis section 14 and a coefficient determination section 15 are provided. The preset values described in Embodiment 1, namely the maximum value for amplitude emphasis Xmax and the proportion of amplitude emphasis r, are more effective when determined individually for each user. Even for the same user, changing them according to factors such as the condition of the voice is more effective still. In this embodiment, during voice analysis the power is obtained to determine the value of Xmax, and at the same time amplitude-emphasized voice is generated for each of r = 0.1, 0.2, and 0.3 through 1.0, and the optimum value of r is determined after checking the results.
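To give a feel for what the candidate values of r do, the sketch below tabulates the Equation (2) coefficient for one section peak across the candidates. The function name `sweep_r` and the step of 0.1 between 0.3 and 1.0 are assumptions; the text does not describe how the candidates are compared beyond checking the resulting voice.

```python
def sweep_r(peak, x_max, r_values=None):
    """Map each candidate r to the Equation (2) coefficient for a section.

    peak  -- maximum of the waveform in the section, Xmax(T0)
    x_max -- maximum value for amplitude emphasis, Xmax
    """
    if r_values is None:
        # Assumption: candidates are 0.1, 0.2, ..., 1.0 in steps of 0.1.
        r_values = [k / 10 for k in range(1, 11)]
    return {r: (x_max - peak) * r / peak + 1.0 for r in r_values}
```

For a section peak of 1000 and Xmax = 8192, the coefficient grows linearly from about 1.72 at r = 0.1 to 8.192 at r = 1.0, so larger r values lift quiet sections progressively closer to Xmax.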

[0035] In this embodiment the voice analysis and coefficient setting functions are built into the voice conversion device, but the same result can be achieved by implementing these functions separately on a personal computer and setting only the resulting coefficients in the voice conversion device.

[0036]

[Effect of the invention] As described above, the present invention can greatly improve the intelligibility and ease of listening of esophageal speech.

[Brief description of the drawings]

[FIG. 1] Block diagram of the voice conversion device of Embodiment 1 of the present invention

[FIG. 2] Diagram for explaining the algorithm of Embodiment 1 of the present invention

[FIG. 3] Diagram showing the input and output waveforms of Embodiment 1 of the present invention

[FIG. 4] Block diagram of the voice conversion device of Embodiment 2 of the present invention

[FIG. 5] Block diagram of the voice conversion device of Embodiment 3 of the present invention

[FIG. 6] Block diagram of the voice conversion device of Embodiment 4 of the present invention

[FIG. 7] Block diagram of the voice conversion device of Embodiment 5 of the present invention

[FIG. 8] Block diagram of the voice conversion device of Embodiment 6 of the present invention

[Explanation of symbols]

1 Voice input terminal (microphone)
2 A/D converter
3 Waveform storage section
4 Zero-crossing detection section
5 Power calculation section
6 Power comparison section
7 Amplitude emphasis section
8 D/A converter
9 Voice output terminal (speaker)
10 Fundamental frequency detection section
11 Amplitude extension section
12 Sound/silence determination section
13 Voiced/unvoiced determination section
14 Voice analysis section
15 Coefficient determination section

Claims

[Claims]

1. A voice conversion device comprising: voice input means for inputting voice; means for A/D-converting the input voice; voice storage means for storing the input voice; means for detecting zero crossings of the input voice; means for calculating the power of the voice; and means for performing amplitude emphasis on each section detected by the zero-crossing detection means.

2. A voice conversion device comprising: voice input means for inputting voice; means for A/D-converting the input voice; voice storage means for storing the input voice; means for detecting the fundamental frequency of the input voice; means for calculating the power of the voice; and means for performing amplitude emphasis on each fundamental period detected by the fundamental frequency detection means.

3. A voice conversion device comprising: voice input means for inputting voice; means for A/D-converting the input voice; voice storage means for storing the input voice; means for detecting zero crossings of the input voice; amplification means for multiplying the amplitude by a constant factor; means for calculating the power of the voice; and means for performing amplitude emphasis on each section detected by the zero-crossing detection means.

4. A voice conversion device comprising: voice input means for inputting voice; means for A/D-converting the input voice; voice storage means for storing the input voice; means for detecting zero crossings of the input voice; means for determining sound/silence; and means for performing amplitude emphasis on each section detected by the zero-crossing detection means.

5. A voice conversion device comprising: voice input means for inputting voice; means for A/D-converting the input voice; voice storage means for storing the input voice; means for detecting zero crossings of the input voice; means for determining voiced/unvoiced; and means for performing amplitude emphasis on each section detected by the zero-crossing detection means.

6. A voice conversion device comprising: voice input means for inputting voice; means for A/D-converting the input voice; voice storage means for storing the input voice; means for detecting zero crossings of the input voice; means for calculating the power of the voice; means for performing amplitude emphasis on each section detected by the zero-crossing detection means; and means for analyzing the voice, wherein the parameters used for amplitude emphasis are determined by the analysis means.