JP2001154684A

JP2001154684A - Speech speed converter

Info

Publication number: JP2001154684A
Application number: JP33315599A
Authority: JP
Inventors: Kotaro Machidera; 侯大郎待寺; Chikako Ohara; 千賀子大原
Original assignee: Anritsu Corp
Current assignee: Anritsu Corp
Priority date: 1999-11-24
Filing date: 1999-11-24
Publication date: 2001-06-08

Abstract

PROBLEM TO BE SOLVED: To provide a speech speed converter that gradually changes the speech speed from the beginning of an utterance to a target speech speed. SOLUTION: The converter analyzes input digital voice signals, synthesizes the analyzed signals to generate new voice signals having a specified speech speed, converts the new voice signals into analog voice signals, and outputs them externally. The converter is provided with a sound detecting means 13, which detects the first sound in the input voice signals, and a speech speed computing section 14, which changes the speech speed from the normal speech speed to a faster target speech speed within a prescribed time interval starting at the sound detection time of the means 13, successively computes a speech speed that thereafter maintains the target speech speed, and transmits the computed speech speed to a signal synthesis section.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a speech speed conversion device that changes the speech speed of spoken audio.

[0002]

2. Description of the Related Art

One effective way to learn a foreign language is to learn by actually listening to it. In this case, it is important to listen to the same conversation repeatedly. A simple way to improve listening ability is therefore to study by playing back a tape on which several minutes to a little over ten minutes of conversation or reading has been recorded.

[0003] Tape recorders dedicated to language learning are configured so that the audio playback speed can be varied within a certain range. A beginner can therefore slow the playback and listen to the recorded conversation slowly, while an advanced learner can speed up the playback and listen to it quickly, which improves learning efficiency.

[0004] However, simply changing the playback speed also changes the frequency of the reproduced audio, so the speech sounds unnatural.

[0005] To eliminate this inconvenience, a speech speed conversion method has been proposed in which the frequency of the voice does not change even when the speaking rate, that is, the speech speed, is changed. The speech merely becomes slower or faster and still sounds natural. This speech speed conversion method is described next with reference to FIGS. 13 and 14.

[0006] FIG. 13 is a waveform diagram of an audio signal 1 for an utterance such as "It's difficult for me to finish...". FIG. 14 is an enlarged view of this audio signal 1. As is well known, speech consists of consonants and vowels, and the audio signal 1 contains corresponding consonants and vowels. As shown, a consonant consists of a single unvoiced sound 2, and a vowel consists of a plurality of voiced sounds 3. The audio signal 1 also contains silence 4 where the speech is interrupted.

[0007] The unvoiced sound 2 that makes up a consonant has relatively high frequency components, and the plurality of voiced sounds 3 that make up a vowel have nearly identical waveforms. To increase the speech speed, therefore, one or more of the voiced sounds 3 that make up a vowel are thinned out, and the sounds on either side of each removed voiced sound 3 are joined: voiced sound 3 to voiced sound 3, unvoiced sound 2 to voiced sound 3, or voiced sound 3 to silence 4. This shortens the duration of the vowel and, as a result, the total duration of the audio signal 1, so the speech speed can be increased without changing the frequency or quality of the voice.

[0008] Conversely, to reduce the speech speed, copies of the same voiced sound 3 are inserted among the plurality of voiced sounds 3 that make up a vowel, lengthening the duration of the vowel.
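The thinning and insertion described in [0007] and [0008] can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `resample_pitch_periods`, the representation of a vowel as a list of pitch-period waveforms, and the even spacing of the kept periods are all hypothetical choices, not details given in the patent.

```python
def resample_pitch_periods(periods, rate):
    """Thin out (rate > 1) or repeat (rate < 1) the pitch periods of one vowel.

    periods: list of pitch-period waveforms (hypothetical representation).
    rate:    speech speed magnification; 1.0 leaves the vowel unchanged.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if not periods:
        return []
    # Number of periods to emit; at least one so the vowel never vanishes.
    keep = max(1, round(len(periods) / rate))
    # Pick evenly spaced source periods. When keep exceeds len(periods),
    # indices repeat, which inserts copies of the same voiced sound.
    return [periods[int(i * len(periods) / keep)] for i in range(keep)]


# Labels stand in for waveforms so the effect is easy to see.
faster = resample_pitch_periods(["p0", "p1", "p2", "p3", "p4", "p5"], 2.0)
slower = resample_pitch_periods(["p0", "p1", "p2"], 0.5)
```

Because whole pitch periods are dropped or repeated, the waveform of each remaining period is untouched, which is why the pitch of the voice is preserved while the vowel duration changes.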

[0009] To perform this speech speed conversion automatically, the unvoiced sounds 2, voiced sounds 3, and silence 4 contained in the audio signal 1 must be distinguished from one another. As a method of doing so, the property that a vowel consists of a succession of voiced sounds 3 is exploited: by computing the autocorrelation function of the audio signal 1, the unvoiced sounds 2 are separated from the voiced sounds 3, and the duration (pitch) of each voiced sound 3 is detected.
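As a rough illustration of the autocorrelation-based classification in [0009], the following Python sketch labels one frame of samples as silence, voiced, or unvoiced and, for voiced frames, returns the pitch period in samples. The function names, the lag range, and both thresholds are assumptions chosen for the example; the patent does not specify them.

```python
import math


def autocorr_peak(frame, min_lag, max_lag):
    """Return (lag, value) of the largest normalized autocorrelation peak."""
    energy = sum(x * x for x in frame)
    if energy == 0:
        return 0, 0.0
    best_lag, best = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        r = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        r /= energy
        if r > best:
            best_lag, best = lag, r
    return best_lag, best


def classify_frame(frame, min_lag=20, max_lag=200,
                   silence_energy=1e-4, voiced_threshold=0.5):
    """Label a frame 'silence', 'voiced', or 'unvoiced' (thresholds assumed)."""
    mean_energy = sum(x * x for x in frame) / len(frame)
    if mean_energy < silence_energy:
        return "silence", None
    lag, peak = autocorr_peak(frame, min_lag, max_lag)
    if peak >= voiced_threshold:
        return "voiced", lag      # lag is the pitch period in samples
    return "unvoiced", None


# A pure tone with a 50-sample period behaves like a strongly voiced frame.
tone = [math.sin(2 * math.pi * n / 50) for n in range(400)]
label, pitch = classify_frame(tone)
```

A periodic (voiced) frame correlates strongly with a copy of itself shifted by one pitch period, while noise-like unvoiced sound does not, so a single threshold on the normalized peak is enough for this sketch.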

[0010] The speech speed of the audio signal 1 is then determined by how many of the voiced sounds 3 that make up each vowel are thinned out, or by how many voiced sounds 3 are inserted.

[0011] By using an audio playback device that incorporates this speech speed conversion function, a language learner can therefore listen to conversation or narration at a faster or slower speed without it sounding unnatural.

[0012]

Problems to Be Solved by the Invention

However, even an audio playback device incorporating the speech speed conversion function described above still has the following problem that needs improvement.

[0013] In an audio playback device incorporating the speech speed conversion function described above, the speech speed is the same from the start of playback to the end. Consequently, when playback is fast from the very first moment, even an advanced learner needs time to get used to the fast speech and cannot accurately catch the beginning of the conversation or narration.

[0014] The present invention has been made in view of these circumstances. Its object is to provide a speech speed conversion device that is easy for language learners to use: speech starts at the normal speed and then gradually becomes faster, so that even when a high speech speed is set, the beginning of the conversation or narration can still be heard accurately.

[0015]

Means for Solving the Problems

To solve the above problem, the present invention is applied to a speech speed conversion device comprising: an audio signal memory that stores an input digital audio signal; a speech analysis unit that analyzes the digital audio signal output from the audio signal memory; and a signal synthesis unit that synthesizes the digital audio signal analyzed by the speech analysis unit and outputs it as a new audio signal having a specified speech speed.

[0016] To solve the above problem, the present invention further comprises: sound detection means that detects the first sound in the input audio signal; and a speech speed calculation unit that, during the period from the sound detection time of the sound detection means until a specified time has elapsed, changes the speech speed over time up to a target speech speed faster than the normal speech speed, that maintains the target speech speed after the specified time has elapsed, and that successively calculates the resulting speech speed and sends it to the signal synthesis unit.

[0017] In the speech speed conversion device configured in this way, the input audio signal is first stored in the audio signal memory and then read out and synthesized by the signal synthesis unit into a new audio signal having the specified speech speed. The speech speed applied to the signal synthesis unit is controlled so that, during the period from the detection time of the first sound in the audio signal until the specified time has elapsed, it increases over time up to a target speech speed faster than the normal speech speed, and thereafter maintains the target speech speed.

[0018] In a playback device incorporating this speech speed conversion device, therefore, only the beginning of a conversation or narration is played at the normal speech speed, after which playback becomes fast speech at the target speech speed, which is faster than the normal speech speed. The language learner thus does not miss the beginning of the conversation or narration.

[0019] Another invention is applied to a speech speed conversion device comprising: an audio signal memory that stores an input digital audio signal; a speech analysis unit that divides the digital audio signal output from the audio signal memory into unvoiced sounds, voiced sounds, and silence; and a signal synthesis unit that thins out only the voiced sounds identified by the speech analysis unit in accordance with a specified speech speed, and joins the unvoiced sounds, the voiced sounds remaining after thinning, and the silence to synthesize a new audio signal.

[0020] To solve the above problem, this invention comprises a speech speed calculation unit that, during the period from the start time of the input audio signal until a specified time has elapsed, changes the speech speed over time up to a target speech speed faster than the normal speech speed, that maintains the target speech speed after the specified time has elapsed, and that successively calculates the resulting speech speed and sends it to the signal synthesis unit.

[0021] In the speech speed conversion device configured in this way, the input audio signal is first stored in the audio signal memory and then divided by the speech analysis unit into the unvoiced sounds, voiced sounds, and silence described above. The signal synthesis unit that follows then thins out a number of voiced sounds corresponding to the specified speech speed, which is faster than the normal speech speed, producing a new audio signal at the specified speech speed.

[0022] The speech speed supplied to the signal synthesis unit is controlled by the speech speed calculation unit so that, during the period from the start time of the input audio signal until the specified time has elapsed, it increases over time up to a target speech speed faster than the normal speech speed, and thereafter maintains the target speech speed.

[0023] This invention can therefore achieve substantially the same effect as the invention described above.

[0024] In another invention, in the speech speed conversion device described above, the start time of the input audio signal is taken to be the start time of the first voiced sound identified by the speech analysis unit in that audio signal.

[0025] As described above, an audio signal contains unvoiced sounds corresponding to consonants and voiced sounds corresponding to vowels. Unvoiced sounds such as "shh" or "chee" carry little meaning on their own, and in many cases it is the voiced sounds that carry the meaning. It is therefore desirable to start changing the speech speed from the start time of a voiced sound.

[0026] In yet another invention, the speech speed magnification calculation unit of the speech speed conversion device described above comprises: silence duration timing means that, for each silence identified by the speech analysis unit in the audio signal, times the duration of that silence from its start time; and stepwise speech speed calculation means that, each time the silence duration timing means counts up to a threshold time, calculates a speech speed that increases according to the number of times the threshold has been reached.

[0027] In the speech speed conversion device configured in this way, the speech speed increases each time a silent period at least as long as the threshold occurs in the course of a conversation or narration. This prevents the speech speed from changing in the middle of a word or sentence, so the speech sounds more natural.

[0028] Yet another invention, in the speech speed conversion device described above, comprises: silence duration timing means that, for each silence identified by the speech analysis unit in the audio signal, times the duration of that silence from its start time; and a speech speed calculation unit that, each time the silence duration timing means counts up to a threshold time, changes the speech speed over time, starting from the start time of the unvoiced or voiced sound that follows that silence, up to a target speech speed faster than the normal speech speed, thereafter maintains the target speech speed, and successively calculates the resulting speech speed and sends it to the signal synthesis unit.

[0029] In the speech speed conversion device configured in this way, each time a silent period at least as long as the threshold occurs in the course of a conversation or narration, the speech speed changes from the normal speech speed to a target speech speed faster than the normal speech speed.

[0030]

Embodiments of the Invention

Embodiments of the present invention are described below with reference to the drawings.

(First Embodiment) FIG. 1 is a block diagram showing the schematic configuration of a speech speed conversion device according to a first embodiment of the present invention. A series of audio signals a, having the same structure as the audio signal 1 shown in FIG. 13, is input to the input terminal 5. Accordingly, as shown in FIG. 14, this audio signal a consists of unvoiced sounds 2 corresponding to consonants, voiced sounds 3 corresponding to vowels, and silence 4. In the speech speed conversion device of this embodiment, the duration of an unvoiced sound 2 or a voiced sound 3 is defined as a sound period 16, and the duration of silence 4 is defined as a silent period 17.

[0031] The analog audio signal a input from the input terminal 5 is converted into a digital audio signal by the A/D converter 6 and then stored in the audio signal memory 7. The speech analysis unit 8 divides the series of digital audio signals a₁ written to the audio signal memory 7 into unvoiced sounds 2, voiced sounds 3, and silence 4. Specifically, it first examines the signal level of the audio signal a₁ to separate sound periods 16 from silent periods 17. It then performs autocorrelation analysis on the signal in each sound period 16 to divide that sound period 16 into unvoiced sounds 2 and voiced sounds 3.
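The first stage of this analysis, separating sound periods 16 from silent periods 17 by signal level, might look like the following Python sketch. It is a simplified assumption-laden example: `split_periods` and `level_threshold` are hypothetical names, and the per-sample comparison stands in for whatever smoothed level measurement an actual device would use.

```python
def split_periods(samples, level_threshold):
    """Split samples into runs labelled 'sound' or 'silence'.

    Returns a list of (label, start_index, end_index) with end exclusive.
    A sample counts as sound when its magnitude reaches level_threshold.
    """
    periods = []
    for i, x in enumerate(samples):
        label = "sound" if abs(x) >= level_threshold else "silence"
        if periods and periods[-1][0] == label:
            periods[-1][2] = i + 1           # extend the current run
        else:
            periods.append([label, i, i + 1])  # start a new run
    return [tuple(p) for p in periods]


runs = split_periods([0.0, 0.0, 0.5, -0.6, 0.0, 0.7], level_threshold=0.1)
```

Each 'sound' run would then be passed to the autocorrelation analysis to separate unvoiced from voiced sounds.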

[0032] The audio signal a₂, divided by the speech analysis unit 8 into unvoiced sounds 2, voiced sounds 3, and silence 4, is input to the signal synthesis unit 9. From the plurality of voiced sounds 3 that make up each vowel in the input audio signal a₂, the signal synthesis unit 9 thins out a number of voiced sounds corresponding to the speech speed magnification Y, which is the speech speed specified by the speech speed calculation unit 14. It then joins the unvoiced sounds 2, the voiced sounds 3 remaining after thinning, and the silence 4 of the input audio signal a₂ to synthesize and output a new audio signal a₃. Here, the speech speed magnification Y is the ratio of the speech speed to the normal, unmodified speech speed, which is taken as 1 (the reference).

[0033] The new audio signal a₃ output from the signal synthesis unit 9 is temporarily stored in the output buffer 10, converted into an analog audio signal a₄ by the D/A converter 11, and output from the output terminal 12. Consequently, the new analog audio signal a₄ output from the output terminal 12 is shorter than the analog audio signal a input to the input terminal 5 by an amount corresponding to the specified speech speed magnification Y, and the reproduced conversation or narration is correspondingly faster.

[0034] The digital audio signal a₁ output from the audio signal memory 7 is input to the speech analysis unit 8 and also to the sound detection unit 13. The sound detection unit 13 examines the signal level of the input audio signal a₁ and, at the start of the first sound period 16, notifies the speech speed calculation unit 14 that sound has been detected.

[0035] The specified time T_B is set in the speech speed calculation unit 14 by the time setting unit 18, and the target speech speed magnification A, representing the target speech speed, is set by the target speech speed setting unit 19.

[0036] Next, the processing by which the speech speed calculation unit 14 calculates the speech speed magnification Y is described. As shown in FIG. 3, during the period from the sound detection time t_S of the digital audio signal a₂ until the time t_E at which the specified time T_B has elapsed, the speech speed calculation unit 14 changes the speech speed magnification Y from the normal speech speed magnification (Y = 1), corresponding to the normal speech speed, to the target speech speed magnification A, which is faster than the normal speech speed.

[0037] Specifically, the processing shown in FIG. 2 is carried out. In FIG. 2, when the sound detection unit 13 detects sound (R1), the elapsed time T is initialized (T = 0) (R2). Each time a small time interval Δt elapses (R3), the elapsed time T is updated at R4 (T = T + Δt). If the updated elapsed time T is less than the specified time T_B (R5), the speech speed magnification Y is calculated by the following equation (R6):

Y = [(A - 1) / T_B] T + 1

The calculated speech speed magnification Y is sent to the signal synthesis unit 9 (R7). Processing then returns to R3 to wait for the next interval Δt to elapse.

[0038] When, at R5, the updated elapsed time T reaches the specified time T_B, the target speech speed magnification (Y = A) is sent to the signal synthesis unit 9.
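The calculation in steps R5 and R6 and the hold in [0038] amount to a linear ramp followed by a constant. A minimal Python sketch, assuming the hypothetical names `speech_rate`, `target_rate` (for A), and `ramp_time` (for T_B), and assuming the rate is 1 before sound is detected:

```python
def speech_rate(elapsed, target_rate, ramp_time):
    """Speech speed magnification Y as a function of elapsed time T.

    elapsed:     time T since sound detection (negative means not yet detected).
    target_rate: target speech speed magnification A.
    ramp_time:   specified time T_B over which Y rises from 1 to A.
    """
    if ramp_time <= 0:
        raise ValueError("ramp_time must be positive")
    if elapsed < 0:
        return 1.0                      # normal speech speed before detection
    if elapsed < ramp_time:
        # Y = [(A - 1) / T_B] * T + 1
        return (target_rate - 1.0) / ramp_time * elapsed + 1.0
    return float(target_rate)           # hold the target after T_B


# With A = 2 and T_B = 10 s, the rate is halfway to the target after 5 s.
halfway = speech_rate(5.0, target_rate=2.0, ramp_time=10.0)
```

In the device this value would be recomputed every Δt and passed to the signal synthesis unit, which decides how many voiced sounds to thin out.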

[0039] In the speech speed conversion device configured in this way, as shown in FIG. 3, when input of the audio signal a begins at time t₀, the audio signal a₄ output from the output terminal 12 has the normal speech speed (Y = 1). Then, at time t_S, when a sound period 16 begins and sound is detected, timing of the elapsed time T starts and the speech speed magnification Y begins to increase.

[0040] At time t_E, when the elapsed time T measured from time t_S reaches the specified time T_B, the speech speed magnification Y reaches the target speech speed magnification A (Y = A). From time t_E onward, the speech speed magnification Y is held at the target speech speed magnification A.

[0041] With this speech speed conversion device, therefore, only the beginning of a conversation or narration is played at the normal speech speed, after which playback becomes fast speech at the target speech speed. The language learner thus does not miss the beginning of the conversation or narration.

[0042] (Second Embodiment) FIG. 4 is a block diagram showing the schematic configuration of a speech speed conversion device according to a second embodiment of the present invention. Parts identical to those of the speech speed conversion device of the first embodiment shown in FIG. 1 are given the same reference numerals, and detailed description of the overlapping parts is omitted.

[0043] In this speech speed conversion device, the audio signal a₂, divided by the speech analysis unit 8 into unvoiced sounds 2, voiced sounds 3, and silence 4, is input to the signal synthesis unit 9 and to the speech speed calculation unit 14a. From the plurality of voiced sounds 3 that make up each vowel in the input audio signal a₂, the signal synthesis unit 9 thins out a number of voiced sounds corresponding to the speech speed magnification Y specified by the speech speed calculation unit 14a. It then joins the unvoiced sounds 2, the voiced sounds 3 remaining after thinning, and the silence 4 of the input audio signal a₂ to synthesize and output a new audio signal a₃.

[0044] Consequently, relative to the analog audio signal a input to the input terminal 5, the new analog audio signal a₄ output from the output terminal 12 has the speech speed given by the speech speed magnification Y specified by the speech speed calculation unit 14a.

[0045] The step count M is input to the speech speed calculation unit 14a from the step count setting unit 20, and the target speech speed magnification A is input from the target speech speed setting unit 19. The step count M is the number of steps in which the speech speed magnification Y moves from the normal speech speed (Y = 1) to the target speech speed magnification A.

[0046] Next, the processing by which the speech speed calculation unit 14a calculates the speech speed magnification Y is described. As shown in FIG. 6, during the period from the input time t_S of the audio signal a₂ until the specified time T_B has elapsed, the speech speed calculation unit 14a changes the speech speed magnification Y in M steps from the normal speech speed (Y = 1) to the target speech speed magnification A, which is faster than the normal speech speed.

[0047] Specifically, the processing shown in FIG. 5 is carried out. In FIG. 5, the step count I of the speech speed magnification Y in FIG. 6 is set to its initial value 0 (I = 0), and the time lapse flag F, which indicates that the duration T₁ of a silent period 17 of the audio signal a₂ in FIG. 7 has exceeded the threshold time T_S, is set to its initial value 0 (F = 0) (S1).

[0048] When a small time interval Δt has elapsed (S2), the analysis result for the audio signal a₂ at that point is read (S3). If the analysis result indicates a silent period 17 (S4), the duration T₁ of the silent period 17 is updated at S5 (T₁ = T₁ + Δt).

[0049] If the updated duration T₁ has not yet reached the threshold time T_S (S6), processing returns to S2 and waits for the interval Δt to elapse. If the updated duration T₁ has reached the threshold time T_S (S6), the duration T₁ of the silent period 17 is cleared to 0 (S7). If the time lapse flag F is still 0 (S8), this is the first time in this silent period 17 that the duration T₁ has reached the threshold time T_S, so the time lapse flag F is set to 1 (S9). Then, at S10, the step count I is updated (I = I + 1).

[0050] After confirming that the step count I does not exceed the final step count M, at which the target speech speed magnification A (corresponding to a speech speed faster than the normal speech speed) is reached (S11), the speech speed magnification Y is calculated by the following equation (S12):

Y = [(A - 1) / M] I + 1

The calculated speech speed magnification Y is sent to the signal synthesis unit 9 (S13).

[0051] If, at S8, the time lapse flag F has already been set to 1, the calculation (update) of the speech speed magnification Y has already been completed within this silent period 17, so processing returns to S2 without doing anything.

[0052] If, at S4, the signal is in a sound period 16 rather than a silent period 17, processing proceeds to S14, where the duration T₁ of the silent period 17 is cleared to 0 and the time lapse flag F is cleared.

[0053] If, at S11, the step count I exceeds the final step count M corresponding to the target speech speed magnification A, the speech speed has already reached the target speech speed magnification A, which is faster than the normal speech speed, so processing returns to S2 without doing anything.
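The flow of steps S1 to S14 can be sketched as a small state machine. The Python below is an illustrative reading of FIG. 5 as described in the text, not the patent's implementation: the function name, the per-tick boolean input, and the choice to count the threshold in ticks of Δt (`threshold_ticks`) instead of accumulating seconds are all assumptions made to keep the example exact.

```python
def stepped_rates(is_silence, threshold_ticks, target_rate, steps):
    """Return the speech speed magnification Y after each tick of length dt.

    is_silence:      iterable of bools, True when the analysis result is silence.
    threshold_ticks: threshold time T_S expressed in ticks (assumed unit).
    target_rate:     target speech speed magnification A.
    steps:           number of steps M from Y = 1 to Y = A.
    """
    step, flag, silent_ticks, rate = 0, False, 0, 1.0   # S1
    out = []
    for silent in is_silence:                           # S2, S3
        if silent:                                      # S4
            silent_ticks += 1                           # S5
            if silent_ticks >= threshold_ticks:         # S6
                silent_ticks = 0                        # S7
                if not flag:                            # S8
                    flag = True                         # S9
                    step += 1                           # S10
                    if step <= steps:                   # S11
                        # S12: Y = [(A - 1) / M] * I + 1
                        rate = (target_rate - 1.0) / steps * step + 1.0
        else:
            silent_ticks, flag = 0, False               # S14
        out.append(rate)                                # S13 (value sent on)
    return out


ticks = [False] + [True] * 6 + [False] + [True] * 3
rates = stepped_rates(ticks, threshold_ticks=3, target_rate=2.0, steps=2)
```

In this run the first long silence raises Y to 1.5 once, even though the silence lasts twice the threshold, because the flag blocks a second step within the same silent period. The second long silence then raises Y to the target of 2.0.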

[0054] In the speech speed conversion device of the second embodiment configured in this way, suppose that the audio signal a₂ having the waveform shown in FIG. 7, divided into unvoiced sounds 2, voiced sounds 3, and silence 4, is input to the speech speed calculation unit 14a. Silent periods 17 exist in the intervals from time t₂ to t₃, from time t₄ to t₆, and from time t₆ to t₈, but only in the second interval, from time t₄ to t₆, does the duration T₁ exceed the threshold time T_S, which happens at time t₅. At time t₅, therefore, the speech speed magnification Y is updated by one step in the direction of increasing speech speed.

[0055] Therefore, as shown in FIG. 6, from the input start time of the input audio signal a, the speech speed is changed each time a silent period 17 of at least the threshold time T_S occurs, until it reaches the target speech speed magnification A. Once the speech speed reaches the target speech speed magnification A, it is held at the target speech speed magnification A thereafter.

【００５６】したがって、図１に示した第１実施形態の
話速変換装置と同様に、この第２実施形態の話速変換装
置を用いることによって、会話やナレーションの冒頭部
分のみ通常話速でその後は目標話速の早口となる。よっ
て、語学学習者が会話やナレーションの冒頭部分を聞き
逃すことはない。Therefore, as with the speech speed conversion device of the first embodiment shown in FIG. 1, using the speech speed conversion device of the second embodiment plays only the beginning of a conversation or narration at the normal speech speed and plays the rest at the faster target speech speed. Language learners therefore do not miss the beginning of a conversation or narration.

【００５７】さらに、この第２実施形態の話速変換装置
においては、会話やナレーションの途中に存在するしき
い値時間Ｔ_S以上の無音期間１７が発生する毎に、話速
倍率Ｙは、話速が増加する方向に変化される。したがっ
て、一つの単語や一つの文章の途中で話速が変化するこ
とが防止され、より自然な言葉として聴くことができ
る。Further, in the speech speed conversion device of the second embodiment, the speech speed magnification Y is changed in the direction that increases the speech speed each time a silent period 17 lasting at least the threshold time T_S occurs in the middle of a conversation or narration. This prevents the speech speed from changing in the middle of a word or a sentence, so the speech sounds more natural to the listener.

【００５８】図８は、実際の会話（ナレーション）の話
速を、通常話速（Ｙ＝１）から中間話速（１＜Ｙ＜Ａ）
から目標話速（Ｙ＝Ａ）まで変化させた場合における出
力端子１２から出力される音声信号ａ₄の波形を示す図
である。図示するように、話速倍率Ｙが高くなると、各
母音を構成する複数の有声音３が間引かれることが理解
できる。FIG. 8 shows the waveform of the audio signal a₄ output from the output terminal 12 when the speech speed of an actual conversation (narration) is changed from the normal speech speed (Y = 1) through an intermediate speech speed (1 < Y < A) to the target speech speed (Y = A). As the figure shows, when the speech speed magnification Y increases, the plural voiced sounds 3 that make up each vowel are thinned out.
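The thinning of voiced sounds can be illustrated with a short Python sketch. It is a sketch under stated assumptions only: the function name `thin_voiced`, the `(kind, payload)` unit representation, and the rule of keeping roughly 1/Y of each run of voiced units at evenly spaced positions are hypothetical and are not taken from this passage.

```python
def thin_voiced(units, y):
    """Thin only the voiced units; pass unvoiced and silent units through.

    units: list of (kind, payload) with kind in
           {"unvoiced", "voiced", "silent"} (hypothetical representation).
    y:     speech speed magnification Y (>= 1).
    """
    out = []
    run = []  # current run of consecutive voiced units (one vowel)

    def flush():
        if run:
            # Assumption: keep about len(run) / Y units, evenly spaced.
            keep = max(1, round(len(run) / y))
            out.extend(run[int(i * len(run) / keep)] for i in range(keep))
            run.clear()

    for kind, payload in units:
        if kind == "voiced":
            run.append((kind, payload))
        else:
            flush()
            out.append((kind, payload))
    flush()
    return out
```

At Y = 1 every unit is kept, so the output equals the input; at larger Y each vowel shrinks while unvoiced sounds and silences keep their original length.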

【００５９】（第３実施形態）次に本発明の第３実施形
態の話速変換装置を説明する。この第３実施形態の話速
変換装置のハード構成は図４に示す第２実施形態の話速
変換装置と同じである。そして、異なるところは、話速
算出部１４ａのソフト構成のみである。(Third Embodiment) Next, the speech speed conversion device of a third embodiment of the present invention will be described. The hardware configuration of the speech speed conversion device of the third embodiment is the same as that of the speech speed conversion device of the second embodiment shown in FIG. 4. The only difference is the software configuration of the speech speed calculation unit 14a.

【００６０】すなわち、この第３実施形態の話速変換装
置における話速算出部１４ａは、図１０に示すように、
音声信号ａ₂における無音期間１７の継続時間Ｔ₁がしき
い値時間Ｔ_Sを超える毎に、該当無音期間１７の次に来
る無声音２又は有声音３からなる有音期間１６の開始時
刻からの経過時間Ｔの増加に伴って通常話速（Ｙ＝１）
から通常話速より速い目標話速倍率Ｍまで変化し、その
後目標話速倍率Ｍを維持する話速倍率Ｙを順次算出して
信号合成部９へ送出する。That is, as shown in FIG. 10, each time the duration T₁ of a silent period 17 in the audio signal a₂ exceeds the threshold time T_S, the speech speed calculation unit 14a in the speech speed conversion device of the third embodiment sequentially calculates a speech speed magnification Y and sends it to the signal synthesizing unit 9. This magnification changes from the normal speech speed (Y = 1) to the target speech speed magnification A, which is faster than the normal speech speed, as the elapsed time T from the start time of the sound period 16 (consisting of unvoiced sound 2 or voiced sound 3) that follows that silent period 17 increases, and it is maintained at the target speech speed magnification thereafter.

【００６１】具体的には、図９に示す処理を実施する。
図９において、図１０における音声信号ａ₂の無音期間
１７の継続時間Ｔ₁がしきい値時間Ｔ_Sを超えたことを示
す時間経過フラグＦを初期値０に設定（Ｆ＝０）する
（Ｑ１）。Specifically, the processing shown in FIG. 9 is performed. In FIG. 9, the time lapse flag F, which indicates that the duration T₁ of the silent period 17 of the audio signal a₂ in FIG. 10 has exceeded the threshold time T_S, is first set to its initial value 0 (F = 0) (Q1).

【００６２】微小時間Δｔが経過すると（Ｑ２）、その
時点における音声信号ａ₂の解析結果を読取る（Ｑ
３）。そして、解析結果が無音期間１７の場合（Ｑ
４）、Ｑ５にて、無音期間１７の継続時間Ｔ₁を更新す
る（Ｔ₁＝Ｔ₁＋Δｔ）。When the minute time Δt has elapsed (Q2), the analysis result of the audio signal a₂ at that time is read (Q3). If the analysis result is the silent period 17 (Q4), the duration T₁ of the silent period 17 is updated in Q5 (T₁ = T₁ + Δt).

【００６３】更新後の継続時間Ｔ₁がしきい値時間Ｔ_Sを
経過していない場合は（Ｑ６）、Ｑ２へ戻り、微小時間
Δｔの経過を待つ。更新後の継続時間Ｔ₁がしきい値時
間Ｔ_Sを経過した場合は（Ｑ６）、この継続時間Ｔ₁を０
にクリアし（Ｑ７）、時間経過フラグＦが０のままであ
る場合（Ｑ８）、この無音期間１７で初めて、継続時間
Ｔ₁がしきい値時間Ｔ_Sを経過したので、この時間経過フ
ラグＦを１に設定する（Ｑ９）。そして、Ｑ１０にて、
話速倍率Ｙを変化させるための経過時間Ｔを０に初期化
する（Ｔ＝０）。If the updated duration T₁ has not passed the threshold time T_S (Q6), the process returns to Q2 and waits for the minute time Δt to elapse. If the updated duration T₁ has passed the threshold time T_S (Q6), the duration T₁ is cleared to 0 (Q7). If the time lapse flag F is still 0 (Q8), the duration T₁ has passed the threshold time T_S for the first time in this silent period 17, so the time lapse flag F is set to 1 (Q9). Then, in Q10, the elapsed time T used for changing the speech speed magnification Y is initialized to 0 (T = 0).

【００６４】なお、Ｑ８にて、時間経過フラグＦが既に
１に設定されていた場合は、なにもせずにＱ２へ戻る。
Ｑ４にて、無音期間１７の場合は、Ｑ１１にて、経過時
間Ｔを更新し（Ｔ＝Ｔ＋Δｔ）、時間経過フラグＦが１
に設定されていた場合は（Ｑ１２）、この時間経過フラ
グＦを０に解除し、かつ経過時間Ｔを０にクリアする
（Ｑ１３）。そして、Ｑ１４へ進む。なお、Ｑ１２で既
に時間経過フラグＦが０に解除されていた場合はそのま
まＱ１４へ進む。If the time lapse flag F has already been set to 1 in Q8, the process returns to Q2 without doing anything. If, in Q4, the analysis result is not the silent period 17 (that is, it is the sound period 16), the elapsed time T is updated in Q11 (T = T + Δt). If the time lapse flag F has been set to 1 (Q12), the flag is reset to 0 and the elapsed time T is cleared to 0 (Q13), and the process proceeds to Q14. If the time lapse flag F has already been reset to 0 in Q12, the process proceeds directly to Q14.

【００６５】Ｑ１４においては、下式を用いて、話速倍
率Ｙを算出する。In Q14, the speech speed magnification Y is calculated using the following equation.

【００６６】Ｙ＝Ａ―（Ａ―１）exp［―Ｔ／（Ｔ_B／５）］算出した話速倍率Ｙを信号合成部９へ設定する（Ｑ１
５）。Y = A − (A − 1) exp[−T / (T_B / 5)]
The calculated speech speed magnification Y is set in the signal synthesizing unit 9 (Q15).
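The flow Q1 to Q15 and the equation above can be sketched in Python as follows. This is a minimal sketch, not the patented implementation: the names `magnification`, `RestartingRate`, and `tick`, the per-interval polling interface, and the use of `None` for "nothing is sent" are assumptions. The sketch follows the text as written, so T₁ is cleared only in Q7. With a time constant of T_B / 5, the equation gives Y = 1 at T = 0 and Y = A − (A − 1)e⁻⁵ at T = T_B, which is within about 0.7 % of (A − 1) from the target A.

```python
import math
from dataclasses import dataclass


def magnification(t, a, t_b):
    """Q14: Y = A - (A - 1) * exp(-T / (T_B / 5))."""
    return a - (a - 1.0) * math.exp(-t / (t_b / 5.0))


@dataclass
class RestartingRate:
    """Hypothetical sketch of the third embodiment's flow (FIG. 9)."""
    a: float               # target speech speed magnification A
    t_b: float             # specified time T_B
    t_s: float             # threshold time T_S
    dt: float              # minute time (polling interval)
    flag: bool = False     # time lapse flag F, initially 0 (Q1)
    silence: float = 0.0   # duration T1 of the silent period
    elapsed: float = 0.0   # elapsed time T used in the equation

    def tick(self, is_silent):
        """Process one interval; return Y, or None if nothing is sent."""
        if is_silent:                        # Q4: silent period 17
            self.silence += self.dt          # Q5
            if self.silence < self.t_s:      # Q6: threshold not yet passed
                return None
            self.silence = 0.0               # Q7
            if not self.flag:                # Q8
                self.flag = True             # Q9
                self.elapsed = 0.0           # Q10
            return None
        self.elapsed += self.dt              # Q11: sound period 16
        if self.flag:                        # Q12
            self.flag = False                # Q13: restart the ramp
            self.elapsed = 0.0
        return magnification(self.elapsed, self.a, self.t_b)  # Q14, Q15
```

In this sketch, the first sound interval after a sufficiently long silence returns Y = 1, and Y then rises toward A again over the following sound intervals.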

【００６７】このように構成された第３実施形態の話速
変換装置において、図１０に示す信号波形を有した無声
音２又は有声音３からなる有音時間１６と無音４からな
る無音期間１７とに区分けされた音声信号ａ₂が話速算
出部１４ａへ入力された場合を考える。Consider the case where, in the speech speed conversion device of the third embodiment configured in this way, the audio signal a₂, which has the signal waveform shown in FIG. 10 and is divided into sound periods 16 consisting of unvoiced sound 2 or voiced sound 3 and silent periods 17 consisting of silence 4, is input to the speech speed calculation unit 14a.

【００６８】この場合、１番目の有音期間１７におい
て、話速は目標話速（Ｙ＝Ａ）まで増加するように変化
する。１番目の無音期間１７が到来しても、この無音期
間１７の継続期間Ｔ₁はしきい値時間Ｔ_Sより短いので、
話速は元に戻らない。そして、１番目の有音期間１６に
おいも話速は変化を続け、目標話速（Ｙ＝Ａ）に達する
と、この目標話速（Ｙ＝Ａ）を維持する。In this case, during the first sound period 16, the speech speed changes so as to increase toward the target speech speed (Y = A). Even when the first silent period 17 arrives, the duration T₁ of this silent period 17 is shorter than the threshold time T_S, so the speech speed does not return to the normal speech speed. The speech speed then continues to change in the sound period 16 that follows and, once it reaches the target speech speed (Y = A), is maintained there.

【００６９】２番目の無音期間１７が到来すると、この
無音期間１７の継続時間Ｔ₁はしき値時間Ｔ_Sより長いの
で、次の３番目の有音期間１６の開始時に話速は元に戻
る。そして、再度話速は目標話速（Ｙ＝Ａ）まで増加す
るように変化を開始する。When the second silent period 17 arrives, its duration T₁ is longer than the threshold time T_S, so the speech speed returns to the normal speech speed at the start of the next, third sound period 16. The speech speed then begins changing again so as to increase to the target speech speed (Y = A).

【００７０】したがって、図１１に示すように、一連の
音声信号ａの開始時刻から終了時刻までの全区間におい
て、話速は通常話速（Ｙ＝１）から目標話速（Ｙ＝Ａ）
への変化を繰り返す。Therefore, as shown in FIG. 11, over the entire section from the start time to the end time of the series of audio signals a, the speech speed repeatedly changes from the normal speech speed (Y = 1) to the target speech speed (Y = A).

【００７１】したがって、会話やナレーションの途中に
存在するしきい値時間以上の無音期間が発生する毎に、
話速が通常話速からこの通常話速よる速い目標話速まで
変化するので、上述した第１実施形態の話速変換装置と
ほぼ同様の作用効果を奏することが可能である。Therefore, each time a silent period lasting at least the threshold time occurs in the middle of a conversation or narration, the speech speed changes from the normal speech speed to the target speech speed, which is faster than the normal speech speed. Substantially the same operation and effect as those of the speech speed conversion device of the first embodiment described above can therefore be obtained.

【００７２】なお、本発明は上述した実施形態に限定さ
れるものではない。第１、第２、第３実施形態の話速変
換装置においては、図１２におけるＤ特性に示すよう
に、話速が通常話速（Ｙ＝１）から、目標話速（Ｙ＝
Ａ）への変化を開始するタイミングを音声信号ａ
（ａ₂）の有音期間１６における先頭の無声音２の開始
位置に設定した。The present invention is not limited to the above embodiments. In the speech speed conversion devices of the first, second, and third embodiments, as shown by the D characteristic in FIG. 12, the timing at which the speech speed starts changing from the normal speech speed (Y = 1) to the target speech speed (Y = A) is set to the start position of the first unvoiced sound 2 in the sound period 16 of the audio signal a (a₂).

【００７３】しかし、図１２におけるＥ特性に示すよう
に、話速が通常話速（Ｙ＝１）から目標話速（Ｙ＝Ａ）
への変化を開始するタイミングを音声信号ａ（ａ₂）の
有音期間１６の先頭の有声音３の開始位置に設定するこ
とも可能である。However, as shown by the E characteristic in FIG. 12, the timing at which the speech speed starts changing from the normal speech speed (Y = 1) to the target speech speed (Y = A) can also be set to the start position of the first voiced sound 3 in the sound period 16 of the audio signal a (a₂).

【００７４】音声の冒頭部分における「シーッ」や「チ
ーッ」等の無声音２はあまり意味をなさず、有声音３が
意味をなす場合が多い。したがって、図１２におけるＥ
特性に示すように、有声音３の開始時刻から話速の変化
を開始するのが望ましい。Unvoiced sounds 2 such as "shh" or "chee" at the beginning of speech carry little meaning, whereas voiced sounds 3 usually do. It is therefore desirable to start changing the speech speed from the start time of the voiced sound 3, as shown by the E characteristic in FIG. 12.
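The difference between the D and E characteristics comes down to which unit triggers the start of the speed change. A minimal Python sketch follows; the function name `ramp_start_index` and the list-of-labels input are hypothetical and serve only as an illustration.

```python
def ramp_start_index(kinds, characteristic="E"):
    """Return the index of the unit at which the speed change starts.

    kinds: sequence of labels "unvoiced", "voiced" or "silent"
           (hypothetical representation of the analysis result).
    characteristic: "D" starts at the first sounded unit, which may be
           an unvoiced sound; "E" starts at the first voiced sound.
    Returns None if no matching unit exists.
    """
    wanted = ("unvoiced", "voiced") if characteristic == "D" else ("voiced",)
    for i, kind in enumerate(kinds):
        if kind in wanted:
            return i
    return None
```

For a signal that begins with silence followed by a leading unvoiced sound, the D characteristic starts the ramp at that unvoiced sound, while the E characteristic waits for the first voiced sound.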

【００７５】なお、本発明は上述した各実施形態に限定
されるものではない。各実施形態の話速変換装置におい
ては、図１、図４に示すように、Ａ／Ｄ変換器６を設け
て、入力端子５から入力されたアナログの音声信号をデ
ジタルの音声信号に変換して音声信号メモリ７へ書込む
ようにした。さらに、出力バッファ１０から出力された
話速変化処理した後のデジタルの音声信号をＤ／Ａ変換
器１１を用いてアナログの音声信号に変換して出力端子
１２から出力するようにした。The present invention is not limited to the above embodiments. In the speech speed conversion device of each embodiment, as shown in FIGS. 1 and 4, an A/D converter 6 is provided to convert the analog audio signal input from the input terminal 5 into a digital audio signal, which is written into the audio signal memory 7. Further, the digital audio signal output from the output buffer 10 after the speech speed change processing is converted into an analog audio signal by the D/A converter 11 and output from the output terminal 12.

【００７６】しかし、既にコンピユータ等でデジタル処
理されたデジタルの音声信号に対して話速変化処理を実
施したり、話速変化処理した後のデジタルの音声信号を
再度コンピユータ等でデジタル処理する場合において
は、図１、図４の鎖線で示すように、入力端子５ａから
デジタルの音声信号を直接音声信号メモリ７へ入力す
る。また、出力バッファ１０から出力された話速変化処
理した後のデジタルの音声信号を直接出力端子１２ａへ
出力する。However, when the speech speed change processing is applied to a digital audio signal that has already been digitally processed by a computer or the like, or when the digital audio signal after the speech speed change processing is to be digitally processed again by a computer or the like, the digital audio signal is input from the input terminal 5a directly to the audio signal memory 7, as shown by the chain lines in FIGS. 1 and 4. Likewise, the digital audio signal output from the output buffer 10 after the speech speed change processing is output directly to the output terminal 12a.

【００７７】[0077]

【発明の効果】以上説明したように、本発明の話速変換
装置においては、話し始めは通常の話速で開始し、規定
時間経過までにその話速を目標話速まで次第に速くして
いようにしている。As described above, in the speech speed conversion device of the present invention, speech starts at the normal speech speed, and the speech speed is gradually increased to the target speech speed by the time the specified time has elapsed.

【００７８】したがって、たとえ話し速度を速く設定し
たとしても、会話又はナレーションの冒頭部分も正確に
聞き分けることができ、語学学習者にとって、この話速
変換装置が組込まれた語学学習機器の使い勝手を大幅に
向上できる。Therefore, even if the speech speed is set high, the beginning of a conversation or narration can still be heard accurately, which greatly improves the usability, for language learners, of language learning equipment incorporating this speech speed conversion device.

[Brief description of the drawings]

【図１】本発明の第１実施形態に係わる話速変換装置の
概略構成を示すブロック図FIG. 1 is a block diagram showing a schematic configuration of a speech speed conversion device according to a first embodiment of the present invention;

【図２】同話速変換装置に組込まれた話速算出部の話速
倍率の算出処理を示す流れ図FIG. 2 is a flowchart showing a calculation process of a speech speed magnification of a speech speed calculation unit incorporated in the same speech speed conversion device.

【図３】同話速変換装置における話速の変化を示す図FIG. 3 is a diagram showing a change in speech speed in the same speech speed conversion device.

【図４】本発明の第２実施形態に係わる話速変換装置の
概略構成を示すブロック図FIG. 4 is a block diagram showing a schematic configuration of a speech speed conversion device according to a second embodiment of the present invention;

【図５】同話速変換装置に組込まれた話速算出部の話速
倍率の算出処理を示す流れ図FIG. 5 is a flowchart showing a calculation process of a speech speed magnification of a speech speed calculation unit incorporated in the same speech speed conversion device.

【図６】同話速変換装置における話速の変化を示す図FIG. 6 is a diagram showing a change in speech speed in the same speech speed conversion device.

【図７】同話速変換装置の動作を説明するための音声信
号波形を示す図FIG. 7 is a diagram showing an audio signal waveform for explaining the operation of the same speech speed conversion device.

【図８】同話速変換装置の出力端子から出力されたそれ
ぞれ話速が異なる音声信号派遣を示す図FIG. 8 is a diagram showing the waveforms of audio signals with different speech speeds output from the output terminal of the same speech speed conversion device.

【図９】本発明の第３実施形態に係わる話速変換装置に
組込まれた話速算出部の話速倍率の算出処理を示す流れ
図FIG. 9 is a flowchart showing a process of calculating a speech speed magnification of a speech speed calculation unit incorporated in a speech speed conversion device according to a third embodiment of the present invention.

【図１０】同話速変換装置の動作を説明するための音声
信号波形を示す図FIG. 10 is a diagram showing an audio signal waveform for explaining the operation of the same speech speed conversion device.

【図１１】同話速変換装置における話速の変化特性を示
す図FIG. 11 is a diagram showing the speech speed change characteristic of the same speech speed conversion device.

【図１２】話速変換装置の変形例における話速の変化特
性を示す図FIG. 12 is a diagram showing a change characteristic of a speech speed in a modification of the speech speed conversion device.

【図１３】一般的な音声信号波形を示す図FIG. 13 is a diagram showing a general audio signal waveform.

【図１４】一般的な音声信号の詳細を示す図FIG. 14 is a diagram showing details of a general audio signal.

[Explanation of symbols]

２…無声音３…有声音４…無音５…入力端子６…Ａ／Ｄ変換器７…音声信号メモリ８…音声解析部９…信号合成部１０…出力バッファ１１…Ｄ／Ａ変換器１２…出力端子１３…有音検出部１４，１４ａ…話速算出部１６…有音期間１７…無音期間１８…時間設定部１９…目標話速設定部２０…ステップ数設定部
2: unvoiced sound
3: voiced sound
4: silence
5: input terminal
6: A/D converter
7: audio signal memory
8: audio analysis unit
9: signal synthesizing unit
10: output buffer
11: D/A converter
12: output terminal
13: sound detection unit
14, 14a: speech speed calculation unit
16: sound period
17: silent period
18: time setting unit
19: target speech speed setting unit
20: step number setting unit

Claims

[Claims]

1. A speech speed conversion device comprising an audio signal memory (7) for storing an input digital audio signal, an audio analysis unit (8) for analyzing the digital audio signal output from the audio signal memory, and a signal synthesizing unit (9) for synthesizing the digital audio signal analyzed by the audio analysis unit and outputting it as a new audio signal having a specified speech speed, the device further comprising: sound detection means (13) for detecting the presence of sound in the input audio signal; and a speech speed calculation unit (14) that, within the period from the sound detection time of the sound detection means until a specified time (T_B) elapses, changes the speech speed with the lapse of time from the normal speech speed to a target speech speed faster than the normal speech speed and, after the specified time has elapsed, sequentially calculates a speech speed that maintains the target speech speed and sends it to the signal synthesizing unit.

2. A speech speed conversion device comprising an audio signal memory (7) for storing an input digital audio signal, an audio analysis unit (8) for classifying the digital audio signal output from the audio signal memory into unvoiced sound, voiced sound, and silence, and a signal synthesizing unit (9) for thinning out only the voiced sounds classified by the audio analysis unit in accordance with a specified speech speed and connecting the unvoiced sound, the thinned voiced sound, and the silence to synthesize a new audio signal, the device further comprising: a speech speed calculation unit (14a) that, within the period from a start time of the input audio signal until a specified time (T_B) elapses, changes the speech speed with the lapse of time from the normal speech speed to a target speech speed faster than the normal speech speed and, after the specified time has elapsed, sequentially calculates a speech speed that maintains the target speech speed and sends it to the signal synthesizing unit.

3. The speech speed conversion device according to claim 2, wherein the start time of the input audio signal is the start time of the first voiced sound classified by the audio analysis unit in the audio signal.

4. The described speech speed conversion device, wherein the speech speed calculation unit comprises: a silence duration timer that, for each silence classified by the audio analysis unit in the audio signal, measures the duration of the silence from its start time; and discontinuous speech speed calculation means for calculating, each time the silence duration timer counts a threshold time, a speech speed that increases in accordance with the number of such times.

5. A speech speed conversion device comprising an audio signal memory (7) for storing an input digital audio signal, an audio analysis unit (8) for classifying the digital audio signal output from the audio signal memory into unvoiced sound, voiced sound, and silence, and a signal synthesizing unit (9) for thinning out only the voiced sounds classified by the audio analysis unit in accordance with a specified speech speed and connecting the unvoiced sound, the thinned voiced sound, and the silence to synthesize a new audio signal, the device further comprising: a silence duration timer that, for each silence classified by the audio analysis unit in the audio signal, measures the duration of the silence from its start time; and a speech speed calculation unit that, each time the silence duration timer counts a threshold time, sequentially calculates a speech speed that changes with the lapse of time, from the start time of the unvoiced sound or voiced sound following that silence, from the normal speech speed to a target speech speed faster than the normal speech speed and thereafter maintains the target speech speed, and sends it to the signal synthesizing unit.