JP2004177635A

JP2004177635A - Sentence read-aloud device, and program and recording medium for the device

Info

Publication number: JP2004177635A
Application number: JP2002343275A
Authority: JP
Inventors: Shigeaki Komatsu; 慈明小松
Original assignee: Brother Industries Ltd
Current assignee: Brother Industries Ltd
Priority date: 2002-11-27
Filing date: 2002-11-27
Publication date: 2004-06-24
Anticipated expiration: 2022-11-27
Also published as: JP3838193B2

Abstract

PROBLEM TO BE SOLVED: To eliminate the need for a user to start or stop reading by a sentence read-aloud device, according to the content of the change, each time the read-aloud conditions are changed.

SOLUTION: A CPU 10 of the sentence read-aloud device is internally equipped with a read-aloud start control part 16 and a read-aloud stop control part 17. When read-aloud conditions such as sound volume or read-aloud speed are changed, the input change information is classified into changes to be processed while a sentence is being read aloud, such as sound volume, and changes to be processed while read-aloud is stopped, such as speed. Read-aloud start or stop control is then performed according to the change information in each case, and read-aloud processing is controlled under the changed read-aloud conditions.

COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
本発明は文章読み上げ装置、文章を読み上げるためのプログラム及び同プログラムを記録した記録媒体に関するものである。
【０００２】
【従来の技術】
電子書籍などのテキストデータを基に音声信号を合成して音声出力を行う文章読み上げ装置において、読み上げ文章の読み上げ中に音量や速度を調整できるようにしたものは既に知れらている。
【０００３】
例えば、特許文献１には、透明なタッチパネルがディスプレイの画面上に一体に構成された表示入力デバイスを用い、操作者が指などでタッチパネルをなぞるトレース動作を行うことで、発音速度、音量を表すパラメータを反映して音声合成を行い、それによって操作者の意図に沿った了解性の高い合成音声を得ることができるテキスト読み上げ装置が記載されている。
【０００４】
また、読み上げ中の読み上げ条件の変更は上記以外にも例えば、トーンの変更、声質の変更、男女の変更などが挙げられる。
【０００５】
【特許文献１】
特開平９−２６５２９９号公報（要約、段落（０００９）、段落（００１０）、図１，図２）
【０００６】
ところで、文章読み上げ中に読み上げの条件を変更する場合、その変更操作がリアルタイムで読み上げに反映できる場合と、変更してもそれが出力に反映するのに時間がかかる、即ち入出力に時間差が生じる場合とがある。
【０００７】
例えば、音量制御の場合は、音声出力用のアンプのゲインを調整するだけで済むため音声変換処理後に調整することができるが、速度調整を行う場合、それを出力に反映させるためにはその変更速度に基づく音声変換処理を要するから、音声変換処理以前に行わないと出力される音声に反映させることはできない。
【０００８】
このように、文章読み上げ装置における読み上げ条件の変更は、音声変換処理後に調整できるものとできないものがある。例えば、音量の変更、エフェクト（トーン（高低）、エコー、周波数等）の変更は音声変換処理後でも調整できるが、速度の変更、声質（男女（性別）、話手）の変更はその変更に基づく音声変換処理を要するから、音声変換処理前でないと出力される音声には反映されない。
【０００９】
図４は読み上げ装置において、音声変換処理と実際に音声が出力されるタイミングを示したタイミングチャートである。図示のように、音声変換処理と実際に音声が出力されるタイミングに時間的な差が生じている。これは文章読み上げ中に例えば読み上げ速度の変更を行うと、その変更がされた速度での読み上げ出力までに時間差がでるため、読み上げが不自然になることを表している。
【００１０】
この点について図５、図６を参照して更に説明する。
【００１１】
図５は、先行技術文献によるものではないが、本発明の文章読み上げ装置の前提技術となる読み上げ装置の１例を示す正面図である。文章読み上げ装置１の画面には、読上用（読み上げの開始）及び停止用のボタン３と共に速度、速度変更用及び音量変更用のスライドバー４、及びスピーカ５等が表示されている。その画面の全面には透明なタッチパネル２が配置されており、そのタッチパネル２と画面の表示内容とは対応がとられている。
【００１２】
図６は、図５に示す文章読み上げ装置１の構成を概略的に示したブロック図である。
【００１３】
文章読み上げ装置１のＣＰＵ１０には、タッチパネル２、例えば液晶ディスプレイ６のような表示手段５０と、読み上げを行う電子書籍データや、音量値、速度値等を記録したＲＡＭ３０と、音声合成用のプログラム、音声合成のための文書の解析や音声合成に使用する単語辞書データ、合成音素データ等を格納したＲＯＭ４０とが接続されている。またＣＰＵ１０は、タッチパネル２からの入力を受けその操作内容をチェックして特定するタップ部１１と、そのタップ部１１からの信号を受けそれぞれＲＡＭ３０に記録された速度値を読み出して変更し、変更した速度値を音声変換処理部１４に渡すと共に、ＲＡＭ３０の速度値を変更した速度値に更新する速度制御部１２と，同様にタップ部１１からの信号を受けそれぞれＲＡＭ３０に記録された音量値を読み出して変更し、変更した音量値でアンプ６０を制御しスピーカ５から出力すると共にＲＡＭ３０の音量値を変更された音量値で更新する音量制御部１３と、ＲＡＭ３０に記録された電子書籍データを読み出し、ＲＯＭ４０に格納された音声合成プログラムや辞書データを用いて、音声合成を行い、速度制御部１２からの出力信号に基づく音声速度でアンプ６０に出力する音声変換処理部１４と、並びに、ＲＡＭ３０に格納されている書籍データから読み上げる文章データを液晶ディスプレイ６等の表示手段５０に表示するよう制御をする表示制御部１５とからなっている。
【００１４】
ここで、音声変換処理部１４における音声合成処理手順について図７に示すフロー図に従って説明する。
【００１５】
ステップＳ１０１において、ＣＰＵ１０は、ＲＡＭ３０に格納された電子書籍データから読み上げ用に抽出された抽出文に対して逆かな漢字変換を行う。つまり、抽出文に対して読みを付与する。例えば、抽出文が“昔々、・・・・”であれば、逆かな漢字変換によって“ムカシムカシ、・・・・”が得られる。続いて、ステップＳ１０２において、逆かな漢字変換されたものに対してアクセント型を付与する（アクセント処理）。例えば、“ムカシムカシ”に対してはアクセント型として０（ゼロ）型が付与される。
【００１６】
ステップＳ１０３において、ステップＳ１０２でアクセント型が付与された後の夫々の音節の継続時間長Ｔを発声速度係数αとＲＯＭ４０に格納されているその音節の継続長Ｌとを乗算することによって算出する（Ｔ＝α×Ｌ）。
【００１７】
ステップＳ１０４において、夫々の音節の基本周波数を算出し、続いてステッＳ１０５において、夫々の音節の音量を算出する。例えば、アクセントが高くなる音節に対して、その音節の基本周波数が高くなるように基本周波数を制御する（ステップＳ１０４）とともに、音量が大きくなるように音量を制御する（ステップＳ１０５）。これらの処理はアクセントに対応して文章の読み上げに抑揚をつけるために行なう処理である。
【００１８】
ステップＳ１０６において、ステップＳ１０２で抽出された抽出文（液晶ディスプレイ６の表示画面に表示されている文章の中の最後の文）の先頭の文字から液晶ディスプレイ６に表示されている最後の文字（改ページタグの直前の文字）までのステップＳ１０３で算出された夫々の継続時間長を加算することによって想定時間（頁切り換えまでに要する時間）を算出する。
【００１９】
ステップＳ１０７において、ＲＯＭ４０に記憶されている言語処理用の辞書や音声合成用の音声データ、ステップＳ１０３で算出された夫々の音節の継続時間長、ステップＳ１０４で算出された夫々の音節の基本周波数、ステップＳ１０５で算出された夫々の音量を利用して、ステップＳ１０２で抽出された抽出文の、所望の速度及び音量の音声合成データを作成して、アンプ６０を介してスピーカ５等の音声出力装置に出力する。
【００２０】
このようにして、液晶ディスプレイ６の表示画面に表示されている１又は複数の文章の中から表示画面に表示されている最後の文が抽出された場合に、その最後の文の先頭から液晶ディスプレイ６の表示画面に表示されている最後の文字までを読み上げるのに要する時間を想定し、その最後の文の読み上げが開始されてから想定された時間が経過したときに、液晶ディスプレイ６の表示画面の表示内容を切り換え、液晶ディスプレイ６の表示画面の表示内容を切り換えるタイミングをその表示内容の読み上げの終了のタイミングに合わせる制御を行っている。
【００２１】
以上で示した文章読み上げ装置１において、図５の例で下方のスライドバー４を操作して音量調整を行った場合は、そのタッチパネル２の操作からタップ部１１がその操作が音量調整であると特定して音量制御部１３に伝え、音量制御部１３は指示に従いＲＡＭ３０から音量値を読み出して変更し、変更後の音量値に基づき直接アンプ６０を制御して音量を調整し、同時にＲＡＭ３０の音量値の領域に変更した音量値を書き込む。このように、音量変更操作は音声変換処理部１４を介在させずに行うことができるから、出力にリアルタイムで反映することができる。
【００２２】
これに対し、読み上げ速度の変更の場合は、タッチパネル２の上方のスライドバー４の操作からタップ部１１がその操作が速度調整であると特定して速度制御部１２に伝え、速度制御部１２はＲＡＭ３０から読み出した速度値を変更して音声変換処理部１４に送り、同時に変更された速度をＲＡＭ３０に記録する。音声変換処理部１４は変更した信号に基づき音声変換処理を行ってアンプ６０を制御し、スピーカ５から変更した速度で音声を出力する。
【００２３】
以上の処理動作において、読み上げ速度を変更するときは既に説明したように、音声変換入力と出力に時間差があって、速度変更がリアルタイムで出力に反映されず読み上げたとき違和感が残る。そのため一旦、停止用ボタン３を操作して音声変換処理を停止させた上で変更を行うということが行われている。
【００２４】
そのため、ユーザは文章読み上げ中に読み上げ条件を変更したい場合（機能設定も含む）には、その変更を音声変換処理を止めて行うべきか否かその都度判断し、かつその判断に従って読上用又は停止用ボタン３を操作しなければならない。具体的には、ユーザーにとって現在の読み上げの速度が適切ではない（例えば、聞きづらい）ために、ユーザーがその読み上げの速度を変更しようと思った場合には、読み上げを停止するため停止用ボタン３を操作し、次に、速度変更用のスライドバー４を操作し、更に、読み上げを開始するための読上用ボタン３を操作する。あるいは、読み上げを行っていない状態の時に音量を変更しようとユーザが思った場合には、音量変更用のスライドバー４を操作し、更に読み上げを開始するための読上用ボタン３を操作する。このような多数の操作は煩雑で不便なため問題であった。
【００２５】
【発明が解決しようとする課題】
上述したようなユーザの操作は煩雑で不便という問題点があった。
【００２６】
本発明は、以上の問題（煩雑で不便な操作）を解決するためなされたもので、その目的は、ユーザが読み上げ条件を容易に変更できるようにすることである。
【００２７】
【課題を解決するための手段】
請求項１の発明は、文章読み上げ条件が変更可能な文章読み上げ装置において、文章読み上げ条件を変更するための変更情報を入力する手段と、入力された変更情報に基づき、文章読み上げ状態で変更処理するか文章読み上げ停止状態で変更処理するかを判断して読み上げ開始又は停止制御する手段と、変更された文章読み上げ条件に基づき読み上げを制御する手段とを備えたことを特徴とする文章読み上げ装置である。即ち、文章読み上げ条件の変更と、読み上げ開始又は停止の制御とが関連付けられている。
【００２８】
請求項２の発明は、請求項１に記載された文章読み上げ装置において、さらに文章読み上げ中か否かの判別に基き文章読み上げ開始制御又は文章読み上げ停止制御を行う読み上げ開始又は停止制御する手段を備えており、文章読み上げ中か否かということと、読み上げ開始又は停止とが関連付けされている。
【００２９】
請求項３の発明は、請求項１又は２に記載された文章読み上げ装置において、前記変更情報による読み上げ条件の変更が音声出力に遅延して反映されるものであるときは停止制御し、遅延せずに反映されるものであるときは停止制御しない前記読み上げ開始又は停止制御する手段を備えており、変更による遅延と、読み上げ停止とが関連づけされている。
【００３０】
請求項４の発明は、請求項２又は３に記載された文章読み上げ装置において、前記変更情報が読み上げ音量又はトーンの変更情報であるとき、文章読み上げ装置が読み上げ中であるか否かを判断し、読み上中でなければ読み上げ開始制御する前記読み上げ開始又は停止制御する手段を備えており、音量又はトーンといった特定の変更と読み上げ停止とが関連づけされている。
【００３１】
請求項５の発明は、請求項３に記載された文章読み上げ装置において、前記読み上げ開始又は停止制御する手段により停止制御したとき、読み上げ文章の先頭から読み上げるよう制御する手段を備え、文章読み上げ条件の変更があれば、文章の先頭から読み上げられる。
【００３２】
請求項６の発明は、文章読み上げ条件が変更可能に文章を読み上げるためにコンピュータに、文章読み上げ条件を変更するための変更情報に基づき、文章読み上げ状態で変更処理するか文章読み上げ停止状態で変更処理するかを判断する手順と、該判断に基づき読み上げ開始又は停止制御する手順とを実行させることを特徴とするプログラムである。即ち、文章読み上げ条件の変更と、読み上げ開始又は停止の制御とが関連づけられている。
【００３３】
請求項７の発明は、請求項６に記載されたプログラムを記録したことを特徴とするコンピュータ読み取り可能な記録媒体である。この請求項７の発明によれば、請求項６と同様の作用を奏する。
【００３４】
【発明の実施の形態】
本発明の実施の形態について添付図面を参考に説明する。
【００３５】
図１は本発明に係る文章読み上げ装置１の実施の形態を示している。
【００３６】
この実施の形態は、図６に示した文章読み上げ装置１のＣＰＵ１０に読み上げ開始制御部１６と読み上げ停止制御部１７とを付加した構成である。
【００３７】
読み上げ開始制御部１６及び読み上げ停止制御部１７は共にタップ部１１からの信号を受けて、読み上げ開始制御部１６は文章読み上げ装置１が読み上げ中でないときに音声変換処理部１４を開始制御し、また読み上げ停止制御部１７は、文章読み上げ中に音声変換処理部１４を停止制御する。更に、ＲＡＭ３０には、文章読み上げ中か否かを示すための読み上げ中フラグのための記憶領域が設けられている。その他の構成機能は図７について説明したものと同様である。
【００３８】
次に、以上で説明した文章読み上げ装置１による読み上げ中における文章読み上げ条件の変更（設定をも含む）について、読み上げの音量と速度を例に採って説明する。
【００３９】
図２は前記文章読み上げ装置における処理の手順（第１の実施の形態）を説明するためのフロー図である。
【００４０】
文章読み上げ装置１の音量、速度を変更する場合、まず、ユーザによってタッチパネル２の一部がタップされたことを検出する（Ｓ２０１、ＹＥＳ）。タップ部１１はタッチパネル２からの信号を受けてその内容をチェックし、それが読上用のボタン３による読み上げ（再生）の開始のための操作であると判断（特定）したときは（Ｓ２０２、ＹＥＳ）、ＲＯＭ３０に記憶された文章の先頭から読み上げ（再生）を開始し（Ｓ２０３）そのまま読み上げを行う。
【００４１】
タップ部１１が読上（再生）の開始のための操作でなく（Ｓ２０２、ＮＯ）、音量変更スライドバー４による音量変更制御のための操作であると判断（特定）したときは（Ｓ２０４、ＹＥＳ）、読み上げ開始制御部１６はＲＡＭ３０に記録された読み上げ中（読み上げモード）であるか否かを示す「読み上げ中フラグ」をチェックし（Ｓ２０５）、読み上げ停止中（フラグが０）であれば（Ｓ２０６、ＹＥＳ）ＲＡＭ３０に記憶された文章の先頭から読み上げを開始し（Ｓ２０７）、また、読み上げ中であれば（Ｓ２０６、ＮＯ）そのまま読み上げ中の状態で、それぞれ音量制御部１３はＲＡＭ３０から音量値を読み出して変更した音量値で音量調節を行い（Ｓ２０８）、変更した音量値をＲＡＭ３０に書き込む（Ｓ２０９）と共にアンプ６０を介してスピーカ５から変更された音量で音声出力する。ここで、ユーザは読み上げ音量を聞きながら音量調節を音量変更用のスライドバー４によって行い、必要な設定が終了するまでステップＳ２０８、Ｓ２０９の処理を繰り返し、設定が終われば（Ｓ２１０、ＹＥＳ）処理を終了する。
【００４２】
ステップＳ２０４において、タッチパネル２へのタップ操作が音量変更のための操作でなく（Ｓ２０４、ＮＯ）、速度変更用のスライドバー４による速度変更のための操作であると、タップ部１１が特定（判断）したとき（Ｓ２１１、ＹＥＳ）、読み上げ停止制御部１７はＲＡＭ３０に記録された「読み上げ中フラグ」をチェックし（Ｓ２１２）、読み上げ中（読み上げモード）であれば（Ｓ２１３、ＹＥＳ）読み上げを停止すると共にＲＡＭ３０に記録された「読み上げ中フラグ」をリセット（例えば「１」から「０」に変更）する（Ｓ２１４）。読み上げ中（読み上げモード）でなければ（Ｓ２１３、ＮＯ）そのまま、速度制御部１２はＲＡＭ３０から速度値を読み出し、入力された速度値で速度調節を行い（Ｓ２１５）、変更した速度値を持った合成音声をアンプ６０を介してスピーカ５から出力し、かつ変更（調節）した速度値をＲＡＭ３０に書き込む（Ｓ２１６）。
【００４３】
ステップＳ２１５〜Ｓ２１６の手順は速度設定が終了するまで行われ、設定が終了すれば（Ｓ２１８、ＹＥＳ）、読み上げ位置を文章の先頭（つまり、読み上げ途中であればその文章の頭）に戻して（Ｓ２１８）、新しく設定された読み上げ速度で読み上げを開始し、同時にＲＡＭ３０に保存されている「読み上げ中フラグ」をセット（例えば「０」から「１」に変更）する（Ｓ２１９）。これによって速度制御の処理手順を終了して、読み上げる文章がまだあれば、読み上げを変更された条件で継続する。
【００４４】
なお、ステップＳ２１１において速度制御でない場合（Ｓ２１１，ＮＯ）については説明を省略するが、例えば速度変更用のスライドバー４以外の読み上げ音声の変更、男女声の変更等、一旦読み上げを停止した後に変更を実施した方がよい場合における処理手順は速度制御と同様であり、また、読み上げ中でないと適切に調節できないおそれがあるトーンの変更等は音量の変更と同様の手順で処理が実行される。
【００４５】
文章読み上げ中に読み上げ条件を変更するための操作がなされたとき、その変更又は設定を行うのに、例えば音量やトーンの変更や設定のように読み上げを止める必要のないものについては読み上げを止めず、止める必要のあるものは自動的に読み上げを止めるようにし、また、音量やトーンの変更や設定のように、読み上げ中でないと設定できない、つまり実際の音量やトーンを聞きいてみなければ設定ができないものについては、読み上げ停止中であっても自動的に読み上げを開始できるようにすることで、それによってユーザが読み上げ条件を変更する際の操作の負担軽減を図ることができる。
【００４６】
図３は、上記文章読み上げ装置１における別の処理手順（第２の実施の形態）を説明するためのフロー図である。
【００４７】
文章読み上げ装置１の音量、速度を変更する場合、まず、ユーザによってタッチパネル２の一部がタップされたことを検出し（Ｓ３０１、ＹＥＳ）、タップ部１１はそのタップの内容をチェックし、それが読上用のボタン３による読み上げ（再生）の開始のための操作であると判断したときは（Ｓ３０２、ＹＥＳ）、読み上げを開始し（Ｓ３０３）ＲＡＭ３０に記憶された文章の先頭から読み上げを行う。
【００４８】
読み上げの開始のための操作でなく（Ｓ３０２、ＮＯ）、音量変更用のスライドバー４による速度変更のための操作であると判断（特定）したときは（Ｓ３０４、ＹＥＳ）、読み上げ開始制御部１６はＲＡＭ３０に記録された「読み上げ中フラグ」をチェックし（Ｓ３０５）、その結果、読み上げ停止中であれば（Ｓ３０６、ＹＥＳ）、読み上げないと音量調節は不可能あるいは、適切にできないおそれがであるのでＲＡＭ３０に記憶された文章の先頭から読み上げを開始し（Ｓ３０７）、ＲＡＭ３０中の「読み上げ中フラグ」をセットする（Ｓ３０８）。ステップＳ３０６において、読み上げ中であれば（Ｓ３０６、ＮＯ）そのまま読み上げ中の状態で、それぞれ音量制御部１３はＲＡＭ３０から音量値を読み出して入力された音量値に変更することで音量調節を行い（Ｓ３０９）アンプ６０を介してスピーカ５から音声出力し、かつ変更（調節）後の音量値をＲＡＭ３０に書き込む（Ｓ３１０）。
【００４９】
ここで、ユーザが読み上げ音量を聞きながら音変更用のスライドバー４により音量調節を行い、必要な設定が終了するまでステップＳ３０９、Ｓ３１０の処理を繰り返し、調節（設定）が終われば（Ｓ３１１、ＹＥＳ）処理を終了する。
【００５０】
ステップＳ３０４において、タップ部１１により、タッチパネル２のタップ操作が音量変更のための操作でなく（Ｓ３０４、ＮＯ）、速度変更用のスライドバー４の操作による速度変更のためであるとタップ部１１が判断されたときは（Ｓ３１２、ＹＥＳ）、ユーザが速度調節操作を行うと（Ｓ３１３）、タッチパネル２の入力はタップ部１１で特定され、速度制御部１２はＲＡＭ３０から速度値を読み出して変更された速度を音声変換処理部１４に渡すと共に変更した速度値をＲＡＭ３０に書き込む（Ｓ３１４）。この段階で、読み上げ停止制御部１７はＲＡＭ３０に記録された読み上げ中フラグをチェックし（Ｓ３１５）、チェックの結果、「読み上げフラグ」がセット状態（例えばフラグが１）で読み上げ中（読み上げモード）であると判断されたときは（Ｓ３１６，ＹＥＳ）、読み上げを一旦停止し（Ｓ３１７）「読み上げフラグ」をリセットする。その後読み上げを開始し、変更された速度の合成音声をアンプ６０を介してスピーカ５から出力すると共に「読み上げフラグ」をセットする（Ｓ３１８）。以下ステップＳ３１３〜Ｓ３１８の処理を設定が終了するまで実行し、設定が終われば（Ｓ３１９、ＹＥＳ）処理を終了して、読み上げる文章がまだあれば、読み上げを変更された条件で継続する。
【００５１】
ステップＳ３１６において、読み上げ中でなければ（Ｓ３１６、ＮＯ）、設定終了後（Ｓ３１９、ＹＥＳ）に処理を終了する。
【００５２】なお、ステップ３１２において速度変更用のスライドバー４でないとき（Ｓ３１２、ＮＯ）の処理は第１実施の形態と同様である。即ち、例えば速度変更用のスライドバー４以外の読み上げ音声の変更について、男女、相手などの声質の変更等、一旦読み上げを停止した後に変更を実施した方がよい場合における処理手順は速度制御と同様であり、また、読み上げ中でないと調節できないトーン、エコー、周波数などのエフェクトの変更等は音量の変更と同様の手順で処理が実行される。
【００５３】
上述した実施の形態では、いずれも合成音声を発生するためのスピーカ５を備える１つの装置において、図２、図３に示す処理全て行っているが、各処理や処理の一部を別々の装置で処理して、最終的にスピーカ５から変更された条件に沿った合成音声を生じさせ、文章の読み上げを行うようにしても良い。例えば、第１のコンピュータはユーザーの入力を受けるのみで、その他の実質的な処理は別の第２のコンピュータが行う。更に、装置は文章読み上げのための専用の装置に限らず、読み上げ以外の機能を有するＰＤＡ、パソコン、携帯電話、カーナビゲーションの端末、ＴＶ等であっても良い。
【００５４】
尚、読み上げられる文章は書籍に限らず手紙（電子メールを含む）、道案内、宣伝並びに歌詞などであっても良い。また、ＲＡＭ３０に記憶されたデータは装置の電源が落されると消失するが、装置の電源が落されても継続して記憶されても良い。
【００５５】
上述した実施の形態では、いずれも読み上げ条件の変更に伴って、読み上げの自動的な開始と停止との両方を行うが、自動的な開始か、自動的な停止かの一方を行う構成であっても良い。また、読み上げの開始位置は、文章の先頭としているが、文章の途中でも良い。更に、条件変更が完了するまでの間に読み上げられる対象は、その条件変更時に用いられる専用の文章であっても良い。
【００５６】
以上で説明した処理は、該処理の手順を記述したプログラムにより文章読み取り装置１のＣＰＵ１０で実行させることができる。また、本プログラムは、ＦＤ（フレキシブルディスク）、ＣＤＲＯＭ、ＭＯ、ＤＶＤＲＯＭ等のプログラムを記録する周知の記録媒体に記録されて提供される他、インターネット等のネットワーク網を介して提供することができる。
【００５７】
【発明の効果】
本願の請求項１に記載の発明によれば、文章読み上げ中に読み上げ条件の変更を行う場合、ユーザは読み上げ開始又は停止のための操作を従来のように行うことなく、自動で読み上げを止める必要のないものについては読み上げを止めず、あるいは止める必要のあるものは読み上げを止めるようにすることができ、文章読み上げ条件の変更が読み上げ中に行うべきものであるときは、装置が読み上げ中でないときは自動で読み上げ開始を行うことができる。そのため、従来のようにユーザがタッチパネル等の操作を行う煩雑さがなく、読み上げ条件の変更を容易に行うことができる。
【００５８】
本願の請求項２に記載の発明によれば、請求項１に記載の発明の効果を奏し、文章を読み上げ中か否かということと、読み上げ開始又は停止とが関連付けられており、良好な読み上げが可能である。
【００５９】
本願の請求項３に記載の発明によれば、請求項１又は２に記載の発明の効果を奏し、ユーザは読み上げ速度等の変更情報の入力から出力までに時間差のある読み上げ条件変更を行っても、その時間差を意識することなく合成音声による読み上げを自然に聞くことができる。
【００６０】
本願の請求項４に記載の発明によれば、請求項２又は３に記載の発明の効果を奏し、音量又はトーンといった特定の変更と読み上げ開始とが関連づけられており、迅速な読み上げが可能である。
【００６１】
本願の請求項５に記載の発明によれば、請求項３に記載の発明の効果を奏し、文章読み上げ条件の変更があれば、文章が頭書から読み上げられるため、良好な読み上げが可能である。
【００６２】
本願の請求項６に記載の発明によれば、文章読み上げ条件の変更と、読み上げ開始又は停止の制御とが関連づけられており、文章読み上げ条件の変更を容易に行うことができる。
【００６３】
本願の請求項７に記載の発明によれば、請求項６に記載の発明と同様の効果を奏し、そのプログラムを携帯端末その他の情報機器のコンピュータに読み取らせることにより、任意の情報機器において上記効果を実現することができる。
【図面の簡単な説明】
【図１】本発明の文章読み上げ装置の第１実施の形態に係る構成を示したブロック図である。
【図２】読み取り条件変更手順を説明するためのフロー図である。
【図３】他の読み取り条件変更手順を説明するためのフロー図である。
【図４】音声変換処理と音声出力の時間差を示すタイムチャートである。
【図５】従来の文章読み上げ装置の１例を示す正面図である。
【図６】図５に示す文章読み上げ装置の概略構成を示すブロック図である。
【図７】図５に示す文章読み上げ装置の音声変換処理を説明するフロー図である。
【符号の説明】
１・・・文章読み上げ装置、２・・・タッチパネル、３・・・読み上げ及び停止用のボタン、４・・・速度及び音量変更用のスライドバー、５・・・スピーカ、６・・・液晶ディスプレイ、１０・・・ＣＰＵ、３０・・・ＲＡＭ、４０・・・ＲＯＭ、５０・・・表示手段、６０・・・アンプ。

[0001]
TECHNICAL FIELD OF THE INVENTION
The present invention relates to a text-to-speech apparatus, a program for reading a text, and a recording medium on which the program is recorded.
[0002]
[Prior art]
Text-to-speech apparatuses that synthesize a speech signal from text data such as an electronic book and output it as audio, and that allow the volume and speed to be adjusted while the text is being read aloud, are already known.
[0003]
For example, Patent Document 1 describes a text-to-speech apparatus that uses a display input device in which a transparent touch panel is integrated on the screen of a display. The operator performs a trace operation, tracing the touch panel with a finger or the like, and speech synthesis is performed reflecting parameters that represent the utterance speed and volume, so that highly intelligible synthesized speech in line with the operator's intention is obtained.
[0004]
In addition to the above, reading conditions that may be changed during reading include, for example, tone, voice quality, and the choice between a male and a female voice.
[0005]
[Patent Document 1]
JP-A-9-265299 (abstract, paragraph (0009), paragraph (0010), FIGS. 1 and 2)
[0006]
When a reading condition is changed while text is being read aloud, there are cases where the change operation can be reflected in the reading in real time, and cases where it takes time for the change to be reflected in the output, that is, where a time lag arises between input and output.
[0007]
For example, volume control only requires adjusting the gain of the audio output amplifier, so it can be applied after the voice conversion process. Speed adjustment, in contrast, requires voice conversion processing based on the changed speed before it can be reflected in the output, so it cannot be reflected in the output voice unless it is made before the voice conversion process.
[0008]
As described above, some changes of reading conditions in a text-to-speech apparatus can be applied after the voice conversion process and some cannot. For example, changes of volume and of effects (tone (pitch), echo, frequency, etc.) can be applied even after the voice conversion process, whereas changes of speed and of voice quality (male/female (gender), speaker) require voice conversion processing based on the change, and so are not reflected in the output voice unless made before the voice conversion process.
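The distinction drawn above can be sketched as a small lookup table. This is only an illustrative sketch: the condition names, the `ApplyStage` enum, and the `needs_resynthesis` helper are assumptions introduced here and do not appear in the specification.

```python
from enum import Enum


class ApplyStage(Enum):
    """When a changed reading condition can take effect (illustrative labels)."""
    AFTER_CONVERSION = "after voice conversion"    # can be reflected in real time
    BEFORE_CONVERSION = "before voice conversion"  # needs a new conversion pass


# Hypothetical condition names, grouped as the paragraph above groups them.
CONDITION_STAGE = {
    "volume": ApplyStage.AFTER_CONVERSION,
    "tone": ApplyStage.AFTER_CONVERSION,
    "echo": ApplyStage.AFTER_CONVERSION,
    "frequency": ApplyStage.AFTER_CONVERSION,
    "speed": ApplyStage.BEFORE_CONVERSION,
    "voice_quality": ApplyStage.BEFORE_CONVERSION,
}


def needs_resynthesis(condition: str) -> bool:
    """Return True if the change only takes effect through voice conversion."""
    return CONDITION_STAGE[condition] is ApplyStage.BEFORE_CONVERSION
```

Under these assumptions, `needs_resynthesis("speed")` is true and `needs_resynthesis("volume")` is false, which is the property the later start and stop control relies on.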
[0009]
FIG. 4 is a timing chart showing the timing of the voice conversion process and the timing at which the voice is actually output in the reading apparatus. As shown in the figure, there is a time lag between the voice conversion process and the actual voice output. This indicates that if, for example, the reading speed is changed while text is being read aloud, a time lag occurs before the reading is output at the changed speed, which makes the reading unnatural.
[0010]
This point will be further described with reference to FIGS. 5 and 6.
[0011]
FIG. 5 is a front view showing an example of a reading apparatus that, although not taken from a prior art document, is a prerequisite technology for the text-to-speech apparatus of the present invention. The screen of the text-to-speech apparatus 1 displays buttons 3 for reading (starting the reading) and for stopping, together with slide bars 4 for changing the speed and the volume, a speaker 5, and the like. A transparent touch panel 2 is arranged over the entire surface of the screen, and positions on the touch panel 2 correspond to the content displayed on the screen.
[0012]
FIG. 6 is a block diagram schematically showing the configuration of the text-to-speech apparatus 1 shown in FIG. 5.
[0013]
Connected to the CPU 10 of the text-to-speech apparatus 1 are the touch panel 2; a display means 50 such as a liquid crystal display 6; a RAM 30 that records the electronic book data to be read aloud, a volume value, a speed value, and the like; and a ROM 40 that stores a speech synthesis program, word dictionary data used for document analysis and speech synthesis, synthesized phoneme data, and the like. The CPU 10 comprises: a tap unit 11 that receives input from the touch panel 2 and checks and identifies the operation; a speed control unit 12 that, on receiving a signal from the tap unit 11, reads and changes the speed value recorded in the RAM 30, passes the changed speed value to a voice conversion processing unit 14, and updates the speed value in the RAM 30 to the changed value; a volume control unit 13 that, likewise on receiving a signal from the tap unit 11, reads and changes the volume value recorded in the RAM 30, controls an amplifier 60 with the changed volume value so that sound is output from the speaker 5, and updates the volume value in the RAM 30 with the changed value; the voice conversion processing unit 14, which reads the electronic book data recorded in the RAM 30, performs speech synthesis using the speech synthesis program and dictionary data stored in the ROM 40, and outputs to the amplifier 60 at a speech speed based on the output signal from the speed control unit 12; and a display control unit 15 that controls the display means 50, such as the liquid crystal display 6, to display the text data to be read aloud from the book data stored in the RAM 30.
[0014]
Here, the speech synthesis procedure in the voice conversion processing unit 14 will be described with reference to the flowchart shown in FIG. 7.
[0015]
In step S101, the CPU 10 performs reverse kana-kanji conversion on an extracted sentence that has been extracted for reading from the electronic book data stored in the RAM 30. That is, a reading is assigned to the extracted sentence. For example, if the extracted sentence is "昔々、・・・・" ("Once upon a time, ..."), reverse kana-kanji conversion yields "ムカシムカシ、・・・・" ("mukashi mukashi, ..."). Subsequently, in step S102, an accent type is assigned to the result of the reverse kana-kanji conversion (accent processing). For example, accent type 0 (zero) is assigned to "ムカシムカシ".
[0016]
In step S103, the duration T of each syllable, after the accent type has been assigned in step S102, is calculated by multiplying the utterance speed coefficient α by the base duration L of that syllable stored in the ROM 40 (T = α × L).
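The calculation in step S103 can be illustrated with a minimal sketch. The function name `syllable_durations` and the base durations below are hypothetical; in the apparatus the base duration L of each syllable would come from the ROM 40.

```python
def syllable_durations(base_lengths, alpha):
    """Return T = alpha * L for each stored syllable base duration L (step S103)."""
    return [alpha * length for length in base_lengths]


# Hypothetical base durations in seconds, chosen only for illustration.
base = [0.5, 0.25, 0.25]
slow = syllable_durations(base, 2.0)  # larger alpha: longer syllables, slower reading
fast = syllable_durations(base, 0.5)  # smaller alpha: shorter syllables, faster reading
```

Because every syllable duration depends on α, a change of reading speed has to pass through this step again, which is why it cannot be applied after conversion in the way a gain change can.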
[0017]
In step S104, the fundamental frequency of each syllable is calculated, and then in step S105, the volume of each syllable is calculated. For example, for a syllable where the accent rises, the fundamental frequency is controlled so that it becomes higher (step S104), and the volume is controlled so that it becomes louder (step S105). These processes give the reading an intonation that corresponds to the accent.
[0018]
In step S106, the estimated time (the time required until the page is switched) is calculated by adding up the durations calculated in step S103, from the first character of the extracted sentence extracted in step S102 (the last sentence among the sentences displayed on the display screen of the liquid crystal display 6) to the last character displayed on the liquid crystal display 6 (the character immediately before the page break tag).
[0019]
In step S107, using the language-processing dictionary and the speech data for speech synthesis stored in the ROM 40, the duration of each syllable calculated in step S103, the fundamental frequency of each syllable calculated in step S104, and the volume of each syllable calculated in step S105, speech synthesis data at the desired speed and volume is created for the extracted sentence extracted in step S102 and is output through the amplifier 60 to an audio output device such as the speaker 5.
[0020]
In this way, when the last sentence displayed on the display screen is extracted from the one or more sentences displayed on the display screen of the liquid crystal display 6, the time required to read aloud from the beginning of that last sentence to the last character displayed on the display screen is estimated. When the estimated time has elapsed since reading of that last sentence began, the display content of the display screen of the liquid crystal display 6 is switched, so that the timing of switching the display content is matched to the timing at which reading of that display content ends.
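The page-switch timing described above can be sketched as follows. The function names and the time values are assumptions made for illustration; the specification states only that the syllable durations from step S103 are summed and that the display is switched once the estimated time has elapsed.

```python
def estimated_page_time(durations):
    """Sum the syllable durations up to the last displayed character (step S106)."""
    return sum(durations)


def should_switch_page(started_at, now, estimated):
    """True once the estimated time has elapsed since reading of the last sentence began."""
    return now - started_at >= estimated


# Hypothetical durations (seconds) for the syllables of the last sentence on the page.
estimate = estimated_page_time([1.0, 0.5, 0.5])
early = should_switch_page(started_at=10.0, now=11.5, estimated=estimate)
on_time = should_switch_page(started_at=10.0, now=12.0, estimated=estimate)
```

With these values the estimate is 2.0 seconds, so the page is not switched at 11.5 and is switched at 12.0.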
[0021]
In the text-to-speech apparatus 1 described above, when the volume is adjusted by operating the lower slide bar 4 in the example of FIG. 5, the tap unit 11 identifies from the operation of the touch panel 2 that the operation is a volume adjustment and notifies the volume control unit 13. Following this instruction, the volume control unit 13 reads the volume value from the RAM 30 and changes it, directly controls the amplifier 60 based on the changed volume value to adjust the volume, and at the same time writes the changed volume value to the volume value area of the RAM 30. Because the volume change operation can thus be performed without involving the voice conversion processing unit 14, it can be reflected in the output in real time.
[0022]
In contrast, when the reading speed is changed, the tap unit 11 identifies from the operation of the upper slide bar 4 on the touch panel 2 that the operation is a speed adjustment and notifies the speed control unit 12. The speed control unit 12 changes the speed value read from the RAM 30, sends it to the voice conversion processing unit 14, and at the same time records the changed speed in the RAM 30. The voice conversion processing unit 14 performs voice conversion processing based on the changed signal, controls the amplifier 60, and outputs voice at the changed speed from the speaker 5.
[0023]
In the above processing, when the reading speed is changed, as already explained, there is a time lag between the voice conversion input and the output, so the speed change is not reflected in the output in real time and the reading sounds unnatural. For this reason, the change is made after first operating the stop button 3 to stop the voice conversion process.
[0024]
Therefore, when a user wants to change a reading condition (including function settings) while text is being read aloud, the user must judge each time whether or not the change should be made with the voice conversion process stopped, and must operate the reading or stop button 3 according to that judgment. Specifically, when the current reading speed is not appropriate for the user (for example, it is hard to follow) and the user wants to change it, the user operates the stop button 3 to stop the reading, then operates the speed-change slide bar 4, and then operates the reading button 3 to start the reading. Alternatively, when the user wants to change the volume while reading is not in progress, the user operates the volume-change slide bar 4 and then operates the reading button 3 to start the reading. Having to perform so many operations is cumbersome and inconvenient.
[0025]
[Problems to be solved by the invention]
The above-described operation of the user is complicated and inconvenient.
[0026]
The present invention has been made to solve the above problem (complex and inconvenient operation), and an object of the present invention is to enable a user to easily change a reading condition.
[0027]
[Means for Solving the Problems]
According to a first aspect of the present invention, there is provided a text-to-speech apparatus in which text-to-speech conditions can be changed, comprising: means for inputting change information for changing a text-to-speech condition; means for determining, based on the input change information, whether the change is to be processed in a reading state or in a reading-stopped state, and performing reading start or stop control accordingly; and means for controlling reading based on the changed text-to-speech condition. That is, the change of the text-to-speech condition is associated with the control of starting or stopping the reading.
[0028]
According to a second aspect of the present invention, the text-to-speech apparatus according to the first aspect further comprises reading start or stop control means that performs text-to-speech start control or text-to-speech stop control based on a determination of whether or not text is being read aloud, so that whether or not text is being read aloud is associated with the start or stop of the reading.
[0029]
According to a third aspect of the present invention, in the text-to-speech apparatus according to the first or second aspect, the reading start or stop control means performs stop control when the change of the reading condition based on the change information is reflected in the audio output with a delay, and does not perform stop control when it is reflected without delay, so that the delay caused by a change is associated with stopping the reading.
[0030]
According to a fourth aspect of the present invention, in the text-to-speech apparatus according to the second or third aspect, when the change information is change information for the reading volume or tone, the reading start or stop control means determines whether or not the text-to-speech apparatus is reading aloud and performs reading start control if it is not, so that a specific change such as volume or tone is associated with stopping the reading.
[0031]
According to a fifth aspect of the present invention, the text-to-speech apparatus according to the third aspect comprises means for controlling the reading so that, when stop control has been performed by the reading start or stop control means, the reading is performed from the beginning of the text being read. Thus, if a text-to-speech condition is changed, the text is read aloud from its beginning.
[0032]
According to a sixth aspect of the present invention, there is provided a program that causes a computer, in order to read text aloud with changeable text-to-speech conditions, to execute: a procedure for determining, based on change information for changing a text-to-speech condition, whether the change is to be processed in a reading state or in a reading-stopped state; and a procedure for performing reading start or stop control based on that determination. That is, the change of the text-to-speech condition is associated with the control of starting or stopping the reading.
[0033]
According to a seventh aspect of the present invention, there is provided a computer-readable recording medium on which the program according to the sixth aspect is recorded. According to the seventh aspect of the invention, the same operation as the sixth aspect is achieved.
[0034]
BEST MODE FOR CARRYING OUT THE INVENTION
Embodiments of the present invention will be described with reference to the accompanying drawings.
[0035]
FIG. 1 shows an embodiment of a text-to-speech apparatus 1 according to the present invention.
[0036]
This embodiment has a configuration in which a reading start control unit 16 and a reading stop control unit 17 are added to the CPU 10 of the text-to-speech apparatus 1 shown in FIG. 6.
[0037]
The reading start control unit 16 and the reading stop control unit 17 both receive the signal from the tap unit 11. The reading start control unit 16 performs start control of the voice conversion processing unit 14 when the text-to-speech apparatus 1 is not reading aloud, and the reading stop control unit 17 performs stop control of the voice conversion processing unit 14 while text is being read aloud. Further, the RAM 30 is provided with a storage area for a reading flag that indicates whether or not text is being read aloud. The other components and functions are the same as those described with reference to FIG. 7.
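The behaviour of the two added control units and the reading flag can be sketched as below. This is a minimal sketch under stated assumptions: `ReaderState`, `start_control`, and `stop_control` are hypothetical names, and the `events` list merely stands in for start and stop commands sent to the voice conversion processing unit 14.

```python
from dataclasses import dataclass, field


@dataclass
class ReaderState:
    """Stand-in for the state kept in RAM 30 (illustrative only)."""
    reading: bool = False                       # the "reading flag"
    events: list = field(default_factory=list)  # commands issued to unit 14


def start_control(state: ReaderState) -> None:
    """Reading start control unit 16: start only when not already reading."""
    if not state.reading:
        state.reading = True
        state.events.append("start")


def stop_control(state: ReaderState) -> None:
    """Reading stop control unit 17: stop only while reading is in progress."""
    if state.reading:
        state.reading = False
        state.events.append("stop")
```

Calling either function twice in a row issues only one command, because the flag guards each transition.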
[0038]
Next, the change of the text-to-speech conditions (including the setting) during text-to-speech by the text-to-speech apparatus 1 described above will be described using the volume and speed of the text-to-speech as an example.
[0039]
FIG. 2 is a flowchart for explaining a procedure (first embodiment) of processing in the text-to-speech apparatus.
[0040]
When the volume or speed of the text-to-speech apparatus 1 is to be changed, it is first detected that the user has tapped a part of the touch panel 2 (S201, YES). The tap unit 11 receives the signal from the touch panel 2 and checks its content. When it determines (identifies) that the operation is one for starting reading (playback) using the reading button 3 (S202, YES), reading (playback) is started from the beginning of the text stored in the RAM 30 (S203) and continues as is.
[0041]
When the tap unit 11 determines (identifies) that the operation is not an operation for starting reading (playback) (S202, NO) but an operation for volume change control using the volume-change slide bar 4 (S204, YES), the reading start control unit 16 checks the "reading flag" recorded in the RAM 30, which indicates whether or not reading is in progress (reading mode) (S205). If reading is stopped (the flag is 0) (S206, YES), reading is started from the beginning of the text stored in the RAM 30 (S207); if reading is in progress (S206, NO), the reading simply continues. In either case, the volume control unit 13 reads the volume value from the RAM 30, adjusts the volume using the changed volume value (S208), writes the changed volume value to the RAM 30 (S209), and outputs sound at the changed volume from the speaker 5 via the amplifier 60. The user adjusts the volume with the volume-change slide bar 4 while listening to the reading volume; steps S208 and S209 are repeated until the necessary setting is complete, and when the setting is finished (S210, YES) the process ends.
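The volume branch (S204 to S209) can be sketched as a single function. The dictionary keys and the function name are hypothetical, the loop over repeated slider adjustments is collapsed into one call, and setting the flag at S207 is an assumption (the first embodiment does not state it explicitly for this branch).

```python
def handle_volume_change(state, new_volume):
    """Apply one volume adjustment; return the flowchart steps taken.

    `state` is a dict with the hypothetical keys 'reading', 'position', 'volume'.
    """
    steps = []
    if not state["reading"]:        # S206 YES: reading is stopped
        state["position"] = 0       # S207: start reading from the beginning
        state["reading"] = True     # assumption: the flag is set when reading starts
        steps.append("S207")
    state["volume"] = new_volume    # S208 adjust the volume, S209 write it back
    steps += ["S208", "S209"]
    return steps


stopped = {"reading": False, "position": 5, "volume": 3}
playing = {"reading": True, "position": 5, "volume": 3}
steps_stopped = handle_volume_change(stopped, 7)
steps_playing = handle_volume_change(playing, 7)
```

In the stopped case reading is started automatically before the adjustment; in the playing case the reading position is left untouched.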
[0042]
In step S204, when the tap unit 11 identifies (determines) that the tap operation on the touch panel 2 is not an operation for changing the volume (S204, NO) but an operation for changing the speed using the speed-change slide bar 4 (S211, YES), the reading stop control unit 17 checks the "reading flag" recorded in the RAM 30 (S212). If reading is in progress (reading mode) (S213, YES), the reading is stopped and the "reading flag" recorded in the RAM 30 is reset (for example, changed from "1" to "0") (S214). If reading is not in progress (S213, NO), processing continues without a stop. The speed control unit 12 then reads the speed value from the RAM 30, adjusts the speed using the input speed value (S215), outputs synthesized speech having the changed speed value from the speaker 5 via the amplifier 60, and writes the changed (adjusted) speed value to the RAM 30 (S216).
[0043]
The procedure of steps S215 and S216 is repeated until the speed setting is completed. When the setting is completed (S217, YES), the reading position is returned to the beginning of the text (that is, if a text was being read, to the beginning of that text) (S218), reading is started at the newly set reading speed, and at the same time the "reading flag" stored in the RAM 30 is set (for example, changed from "0" to "1") (S219). This ends the speed control procedure, and if there is text to be read, reading continues under the changed condition.
[0044]
The case where the operation in step S211 is not speed control (S211, NO) is not described in detail. However, changes to the reading voice that are made with controls other than the speed change slide bar 4 and that are better made after temporarily stopping the reading, such as a change of voice quality or a change between male and female voices, are processed in the same manner as the speed control. Changes such as tone, which may not be properly adjustable unless reading is in progress, are executed in the same procedure as the volume change.
[0045]
When an operation to change the reading conditions is performed during text reading, reading is not stopped for changes that do not require stopping, such as changing or setting the volume or tone, and reading is automatically stopped for changes that do require stopping. Conversely, for changes such as volume or tone that cannot be set unless reading is in progress, that is, that cannot be set without listening to the actual volume and tone, reading is automatically started even if it is currently stopped. This reduces the burden on the user when changing the reading conditions.
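The control policy of the first embodiment can be illustrated with a minimal Python sketch. It is not the implementation described in this application: the class `Reader`, the sets `NEEDS_READING` and `NEEDS_STOP`, and the `log` list are hypothetical names introduced only for illustration, and the repeated adjustment loop (S208 to S210, S215 to S217) is collapsed into a single call.

```python
from dataclasses import dataclass, field

# Hypothetical grouping of conditions, following the description: volume and
# tone are adjusted while reading; speed and voice changes while stopped.
NEEDS_READING = {"volume", "tone"}
NEEDS_STOP = {"speed", "voice"}


@dataclass
class Reader:
    reading: bool = False   # stands in for the "reading flag" in the RAM 30
    position: int = 0       # hypothetical offset into the text being read
    settings: dict = field(default_factory=lambda: {"volume": 5, "speed": 5})
    log: list = field(default_factory=list)  # control events, for inspection

    def start_from_beginning(self):
        self.position = 0
        self.reading = True
        self.log.append("start")

    def stop(self):
        self.reading = False
        self.log.append("stop")

    def change_condition(self, name, value):
        if name in NEEDS_READING:
            # Volume-type change: start reading automatically if stopped,
            # so the user can hear the effect of the adjustment.
            if not self.reading:
                self.start_from_beginning()
            self.settings[name] = value
        elif name in NEEDS_STOP:
            # Speed-type change: stop automatically if reading, apply the
            # change, then restart from the beginning under the new setting.
            if self.reading:
                self.stop()
            self.settings[name] = value
            self.start_from_beginning()
        else:
            raise ValueError(f"unknown reading condition: {name}")
```

In this sketch, changing the volume on a stopped reader records a single "start" event, while changing the speed on a running reader records "stop" followed by "start" and resets the position to the beginning of the text.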
[0046]
FIG. 3 is a flowchart for explaining another processing procedure (second embodiment) in the text-to-speech apparatus 1.
[0047]
When changing the volume or speed of the text-to-speech apparatus 1, it is first detected that a part of the touch panel 2 has been tapped by the user (S301, YES), and the tap unit 11 checks the content of the tap. When it is determined that the operation is one for starting reading (playback) with the reading button 3 (S302, YES), reading is started from the beginning of the text stored in the RAM 30 (S303).
[0048]
If it is determined that the operation is not one for starting reading (S302, NO) but one for changing the volume with the volume change slide bar 4 (S304, YES), the reading start control unit 16 checks the "reading flag" recorded in the RAM 30 (S305). If reading is stopped (S306, YES), the volume may not be adjustable, or may not be properly adjustable, without reading. Therefore, reading is started from the beginning of the text stored in the RAM 30 (S307), and the "reading flag" in the RAM 30 is set (S308). If it is determined in step S306 that reading is in progress (S306, NO), the volume control unit 13 reads the volume value from the RAM 30 and changes it to the input volume value while reading continues (S309); the voice is output from the speaker 5 via the amplifier 60 at the changed volume, and the changed (adjusted) volume value is written in the RAM 30 (S310).
[0049]
Here, the user adjusts the volume with the volume change slide bar 4 while listening to the reading, and the processing of steps S309 and S310 is repeated until the necessary setting is completed. When the adjustment (setting) is completed (S311, YES), the processing ends.
[0050]
In step S304, when the tap unit 11 determines that the tap operation on the touch panel 2 is not an operation for changing the volume (S304, NO) but an operation for changing the speed with the speed change slide bar 4 (S312, YES), and the user performs the speed adjustment operation (S313), the input on the touch panel 2 is identified by the tap unit 11. The speed control unit 12 reads the speed value from the RAM 30, passes the changed speed to the voice conversion processing unit 14, and writes the changed speed value in the RAM 30 (S314). At this stage, the reading stop control unit 17 checks the "reading flag" recorded in the RAM 30 (S315). If the flag is set (for example, the flag is 1) and it is therefore determined that reading is in progress (reading mode) (S316, YES), reading is temporarily stopped (S317) and the "reading flag" is reset. Reading is then started again, the synthesized voice at the changed speed is output from the speaker 5 via the amplifier 60, and the "reading flag" is set (S318). The processes of steps S313 to S318 are repeated until the setting is completed. When the setting is completed (S319, YES), the process ends, and if there is text remaining to be read, reading continues under the changed condition.
[0051]
In step S316, if reading is not in progress (S316, NO), the process ends once the setting is completed (S319, YES).
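The speed-change loop of the second embodiment (S313 to S319) can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of FIG. 3 itself: the function name `apply_speed_changes`, the dictionary `ram` standing in for the RAM 30, and the event strings are all hypothetical, and the position from which reading restarts is deliberately not modeled.

```python
def apply_speed_changes(ram, speed_inputs):
    """Apply a sequence of slider values and return the control events issued.

    `ram` is a hypothetical dict with the keys "speed" and "reading_flag".
    """
    events = []
    for speed in speed_inputs:           # S313: one user adjustment
        ram["speed"] = speed             # S314: store the changed speed value
        if ram["reading_flag"] == 1:     # S315, S316: is reading in progress?
            ram["reading_flag"] = 0      # S317: pause and reset the flag
            events.append("stop")
            ram["reading_flag"] = 1      # S318: restart at the changed speed
            events.append(f"start@{speed}")
    return events                        # S319: setting completed
```

Unlike the sketch of the first embodiment, nothing is started here when reading was stopped to begin with: the new speed is only stored, which matches the S316, NO branch described above.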
[0052]
When the operated slide bar 4 is not the speed change slide bar 4 in step S312 (S312, NO), the processing is the same as in the first embodiment. That is, changes to the reading voice that are made with controls other than the speed change slide bar 4 and that are better made after temporarily stopping the reading, such as changing the voice quality between male and female voices or to that of another speaker, follow the same procedure as the speed control. In addition, changes to effects such as tone, echo, and frequency, which cannot be adjusted unless reading is in progress, are executed in the same procedure as the volume change.
[0053]
In the above-described embodiments, all of the processes shown in FIGS. 2 and 3 are performed by a single device that includes the speaker 5 for producing the synthesized voice. However, individual processes, or parts of processes, may be performed by separate devices, provided that the synthesized voice is finally produced from the speaker 5 according to the changed conditions and the text is read aloud. For example, a first computer may only receive the user's input, while the other substantial processing is performed by a second computer. Further, the apparatus is not limited to a dedicated text-reading apparatus and may be a PDA, a personal computer, a mobile phone, a car navigation terminal, a TV, or the like that has functions other than reading aloud.
[0054]
The text to be read out is not limited to a book, but may be a letter (including an e-mail), a guide, an advertisement, lyrics, and the like. Further, although the data stored in the RAM 30 is lost when the power of the apparatus is turned off, the data may instead be retained even when the power is turned off.
[0055]
In each of the above-described embodiments, both the automatic start and the automatic stop of reading are performed in accordance with the change of the reading condition, but only one of them, either the automatic start or the automatic stop, may be performed. The reading start position is the beginning of the text, but may be in the middle of the text. Furthermore, the text read out until the condition change is completed may be a dedicated text used for condition changes.
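The variations listed above can be expressed as configuration options. The following Python sketch is only one possible way to organize them: `ChangeOptions`, `control_action`, and `preview_source` are hypothetical names that do not appear in this application.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChangeOptions:
    auto_start: bool = True               # automatic start may be disabled
    auto_stop: bool = True                # automatic stop may be disabled
    start_position: int = 0               # 0 = beginning; may be mid-text
    dedicated_text: Optional[str] = None  # text used only during changes


def control_action(options, reading, needs_reading):
    """Return "start", "stop", or None for a single condition change."""
    if needs_reading and not reading and options.auto_start:
        return "start"
    if not needs_reading and reading and options.auto_stop:
        return "stop"
    return None


def preview_source(options, book_text):
    """Return the text and offset to read while a condition is being changed."""
    if options.dedicated_text is not None:
        return options.dedicated_text, 0
    return book_text, options.start_position
```

With `auto_stop=False`, for example, a speed change during reading yields no control action in this sketch, which corresponds to performing only the automatic start.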
[0056]
The processing described above can be executed by the CPU 10 of the text-to-speech apparatus 1 using a program describing the procedure of the processing. The program can be provided by being recorded on a well-known recording medium such as an FD (flexible disk), CD-ROM, MO, or DVD-ROM, or can be provided via a network such as the Internet.
[0057]
【Effects of the Invention】
According to the invention described in claim 1 of the present application, when the reading conditions are changed during text reading, the user does not need to perform an operation for starting or stopping the reading as in the related art. Reading continues for changes that do not require it to be stopped, reading is automatically stopped for changes that do require it to be stopped, and reading is automatically started for changes that cannot be made unless the text is being read. Therefore, the user does not need to operate the touch panel or the like to start or stop reading as in the related art, and the reading conditions can be easily changed.
[0058]
According to the invention described in claim 2 of the present application, the effect of the invention described in claim 1 is obtained, and in addition, the start or stop of reading is linked to whether or not the text is being read, so that good reading is possible.
[0059]
According to the invention described in claim 3 of the present application, the effect of the invention described in claim 1 or 2 is obtained, and in addition, even when the user changes a reading condition, such as the reading speed, for which there is a time lag between input of the change information and its output, the user can listen to the synthesized speech naturally without being aware of the time lag.
[0060]
According to the invention described in claim 4 of the present application, the effect of the invention described in claim 2 or 3 is obtained, and in addition, a specific change such as volume or tone is linked to the start of reading, so that reading can begin quickly.
[0061]
According to the invention described in claim 5 of the present application, the effect of the invention described in claim 3 is obtained, and in addition, when a text reading condition is changed, the text is read from the beginning, so that good reading is possible.
[0062]
According to the invention described in claim 6 of the present application, the change of the text-to-speech condition is associated with the control of the start or stop of the text-to-speech, so that the change of the text-to-speech condition can be easily performed.
[0063]
According to the invention described in claim 7 of the present application, the same effect as that of the invention described in claim 6 is obtained, and in addition, by having the program read by the computer of a portable terminal or other information device, the above-described effects can be realized in any information device.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a text-to-speech apparatus according to a first embodiment of the present invention.
FIG. 2 is a flowchart for explaining a reading condition changing procedure.
FIG. 3 is a flowchart for explaining another reading condition changing procedure.
FIG. 4 is a time chart showing a time difference between a voice conversion process and a voice output.
FIG. 5 is a front view showing an example of a conventional text-to-speech apparatus.
FIG. 6 is a block diagram showing a schematic configuration of the text-to-speech apparatus shown in FIG. 5.
FIG. 7 is a flowchart illustrating a speech conversion process of the text-to-speech apparatus shown in FIG. 5.
[Explanation of symbols]
1 ... Text-to-speech device, 2 ... Touch panel, 3 ... Buttons for reading and stopping, 4 ... Slide bars for changing speed and volume, 5 ... Speaker, 6 ... Liquid crystal display, 10 ... CPU, 30 ... RAM, 40 ... ROM, 50 ... Display means, 60 ... Amplifier.

Claims

A text-to-speech device capable of changing text-to-speech conditions, comprising:
means for inputting change information for changing a text-to-speech condition;
means for determining, based on the input change information, whether to perform the change processing in the text reading state or in the text reading stop state, and for controlling reading start or stop accordingly; and
means for controlling reading based on the changed text-to-speech conditions.

The text-to-speech apparatus according to claim 1,
wherein the means for controlling reading start or stop further performs text-to-speech start control or text-to-speech stop control based on whether or not the text is being read.

The text-to-speech device according to claim 1 or 2,
wherein the reading start or stop control means performs stop control when the change of the reading condition based on the change information is reflected in the audio output with a delay, and does not perform stop control when the change is reflected without delay.

The text-to-speech apparatus according to claim 2 or 3,
wherein, when the change information is change information of the reading volume or tone, the reading start or stop control means determines whether the text-to-speech apparatus is reading aloud, and performs reading start control if reading is not in progress.

The text-to-speech apparatus according to claim 3,
wherein, when reading control is performed by the reading start or stop control means, reading is controlled so as to start from the beginning of the text to be read.

A program for causing a computer that changes text-to-speech conditions to execute:
a procedure of determining, based on change information for changing a text-to-speech condition, whether to perform the change processing in the text-to-speech state or in the text-to-speech stop state; and
a procedure of controlling reading start or stop based on the determination.

A computer-readable recording medium on which the program according to claim 6 is recorded.