JP2004064640A

JP2004064640A - Communication terminal device with character display function

Info

Publication number: JP2004064640A
Application number: JP2002223315A
Authority: JP
Inventors: Hiroko Ida; 井田　裕子
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2002-07-31
Filing date: 2002-07-31
Publication date: 2004-02-26

Abstract

PROBLEM TO BE SOLVED: To provide a communication terminal device with character display function, in which a caller can have conversation with his/her conversation party without talking verbally by creating conversation sentences using characters or character strings, and transmitting them as a voice signal to the conversation party when the caller is in an environment that he/she cannot talk verbally in making conversation over a telephone.

SOLUTION: The communication terminal device reads out the characters and/or the character strings stored in a character string storage RAM 106, and generates a recitation tone signal based on the recitation tone of the characters and/or the character strings by a voice signal generating program stored in a system ROM 102. Then, the device accumulates at least the recitation tone signal in a voice accumulation RAM 105 or the like, and transfers the recitation tone signal to voice signal transmitting means by a transmission section 107.

COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
本発明は、音声通話開始時または音声通話中に、無音通話をすることができるネットワークに接続されたコンピュータ、携帯情報端末装置（ＰＤＡ）、携帯型電話機等の音声通話端末装置に関する。
【０００２】
【従来の技術】
音声通話は、有線通話回線を利用した利用形態にとどまらず、通信ネットワーク技術の進歩により、インターネット回線や無線通話を利用したＰＣ（Ｐｅｒｓｏｎａｌ　Ｃｏｍｐｕｔｅｒ），携帯情報端末（ＰＤＡ　Ｐｅｒｓｏｎａｌ　Ｄｉｇｉｔａｌ　Ａｓｓｉｓｔａｎｔｓ），携帯型電話機等の通信機器から様々な形態で可能となっている。
【０００３】
音声通話端末装置では、通話者がマイクロホンに向って発声するとともに、通話相手からの音声をスピーカ（あるいはイヤホン）により聴取することでリアルタイムの会話が行なわれる。
【０００４】
このため、従来の音声通話端末装置では、静寂が要求される会議室、映画館、電車等の公衆の場では、電源をオフにしておく、着信があっても非応答とする、または留守番応答機能をオンにしておき通話相手のメッセージを録音する等、実質上の着信規制をしなければならない。
【０００５】
従来の音声通話端末装置では、着信があったときに非応答とした場合には、この場合には発信者は、着信者からのメッセージを全く取得できない。また、発信元の電話番号通知がなされていれば、後に発信者に電話を折り返し用件を聞くことができるが、用件が緊急であるときには、対処が遅れることもある。
【０００６】
このため、音声通話端末装置では、着信規制をすべき環境では、留守番応答機能をオンにしてくことで発信者の用件を録音しておき後に音声を聞くことができる。しかし、この場合にも、発信者はリアルタイムで、発信者と通話ができないため、発信者の用件が緊急であるときにはやはり対処ができないことがある。
【０００７】
このような不都合を解消するために、相手方の電話機でダイヤル操作キーが操作されたときに、自己の電話機でも、同一のダイヤル操作キーが動作し、その結果、ダイヤル操作キーに割り当てられたメッセージによって無声通話を可能とする通信端末装置が知られている（特開平１１−３５５４０４号公報）。これにより、予め通話の相手との間でダイヤル操作キーによって特定の文字列を設定しておくことにより、ダイヤル操作キーの動きによって文字列を確認することで無声通話を行うことができ、さらに、相手方のキー操作によって文字メッセージが作成されていれば、その文字メッセージをディスプレイに表示させることができる。
【０００８】
また、留守番応答を予め文字コード入力して記憶手段に記憶しておき、着信があったときに、文字コード列を音声合成し、文字コード列に対応する音声メッセージを発信者側に送信する通信端末装置も知られている。（特開２０００−２１６８７５公報参照）。
【０００９】
さらに、電車内等の携帯型電話機の使用が制限される環境において、電話交換機に返信用のメッセージを用意しておきこれを指定して発信者に送信するといった無声応答方式も知られている（特開平１１−２０５４３７号公報参照）。
【００１０】
【発明が解決しようとする課題】
しかし、特開平１１−３５５４０４号公報記載の技術では、視覚的にメッセージの確認や自由度のある応答メッセージを作成することはできるが、ダイヤル操作キーによる無声通話を行うためには予め通話相手と、操作キーに対応する特定文字列を設定する手間を要するとともに、設定者毎に操作も異なるため誤操作する恐れがある。また送信者は発声できる状況にあるにも関わらず、キー操作により会話の入出力を行わなければならず不便である。
【００１１】
また、特開２０００−２１６８７５公報記載の技術では、送信者は、音声でメッセージを受取ることができるが、音声メッセージは予め記号と対応付け定型化されたものでありメッセージ数が制限され、またリアルタイムでのコミュニケーションができないという問題がある。
【００１２】
さらに、特開平１１−２０５４３７号公報記載の技術においては、着信側は発信側からの音声をイヤホンを介して聞くようになっており、文字などで視覚的に内容を確認することができない。また、定型化された応答用メッセージを使用するため、内容が固定されており、受信内容に即したメッセージを送ることができない。
【００１３】
本発明の目的は、通話者が電話による通話を行うに際して発声できない環境にあるときに、文字または文字列による会話文を作成しこれを音声信号として通話相手に送信することで、通話者は発声することなく通話相手との会話ができる文字通話機能付きの通信端末装置を提供することにある。
【００１４】
本発明の他の目的は、通話者が電話による通話を行うに際して聴取ができない環境にあるときに、通話相手から受信した音声信号から文字または文字列による会話文を作成しこれを表示手段に表示することで、通話者は通話相手からの音声を聴取（ないし発生）することなく当該通話相手との会話ができる文字通話機能付きの通信端末装置を提供することにある。
【００１５】
本発明のさらに他の目的は、通話者が電話による通話を行うに際して発声できず、かつ聴取ができない環境にあるときに、文字または文字列による会話文を作成しこれを音声信号として通話相手に送信し、かつ通話相手から受信した音声信号から文字または文字列による会話文を作成しこれを表示手段に表示することで、通話者は発声することなくかつ通話相手からの音声を聴取（ないし発生）することなく当該通話相手との会話ができる文字通話機能付きの通信端末装置を提供することにある。
【００１６】
【課題を解決するための手段】
本発明の文字通話機能付きの通信端末装置は、操作手段と、表示手段と、通話者の音声が入力される音声入力手段と、前記音声入力手段から入力された音声にかかる音声信号を通話相手に送信する音声信号送信手段と、通話相手の音声にかかる音声信号を受信する音声信号受信手段と、前記音声信号受信手段により受信した音声信号を音声として出力する音声出力手段と、少なくとも、前記操作手段から手入力された通話情報としての文字および／または文字列が手入力された文字および／または文字列を記憶する文字・文字列記憶手段と、前記文字・文字列記憶手段に記憶されている文字および／または文字列を読み出し、当該文字および／または文字列の読み音声にかかる音声信号を生成する読み音声信号生成手段と、少なくとも、前記読み音声信号生成手段が生成した読み音声信号を蓄積する音声信号蓄積手段と、前記音声信号蓄積手段に蓄積された読み音声信号を前記音声信号送信手段に転送する読み音声信号転送手段とを備えたことを特徴とする。
【００１７】
また、本発明の文字通話機能付きの通信端末装置は、操作手段と、表示手段と、通話者の音声が入力される音声入力手段と、前記音声入力手段から入力された音声にかかる音声信号を通話相手に送信する音声信号送信手段と、通話相手の音声にかかる音声信号を受信する音声信号受信手段と、前記音声信号受信手段により受信した音声信号を音声として出力する音声出力手段と、少なくとも、前記操作手段から手入力された通話情報としての文字および／または文字列が手入力された文字および／または文字列を記憶する文字・文字列記憶手段と、前記文字・文字列記憶手段に記憶されている文字および／または文字列を読み出し、当該文字および／または文字列の読み音声にかかる音声信号を生成する読み音声信号生成手段と、少なくとも、前記読み音声信号生成手段が生成した読み音声信号を蓄積する音声信号蓄積手段と、前記音声信号蓄積手段に蓄積された読み音声信号を前記音声信号送信手段に転送する読み音声信号転送手段と、前記音声信号受信手段が受信した音声信号を蓄積する受信音声信号蓄積手段と、前記受信音声信号蓄積手段に記憶されている音声信号を読み出し、当該音声信号にかかる音声の発音対応文字または発音対応文字列を生成する発音対応文字列と、前記発音対応文字列生成手段により生成した発音対応文字列を前記表示手段に転送する発音対応文字列転送手段と備えたことを特徴とする。
【００１８】
上記の文字通話機能付き通信端末装置では、前記文字・文字列記憶手段に蓄積される文字列を、予め作成された定型文字列に含めること、および／または前記音声信号蓄積手段に蓄積される読み音声信号を、予め作成された定型読み音声にかかる音声信号を含めることができる。また、上記の文字通話機能付き通信端末装置では、前記定型文字列に蓄積された前記定型文字列、および／または前記音声信号蓄積手段に蓄積された前記定型読み音声を、特定の記号と対応づけることができる。さらに、上記の文字通話機能付き通信端末装置では、前記定型文字列に蓄積された前記定型文字列、および／または前記音声信号蓄積手段に蓄積された前記定型読み音声を、前記操作手段および表示手段を用いて読み出し可能に保存することができる。
【００１９】
さらに、本発明の文字通話機能付きの通信端末装置は、操作手段と、表示手段と、通話者の音声が入力される音声入力手段と、前記音声入力手段から入力された音声にかかる音声信号を通話相手に送信する音声信号送信手段と、通話相手の音声にかかる音声信号を受信する音声信号受信手段と、前記音声信号受信手段により受信した音声信号を音声として出力する音声出力手段と、前記音声信号受信手段が受信した音声信号を蓄積する受信音声信号蓄積手段と、前記受信音声信号蓄積手段に記憶されている音声信号を読み出し、当該音声信号にかかる音声の発音対応文字または発音対応文字列を生成する発音対応文字列生成手段と、前記発音対応文字列生成手段により生成した発音対応文字列を前記表示手段に転送する発音対応文字列転送手段とを備えたことを特徴とする。
【００２０】
【発明の実施の形態】
図１は本発明の文字通話機能付き通信端末装置の一実施形態を示すハードウェアブロック図である。
【００２１】
図１において、文字通話機能付き通信端末装置１は携帯型電話であり、ＣＰＵ１０１，システムＲＯＭ１０２，ワークＲＡＭ（システムＲＡＭ）１０３，ユーザＲＡＭ１０４，音声蓄積ＲＡＭ１０５，文字列記憶ＲＡＭ１０６，送信部１０７，受信部１０８，操作Ｉ／Ｆ（インタフェース）１０９，キー・ボタン１１０，表示Ｉ／Ｆ１１１，ディスプレイ１１２，マイクロホンＩ／Ｆ１１３，マイクロホン１１４，スピーカＩ／Ｆ１１５，スピーカ１１６を備えている。図１においては、これらはバス１００に接続されている。なお、図１では時計，電源，バイブレータ等、本発明の理解に不要な構成要素の図示は省略してある。
【００２２】
ＣＰＵ１０１は携帯型電話機１全体を統括制御する。システムＲＯＭ１０２には図２に示すように、システムプログラム（オペレーティングシステム，各種ドライバ等）をベースに読み音声信号生成プログラム，音声対応文字列生成プログラム，定型文字列・定型音声利用プログラム，各種アプリケーションプログラム（通信プログラム，インターネット接続プログラム等）等が格納されている。読み音声信号生成プログラムとＣＰＵ１０１とが本発明の読み音声信号生成手段に相当し、音声対応文字列生成プログラムとＣＰＵ１０１とが本発明の音声対応文字列生成に対応する。また、定型文字列・定型音声利用プログラムは、使用された定型文字列や定型音声を後述する図３（Ａ），（Ｂ）に例示するよなテーブルに格納するとともに、適時に記号を用いて所望の定型文字列や定型音声を呼び出すことができる。なお、システムプログラムには、本発明における転送プログラム（音声信号転送手段に対応する）が含まれている。
【００２３】
ワークＲＡＭ１０３は、システムの作業領域として使用される。ユーザＲＡＭ１０４には、通信相手情報リスト（いわゆる電話帳），ユーザ設定情報等のユーザ入力情報が格納される。
【００２４】
音声蓄積ＲＡＭ１０５は、たとえば、音声信号生成プログラムにより生成された音声信号生読み音声信号、受信部（音声信号受信手段）１０８が受信した音声信号が蓄積されるもので、本発明における音声信号蓄積手段、受信音声信号蓄積手段に対応する。音声信号蓄積手段や受信音声信号蓄積手段の具体的な構成は、図１には限定されない。たとえば、音声信号生成プログラムは、ワークＲＡＭ１０３上で作成され、音声蓄積ＲＡＭ１０５に蓄積されずに送信部１０７に送出されるようにもでき、この場合にはワークＲＡＭ１０３と音声蓄積ＲＡＭ１０５とが音声信号蓄積手段に対応する。なお、音声蓄積ＲＡＭ１０５には、たとえば携帯型電話機出荷時に定型読み音声が記憶されることもあるし、ユーザにより予め作成された定型読み音声が記憶されることもある。
【００２５】
文字列記憶ＲＡＭ１０６は、たとえばキー・ボタン１１０から操作Ｉ／Ｆ１０９を介して手入力された通話情報としての文字および／または文字列を記憶するもので、本発明の文字・文字列記憶手段に対応する。文字・文字列記憶手段の具体的な構成は、図１には限定されない。なお、文字列記憶ＲＡＭ１０６には、たとえば携帯型電話機出荷時に定型文字列が記憶されることもあるし、ユーザにより予め作成された定型文字列が記憶されることもある。
【００２６】
送信部１０７は、通話相手に通話信号（音声信号）を送信するもので、本発明の音声信号送信手段に対応する。送信部１０７は、ネットワークに接続することもできる。なお、ネットワーク上のウェブサイト等にアクセスしては、定型読み音声を音声蓄積ＲＡＭ１０５にダウンロードすることもできるし、定型文字列を文字列記憶ＲＡＭ１０６にダウンロードすることもできる。
【００２７】
受信部１０８は、通話相手から通話信号（音声信号）を受信するもので、前述したように本発明の音声信号受信手段に対応する。
操作Ｉ／Ｆ（インタフェース）１０９，キー・ボタン１１０は、本発明の操作手段に対応する。本実施形態では通信端末装置１は携帯型電話であるため、キー・ボタン１１０が操作装置であるが、たとえば通信端末装置がコンピュータやＰＤＡのときは、キーボード、マウス、入力ペン等を操作装置して使用することもできる。キー・ボタン１１０からは、電話番号やアドレスの入力・選択、いわゆる電話帳の所定欄へのデータ書き込み、送信情報の入力や選択、送信指示等の他、前述したように通話情報としての文字および／または文字列が手入力される。キー・ボタン１１０には、文字通話機能をオン・オフするキーが割り当てられている。
【００２８】
表示Ｉ／Ｆ１１１，ディスプレイ１１２は、本発明の表示手段に対応する。ディスプレイ１１２は、通常は液晶ディスプレイ（ＬＣＤ），ＬＥＤにより構成されるが、通信端末装置がコンピュータであるときはＣＲＴを表示手段として使用することもできる。ディスプレイ１１２には、通信端末装置１になされる各種の設定や、操作部１８から手入力された通話情報としての文字および／または文字列や、文字列記憶ＲＡＭ１０６から呼び出された文字および／または文字列が表示される。
【００２９】
マイクロホンＩ／Ｆ１１３，マイクロホン１１４は、本発明の音声入力手段に対応する。また、スピーカＩ／Ｆ１１５，スピーカ１１６は、本発明の音声出力手段に対応する。スピーカ１１６に代えて、イヤーホンを用いることもできる。
【００３０】
なお、図１には図示していないが、通信端末装置１にはメモリカードドライブ等の有線通信Ｉ／Ｆや赤外線Ｉ／Ｆを備えることもできる。また、たとえば通信端末装置がコンピュータやＰＤＡであるときには、メモリカードドライブ、フロッピーディスクドライブ、ハードディスクドライブ、光磁気ディスクドライブ等のＩ／Ｆを備えることができる。これらのＩ／Ｆを介して、通信端末装置のメーカ等が用意した定型読み音声や定型文字列を取り込むことができる。
【００３１】
図３（Ａ），（Ｂ）に、文字列記憶ＲＡＭ１０６に記憶された定型文字列のテーブル例を示す。図３（Ａ）のテーブルは、指定した記号により定型文字列を呼び出すことができる。また、図３（Ｂ）のテーブルには、通信相手の電話番号ごとに、過去に使用された定型文字列が記録され、指定した記号で通話中に所望の定型文字列を通信相手に送信することができる。なお、「はい。」「いいえ。」などの短メッセージ（フレーズ）、名詞や形容詞などの品詞単位を登録しておき、これらを接続できるようにしてもよい。
【００３２】
図４のフローチャートを参照して図１の通話装置１の受信に際しての文字通話機能（音声入力文字表示機能）を説明する。
ここでは、音声入力文字表示機能が予めオンになっているものとする。音声入力文字表示機能は、着信がある前にオンとしておくこともできるし、着信があった後に通話相手を確認してオンとすることができる。
【００３３】
音声受信があったか否かが監視されており（Ｓ１０１）、音声受信がないときは、さらに音声入力文字表示機能がオフになったかが監視され（Ｓ１０２）、音声入力文字表示機能がオフとなったときは、通常の音声受信に移行する（Ｓ１０３）。
【００３４】
音声信号受信があると当該音声信号は、音声蓄積ＲＡＭ１０５に蓄積され（Ｓ１０４）る。次いで、システムＲＯＭ１０２の読み音声信号生成プログラムが、音声信号を音声対応文字列に変換し（Ｓ１０５）、ディスプレイ１１２には音声対応文字列が表示される（Ｓ１０６）。
【００３５】
図５のフローチャートを参照して図１の通話装置１の送信に際しての文字通話機能（文字入力音声送信機能）を説明する。ここでは、文字入力音声送信機能が予めオンになっているものとする。文字入力音声送信機能は、着信がある前にオンとしておくこともできるし、着信があった後に通話相手を確認してオンすることができることは、図４の音声入力文字表示機能の場合と同様である。
【００３６】
ここでは、文字入力音声送信機能がオンになっているものとする。この場合には、キー・ボタン１１０からの文字入力が確定していない場合において（Ｓ２０１）文字入力音声送信機能がオフとなったときは（Ｓ２０２）、通常の音声送信に移行する（Ｓ２０３）。
【００３７】
文字入力が確定すると、ワークＲＡＭ１０３に入力文字列が記憶され（Ｓ２０４）、当該入力文字列は、読み音声信号に変換される（Ｓ２０５）。この読み音声信号は、文字列記憶ＲＡＭ１０６に記憶される一方（Ｓ２０６）、送信部に転送され（Ｓ２０７）、送信部１９７から通信相手に通信され（Ｓ２０８）、処理をＳ２０１に戻す。
【００３８】
なお、通話者（ユーザ）は、メッセージの入力に際して、文字列記憶ＲＡＭ１０６に記憶されたテーブル（図３（Ａ），（Ｂ））から所望の定型文字列を選択することができる。
【００３９】
図４に示した音声入力文字表示機能と、図５に示した文字入力音声送信機能は、いずれか一方をオンにすることもできるし、双方を同時にオンとすることもできる。
【００４０】
図５の例では、通話相手に読み音声信号として送信するメッセージを、文字列記憶ＲＡＭ１０６に記憶されている定型文字列を利用して作成したが、音声蓄積ＲＡＭ１０５に蓄積した定型音声を利用して作成することもできる。この場合、キー・ボタン１１０から入力した文字や文字列と、音声蓄積ＲＡＭ１０５に蓄積した定型音声を組み合わせた場合、１つのメッセージ中に異なる音質が混在することになる。たとえば、音声蓄積ＲＡＭ１０５に蓄積した定型音声が、通話者（ユーザ）が音声入力することにより作成された音声であり、文字列から作成された読み音声信号にかかる音声が人工音声または工場出荷時にメモリに保存されている通話者以外の音声である場合に、通話者が作成したメッセージに定型音声が含まれると、通話相手は複数音質が混在した極めて不自然な音声を聴取することになる。このようなことから、定型音声の音質を通話者の音声の音質に変換する機能を通信端末装置１に搭載しておくことができる。通話者の音声の音質に変換する手法として、通話者に「あいうえお・・・」等の５０音を音声入力した文をサンプリングして音素を作成しておき、文字列音声を作成するときに当該音素を用いることができる。この場合、形態素解析を利用してイントネーションを文字列音声に付与することもできる。
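[0041]
FIG. 6 shows the functional blocks of an intonation-added voice generation device that uses morphological analysis. In FIG. 6, the intonation-added voice generation device 2 includes morphological analysis means 21, intonation analysis means 22, basic intonation storage means 23, basic voice storage means 24, and synthesizing means 25.
[0042]
A character string is input to the morphological analysis means 21 from the operation means (such as the key buttons 110 in FIG. 1), and the morphological analysis means 21 analyzes the character string to find sentence breaks and the like. The intonation analysis means 22 analyzes the characteristics of the caller's (user's) speech, such as pausing and intonation habits, from sample speech (voice data recorded while a specific text is read) input from the voice input means (such as the microphone 114 in FIG. 1), and stores the analysis result (intonation information) in the basic intonation storage means 23. The basic voice storage means 24 stores phonemes taken from the sample speech, for example those of the 50 sounds of the Japanese syllabary.
[0043]
The synthesizing means 25 obtains the message (character string) and the information on the breaks in the message from the morphological analysis means 21, creates the message from the intonation information stored in the basic intonation storage means 23 and the phonemes of the 50 sounds stored in the basic voice storage means 24, and can output it to the transmitting means (the transmission unit 107 in FIG. 1).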
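The arrangement of FIG. 6 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the device's actual method: splitting at punctuation stands in for the morphological analysis means 21, a single phrase-final gain and a fixed pause length stand in for the stored intonation information, and the sample values in `BASIC_VOICE` are made up. All names are hypothetical.

```python
import re

# Hypothetical stand-ins for the data held by the device of FIG. 6.
# basic voice storage means 24: one short list of PCM samples per kana
BASIC_VOICE = {"あ": [10, 10], "い": [20, 20], "う": [30, 30]}
# basic intonation storage means 23: the user's pausing and intonation habits
BASIC_INTONATION = {"pause_samples": 2, "phrase_final_gain": 0.5}


def find_phrases(text):
    """Stand-in for morphological analysis means 21: split at punctuation only."""
    return [phrase for phrase in re.split("[、。]", text) if phrase]


def synthesize(text):
    """Stand-in for synthesizing means 25: phonemes plus breaks plus intonation."""
    out = []
    for phrase in find_phrases(text):
        for i, kana in enumerate(phrase):
            is_last = i == len(phrase) - 1
            gain = BASIC_INTONATION["phrase_final_gain"] if is_last else 1.0
            out.extend(int(sample * gain) for sample in BASIC_VOICE[kana])
        # insert a pause at each break found by the analysis step
        out.extend([0] * BASIC_INTONATION["pause_samples"])
    return out


print(synthesize("あい、う。"))  # [10, 10, 10, 10, 0, 0, 15, 15, 0, 0]
```

A real implementation would need an actual morphological analyzer and a richer intonation model; the sketch only shows how the three stored inputs meet in the synthesizing step.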
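[0044]
[Effects of the invention]
In the communication terminal device of the present invention, characters and character strings input from the operation means are converted into a reading voice signal by the voice signal generation means and can be sent to the other party via the transmitting means. The call can therefore continue even when the user is in an environment where speaking is not possible, and the other party is assured of the same call environment as in an ordinary call.
[0045]
In addition, in the communication terminal device of the present invention, the voice signal received by the receiving means can be shown on the display means as characters or character strings by the pronunciation-corresponding character string generation means. The call can therefore continue even when the user is in an environment where a speaker or earphone cannot be used, and the other party is assured of the same call environment as in an ordinary call.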
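[Brief description of the drawings]
[FIG. 1] A hardware block diagram showing one embodiment of the communication terminal device with a text call function of the present invention.
[FIG. 2] An explanatory diagram showing a configuration example of the system ROM.
[FIG. 3] (A) shows a table from which a fixed character string can be called by a designated symbol; (B) shows a table in which the fixed character strings used in the past are recorded for each telephone number of a communication partner and from which a desired fixed character string can be sent to the partner during a call by a designated symbol.
[FIG. 4] A flowchart for explaining the text call function at the time of reception (voice input character display function) of the communication device of FIG. 1.
[FIG. 5] A flowchart for explaining the text call function at the time of transmission (character input voice transmission function) of the communication device of FIG. 1.
[FIG. 6] Functional blocks of an intonation-added voice generation device that uses morphological analysis.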
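[Explanation of reference signs]
1 Communication terminal device
2 Intonation-added voice generation device
21 Morphological analysis means
22 Intonation analysis means
23 Basic intonation storage means
24 Basic voice storage means
25 Synthesizing means
101 CPU
102 System ROM
103 Work RAM
104 User RAM
105 Voice storage RAM
106 Character string storage RAM
107 Transmission unit
108 Reception unit
109 Operation I/F
110 Key buttons
111 Display I/F
112 Display
113 Microphone I/F
114 Microphone
115 Speaker I/F
116 Speaker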
100 Bus
[0001]
[Technical field of the invention]
The present invention relates to a voice call terminal device, such as a network-connected computer, personal digital assistant (PDA), or portable telephone, that allows a silent call to be held at the start of a voice call or during a voice call.
[0002]
[Prior art]
Voice calls are no longer limited to wired telephone lines. With advances in communication network technology, they can be made in various forms from communication devices such as PCs (personal computers), personal digital assistants (PDAs), and portable telephones that use Internet lines or wireless calling.
[0003]
In a voice call terminal device, a caller speaks toward a microphone, and a voice from a call partner is heard by a speaker (or an earphone) so that a real-time conversation is performed.
[0004]
For this reason, with a conventional voice call terminal device, incoming calls must in effect be restricted in public places where silence is required, such as conference rooms, movie theaters, and trains: the power is turned off, incoming calls are left unanswered, or the answering function is turned on so that the other party's message is recorded.
[0005]
With a conventional voice call terminal device, if an incoming call is left unanswered, the caller cannot obtain any message from the callee. If the caller's telephone number has been notified, the callee can call back later and ask about the matter, but when the matter is urgent the response may come too late.
[0006]
For this reason, in an environment where incoming calls should be restricted, the answering function of the voice call terminal device can be turned on so that the caller's message is recorded and listened to afterwards. Even in this case, however, the caller cannot talk with the callee in real time, so an urgent matter may still not be dealt with.
[0007]
To eliminate this inconvenience, a communication terminal device is known in which, when a dial key is operated on the other party's telephone, the same dial key operates on one's own telephone, so that a silent call is possible through the messages assigned to the dial keys (Japanese Patent Application Laid-Open No. H11-355404). By agreeing in advance with the other party on specific character strings for the dial keys, a silent call can be held by checking the character strings through the movement of the dial keys. Furthermore, if a text message has been created by the other party's key operations, that text message can be shown on the display.
[0008]
A communication terminal device is also known in which an answering response is entered in advance as character codes and stored in storage means and, when a call arrives, the character code string is speech-synthesized and the corresponding voice message is sent to the caller (see Japanese Patent Application Laid-Open No. 2000-216875).
[0009]
Further, a silent response method is also known in which, for environments where use of a portable telephone is restricted, such as on a train, reply messages are prepared in the telephone exchange and one of them is designated and sent to the caller (see Japanese Patent Application Laid-Open No. H11-205437).
[0010]
[Problems to be solved by the invention]
However, with the technology described in Japanese Patent Application Laid-Open No. H11-355404, messages can be checked visually and response messages can be composed with some freedom, but making a silent call with the dial keys requires the effort of agreeing in advance with the other party on the specific character strings that correspond to the keys, and because the operations differ from one setter to another there is a risk of erroneous operation. In addition, the sender must carry out the conversation by key operation even when in a situation where speaking is possible, which is inconvenient.
[0011]
Further, with the technique described in Japanese Patent Application Laid-Open No. 2000-216875, the sender can receive a message by voice, but the voice messages are fixed in advance and associated with symbols, so the number of messages is limited, and real-time communication is not possible.
[0012]
Further, in the technology described in Japanese Patent Application Laid-Open No. H11-205437, the receiving side listens to the voice from the transmitting side through an earphone and cannot check the content visually as text. In addition, because fixed response messages are used, the content is fixed and a message suited to what was received cannot be sent.
[0013]
An object of the present invention is to provide a communication terminal device with a text call function with which, when the caller is in an environment where speaking is not possible during a telephone call, a conversational sentence is composed from characters or character strings and sent to the other party as a voice signal, so that the caller can converse with the other party without speaking.
[0014]
Another object of the present invention is to provide a communication terminal device with a text call function with which, when the caller is in an environment where listening is not possible during a telephone call, a conversational sentence in characters or character strings is created from the voice signal received from the other party and shown on the display means, so that the caller can converse with the other party without listening to (or sounding) the other party's voice.
[0015]
Still another object of the present invention is to provide a communication terminal device with a text call function with which, when the caller is in an environment where neither speaking nor listening is possible during a telephone call, a conversational sentence composed of characters or character strings is sent to the other party as a voice signal, and a conversational sentence in characters or character strings is created from the voice signal received from the other party and shown on the display means, so that the caller can converse with the other party without speaking and without listening to (or sounding) the other party's voice.
[0016]
[Means for Solving the Problems]
A communication terminal device with a text call function according to the present invention comprises: operation means; display means; voice input means to which the caller's voice is input; voice signal transmitting means for transmitting to the other party a voice signal of the voice input from the voice input means; voice signal receiving means for receiving a voice signal of the other party's voice; voice output means for outputting, as voice, the voice signal received by the voice signal receiving means; character/character string storage means for storing at least characters and/or character strings manually input from the operation means as call information; reading voice signal generation means for reading out the characters and/or character strings stored in the character/character string storage means and generating a voice signal of their reading voice; voice signal storage means for storing at least the reading voice signal generated by the reading voice signal generation means; and reading voice signal transfer means for transferring the reading voice signal stored in the voice signal storage means to the voice signal transmitting means.
[0017]
In addition, a communication terminal device with a text call function according to the present invention comprises: operation means; display means; voice input means to which the caller's voice is input; voice signal transmitting means for transmitting to the other party a voice signal of the voice input from the voice input means; voice signal receiving means for receiving a voice signal of the other party's voice; voice output means for outputting, as voice, the voice signal received by the voice signal receiving means; character/character string storage means for storing at least characters and/or character strings manually input from the operation means as call information; reading voice signal generation means for reading out the characters and/or character strings stored in the character/character string storage means and generating a voice signal of their reading voice; voice signal storage means for storing at least the reading voice signal generated by the reading voice signal generation means; reading voice signal transfer means for transferring the reading voice signal stored in the voice signal storage means to the voice signal transmitting means; received voice signal storage means for storing the voice signal received by the voice signal receiving means; pronunciation-corresponding character string generation means for reading out the voice signal stored in the received voice signal storage means and generating the characters or character string corresponding to the pronunciation of that voice; and pronunciation-corresponding character string transfer means for transferring the character string generated by the pronunciation-corresponding character string generation means to the display means.
[0018]
In the above communication terminal device with a text call function, the character strings stored in the character/character string storage means can include fixed character strings created in advance, and/or the reading voice signals stored in the voice signal storage means can include voice signals of fixed reading voices created in advance. The fixed character strings stored in the character/character string storage means and/or the fixed reading voices stored in the voice signal storage means can also be associated with specific symbols. Further, these fixed character strings and/or fixed reading voices can be saved so that they can be read out using the operation means and the display means.
[0019]
Further, a communication terminal device with a text call function according to the present invention comprises: operation means; display means; voice input means to which the caller's voice is input; voice signal transmitting means for transmitting to the other party a voice signal of the voice input from the voice input means; voice signal receiving means for receiving a voice signal of the other party's voice; voice output means for outputting, as voice, the voice signal received by the voice signal receiving means; received voice signal storage means for storing the voice signal received by the voice signal receiving means; pronunciation-corresponding character string generation means for reading out the voice signal stored in the received voice signal storage means and generating the characters or character string corresponding to the pronunciation of that voice; and pronunciation-corresponding character string transfer means for transferring the character string generated by the pronunciation-corresponding character string generation means to the display means.
[0020]
[Embodiments of the invention]
FIG. 1 is a hardware block diagram showing one embodiment of a communication terminal device with a text communication function of the present invention.
[0021]
In FIG. 1, the communication terminal device 1 with a text call function is a mobile phone and includes a CPU 101, a system ROM 102, a work RAM (system RAM) 103, a user RAM 104, a voice storage RAM 105, a character string storage RAM 106, a transmission unit 107, a reception unit 108, an operation I/F (interface) 109, key buttons 110, a display I/F 111, a display 112, a microphone I/F 113, a microphone 114, a speaker I/F 115, and a speaker 116, all connected to a bus 100. Components not needed for understanding the present invention, such as a clock, a power supply, and a vibrator, are omitted from FIG. 1.
[0022]
The CPU 101 controls the mobile phone 1 as a whole. As shown in FIG. 2, the system ROM 102 stores, on top of the system program (operating system, various drivers, and so on), a reading voice signal generation program, a voice-compatible character string generation program, a fixed character string / fixed voice utilization program, and various application programs (a communication program, an Internet connection program, and so on). The reading voice signal generation program and the CPU 101 correspond to the reading voice signal generation means of the present invention, and the voice-compatible character string generation program and the CPU 101 correspond to the voice-compatible character string generation means of the present invention. The fixed character string / fixed voice utilization program stores the fixed character strings and fixed voices that have been used in tables such as those illustrated in FIGS. 3A and 3B, described later, and can call up a desired fixed character string or fixed voice when needed by means of a symbol. The system program includes the transfer program of the present invention (corresponding to the voice signal transfer means).
[0023]
The work RAM 103 is used as a work area of the system. The user RAM 104 stores user input information such as a communication partner information list (so-called telephone directory) and user setting information.
[0024]
The voice storage RAM 105 stores, for example, the reading voice signal generated by the voice signal generation program and the voice signal received by the reception unit (voice signal receiving means) 108, and corresponds to the voice signal storage means and the received voice signal storage means of the present invention. The specific configuration of these means is not limited to that of FIG. 1. For example, the reading voice signal may be created in the work RAM 103 and sent to the transmission unit 107 without being stored in the voice storage RAM 105; in this case the work RAM 103 and the voice storage RAM 105 together correspond to the voice signal storage means. The voice storage RAM 105 may hold fixed reading voices stored when the mobile phone is shipped, or fixed reading voices created in advance by the user.
[0025]
The character string storage RAM 106 stores, for example, characters and/or character strings manually input as call information from the key buttons 110 via the operation I/F 109, and corresponds to the character/character string storage means of the present invention. The specific configuration of the character/character string storage means is not limited to that of FIG. 1. The character string storage RAM 106 may hold fixed character strings stored when the mobile phone is shipped, or fixed character strings created in advance by the user.
[0026]
The transmission unit 107 transmits a call signal (voice signal) to the other party and corresponds to the voice signal transmitting means of the present invention. The transmission unit 107 can also connect to a network. By accessing a website or the like on the network, fixed reading voices can be downloaded to the voice storage RAM 105 and fixed character strings can be downloaded to the character string storage RAM 106.
[0027]
The receiving unit 108 receives a call signal (voice signal) from a call partner, and corresponds to the voice signal receiving unit of the present invention as described above.
The operation I/F (interface) 109 and the key buttons 110 correspond to the operation means of the present invention. In this embodiment the communication terminal device 1 is a mobile phone, so the key buttons 110 serve as the operating device; when the communication terminal device is a computer or a PDA, a keyboard, a mouse, an input pen, or the like can be used instead. The key buttons 110 are used to enter and select telephone numbers and addresses, write data into the fields of the so-called telephone directory, enter and select information to send, and give transmission instructions, and, as described above, to manually input characters and/or character strings as call information. One of the key buttons 110 is assigned to turning the text call function on and off.
[0028]
The display I/F 111 and the display 112 correspond to the display means of the present invention. The display 112 is usually a liquid crystal display (LCD) or LEDs, but when the communication terminal device is a computer, a CRT can also be used as the display means. The display 112 shows the various settings made on the communication terminal device 1, the characters and/or character strings manually input as call information from the operation unit 18, and the characters and/or character strings called up from the character string storage RAM 106.
[0029]
The microphone I/F 113 and the microphone 114 correspond to the voice input means of the present invention. The speaker I/F 115 and the speaker 116 correspond to the voice output means of the present invention. An earphone can be used instead of the speaker 116.
[0030]
Although not shown in FIG. 1, the communication terminal device 1 may include a wired communication I/F such as a memory card drive, or an infrared I/F. When the communication terminal device is a computer or a PDA, for example, it can include an I/F such as a memory card drive, a floppy disk drive, a hard disk drive, or a magneto-optical disk drive. Fixed reading voices and fixed character strings prepared by the maker of the communication terminal device or others can be loaded through these I/Fs.
[0031]
FIGS. 3A and 3B show examples of tables of fixed character strings stored in the character string storage RAM 106. With the table of FIG. 3A, a fixed character string can be called up by a designated symbol. In the table of FIG. 3B, the fixed character strings used in the past are recorded for each telephone number of a communication partner, and a desired fixed character string can be sent to the partner during a call by a designated symbol. Short messages (phrases) such as "Yes." and "No.", and part-of-speech units such as nouns and adjectives, may also be registered so that they can be joined together.
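The two tables can be pictured with a small sketch. It assumes that a symbol is a short key string and that the per-partner table records which symbols were used with each telephone number; the symbols, phrases, and telephone number below are invented for illustration and are not taken from FIG. 3.

```python
from collections import defaultdict

# Hypothetical symbol-to-phrase table in the spirit of FIG. 3A.
PHRASES_BY_SYMBOL = {
    "#1": "Yes.",
    "#2": "No.",
    "#3": "I am in a meeting.",
    "#4": "I will call you back",
    "#5": "in ten minutes.",
}

# Hypothetical per-partner table in the spirit of FIG. 3B:
# telephone number -> symbols of the fixed strings used with that partner.
history_by_number = defaultdict(list)


def recall(symbols, number=None):
    """Join the phrases for the given symbols; optionally record them per partner."""
    phrases = [PHRASES_BY_SYMBOL[symbol] for symbol in symbols]
    if number is not None:
        for symbol in symbols:
            if symbol not in history_by_number[number]:
                history_by_number[number].append(symbol)
    return " ".join(phrases)


message = recall(["#3", "#4", "#5"], number="090-0000-0000")
print(message)  # I am in a meeting. I will call you back in ten minutes.
```

Joining registered fragments, as in `"#4"` plus `"#5"`, mirrors the remark that short phrases and part-of-speech units may be connected.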
[0032]
The text call function at the time of reception (voice input character display function) of the communication device 1 of FIG. 1 will be described with reference to the flowchart of FIG. 4.
Here, it is assumed that the voice input character display function is turned on in advance. The voice input character display function can be turned on before there is an incoming call, or can be turned on by confirming the other party after receiving the incoming call.
[0033]
Whether a voice signal has been received is monitored (S101). While no voice has been received, whether the voice input character display function has been turned off is also monitored (S102); if it has been turned off, the process shifts to normal voice reception (S103).
[0034]
When the voice signal is received, the voice signal is stored in the voice storage RAM 105 (S104). Next, the reading voice signal generation program in the system ROM 102 converts the voice signal into a voice-compatible character string (S105), and the voice-compatible character string is displayed on the display 112 (S106).
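The reception flow S101 to S106 can be sketched as a simple loop. This is only an illustration under assumptions: events arrive as a finite list, and `recognize` is a hypothetical speech-to-text callable supplied by the caller (the placeholder here just decodes bytes). It is not a real recognizer.

```python
def receive_loop(events, recognize):
    """Walk through S101-S106 for a finite list of events.

    events: iterable of ("voice", signal), ("off", None), or anything else (idle).
    recognize: hypothetical speech-to-text callable, signal -> str.
    """
    voice_store = []  # stands in for the voice storage RAM 105
    displayed = []    # stands in for text shown on the display 112
    for kind, payload in events:
        if kind == "voice":                  # S101: a voice signal arrived
            voice_store.append(payload)      # S104: store the signal
            text = recognize(payload)        # S105: convert the signal to text
            displayed.append(text)           # S106: show the text
        elif kind == "off":                  # S102: the function was turned off
            return displayed, "normal_reception"  # S103
    return displayed, "text_display"


def fake_recognize(signal):
    # Placeholder: the "signal" is just the UTF-8 bytes of the words spoken.
    return signal.decode("utf-8")


shown, mode = receive_loop(
    [("voice", b"Where are you?"), ("idle", None), ("off", None)],
    fake_recognize,
)
print(shown, mode)
```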
[0035]
The text call function at the time of transmission (character input voice transmission function) of the communication device 1 of FIG. 1 will be described with reference to the flowchart of FIG. 5. Here, it is assumed that the character input voice transmission function has been turned on in advance. As with the voice input character display function of FIG. 4, the character input voice transmission function can be turned on before a call arrives, or after a call arrives and the other party has been identified.
[0036]
With the character input voice transmission function turned on, if character input from the key button 110 has not been confirmed (S201) and the character input voice transmission function has been turned off (S202), the process shifts to normal voice transmission (S203).
[0037]
When the character input is confirmed, the input character string is stored in the work RAM 103 (S204), and the input character string is converted into a reading voice signal (S205). This reading voice signal is stored in the character string storage RAM 106 (S206), transferred to the transmission unit 107 (S207), transmitted from the transmission unit 107 to the communication partner (S208), and the process returns to S201.
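The transmission loop S201 to S208 can likewise be sketched as follows. This is a hedged illustration rather than the disclosed firmware: the event tuples, the `synthesize` callback (a stand-in for the text-to-voice conversion of S205), and the in-memory lists standing in for the RAM areas and the transmission unit are all assumptions.

```python
def run_transmit_loop(events, synthesize):
    """Process (confirmed_text_or_None, function_on) events per S201-S208.

    Returns a status string and the list of signals handed to the
    transmission unit. `synthesize` is a hypothetical text-to-voice callback.
    """
    work_ram = []        # stands in for the work RAM 103
    stored_signals = []  # stands in for the storage used in S206
    sent = []            # signals passed to the transmission unit
    for text, function_on in events:
        if text is None:                     # S201: input not confirmed
            if not function_on:              # S202: function turned off?
                return "normal voice transmission", sent  # S203
            continue                         # keep waiting at S201
        work_ram.append(text)                # S204: store the input string
        signal = synthesize(text)            # S205: convert to reading voice
        stored_signals.append(signal)        # S206: store the signal
        sent.append(signal)                  # S207-S208: transfer and transmit
        # control returns to S201 on the next iteration
    return "waiting for input", sent
```

Each confirmed input produces exactly one transmitted signal, and the loop leaves the character mode only when no input is pending and the function has been switched off.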
[0038]
The caller (user) can select a desired fixed character string from the tables (FIGS. 3A and 3B) stored in the character string storage RAM 106 when inputting a message.
[0039]
Either the voice input character display function shown in FIG. 4 or the character input voice transmission function shown in FIG. 5 can be turned on, or both can be turned on at the same time.
[0040]
In the example of FIG. 5, the message to be read aloud and transmitted as a voice signal to the other party is created using the fixed character strings stored in the character string storage RAM 106, but it can also be created using the fixed voice stored in the voice storage RAM 105. In this case, when a character or a character string input from the key button 110 is combined with the fixed voice stored in the voice storage RAM 105, different voice qualities are mixed in one message. For example, suppose that the fixed voice stored in the voice storage RAM 105 was recorded from the voice of the caller (user), while the voice of the reading voice signal generated from a character string is an artificial voice or a voice other than the caller's that was stored in memory at factory shipment. If a message created by the caller then includes a fixed voice, the other party hears an extremely unnatural voice in which a plurality of voice qualities are mixed. For this reason, the communication terminal device 1 can be provided with a function of converting the voice quality of the fixed voice into the voice quality of the caller's voice. As a method of converting to the voice quality of the caller, phonemes created by sampling speech in which the caller has input the 50 sounds such as "aiueo ..." can be used. In this case, intonation can be added to the character string voice using morphological analysis.
[0041]
FIG. 6 shows functional blocks of a speech generation device with intonation using morphological analysis. In FIG. 6, the intonation-added speech generating device 2 includes a morphological analysis unit 21, an intonation analysis unit 22, a basic intonation storage unit 23, a basic speech storage unit 24, and a synthesis unit 25.
[0042]
A character string is input to the morphological analysis unit 21 from an operation unit (such as the key button 110 in FIG. 1), and the morphological analysis unit 21 analyzes the character string to identify the breaks in the sentence. The intonation analysis unit 22 analyzes the characteristics of the utterance of the caller (user) from a sample voice (voice data recorded when reading a specific document) input from the voice input means (such as the microphone 114 in FIG. 1), and the analysis result (intonation information) is stored in the basic intonation storage unit 23. The basic voice storage unit 24 stores, for example, 50 phonemes taken from the sample voice.
[0043]
The synthesizing unit 25 acquires the message (character string) and the break information for the message from the morphological analysis unit 21, creates a voice message from the intonation information stored in the basic intonation storage unit 23 and the 50 phonemes stored in the basic voice storage unit 24, and can output it to the transmitting means (the transmission unit 107 in FIG. 1).
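The data flow of FIG. 6 can be illustrated with a toy sketch. Everything here is an assumption made for illustration: break detection is reduced to splitting at spaces (a real morphological analyzer is far more involved), phonemes are represented by labels rather than sampled audio, and the intonation information is reduced to a short list of pitch factors applied by position within each phrase.

```python
def analyze_breaks(text):
    """Toy stand-in for the morphological analysis unit 21: split at spaces."""
    return text.split()


def synthesize_message(text, phonemes, intonation):
    """Toy stand-in for the synthesizing unit 25.

    phonemes:   dict mapping a character to a stored phoneme label
                (stands in for the basic voice storage unit 24).
    intonation: list of pitch factors by position within a phrase
                (stands in for the basic intonation storage unit 23).
    Returns a list of (phoneme, pitch) pairs with a pause after each phrase.
    """
    output = []
    for phrase in analyze_breaks(text):
        for i, ch in enumerate(phrase):
            # Reuse the last stored factor when the phrase is longer
            # than the stored contour.
            pitch = intonation[min(i, len(intonation) - 1)]
            output.append((phonemes[ch], pitch))
        output.append(("<pause>", 0.0))  # mark the break between phrases
    return output
```

The point of the sketch is only the division of roles: break information comes from the text, phonemes and intonation come from the caller's sample voice, and the synthesis step combines the three.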
[0044]
[Effect of the Invention]
In the communication terminal device according to the present invention, a character or a character string input from the operation means is read aloud by the reading voice signal generation means, converted into a voice signal, and can be transmitted to the other party via the transmission means, so that the call can be continued, and the other party is assured of the same call environment as in a normal call.
[0045]
Further, in the communication terminal device of the present invention, the voice signal received by the receiving means can be displayed on the display means as a character or a character string by the pronunciation corresponding character string generation means, so that even when the user cannot use the speaker or the earphone, the call can be continued, and the other party is assured of the same call environment as in a normal call.
[Brief description of the drawings]
FIG. 1 is a hardware block diagram showing an embodiment of a communication terminal device with a text communication function of the present invention.
FIG. 2 is an explanatory diagram showing a configuration example of a system ROM.
FIG. 3A shows a table in which a fixed character string can be called by a designated symbol, and FIG. 3B shows a table in which fixed character strings used in the past are recorded for each telephone number of a communication partner, so that a desired fixed character string can be transmitted to the communication partner during a call using the designated symbol.
FIG. 4 is a flowchart for explaining the text communication function (voice input character display function) at the time of reception by the communication device of FIG. 1.
FIG. 5 is a flowchart for explaining the text communication function (character input voice transmission function) at the time of transmission by the communication device of FIG. 1.
FIG. 6 is a functional block diagram of a speech generation device with intonation using morphological analysis.
[Explanation of symbols]
1 Communication terminal device
2 Speech generation device with intonation
21 Morphological analysis means
22 Intonation analysis means
23 Basic intonation storage means
24 Basic speech storage means
25 Synthesis means
101 CPU
102 System ROM
103 Work RAM
104 User RAM
105 Voice storage RAM
106 Character string storage RAM
107 Transmission unit
108 Receiving unit
109 Operation I/F
110 Key button
111 Display I/F
112 Display
113 Microphone I/F
114 Microphone
115 Speaker I/F
116 Speaker
100 Bus

Claims

Operating means;
Display means;
Voice input means for inputting the voice of the caller;
Sound signal transmitting means for transmitting a sound signal relating to the sound input from the sound input means to the other party,
Voice signal receiving means for receiving a voice signal relating to the voice of the other party;
Audio output means for outputting the audio signal received by the audio signal receiving means as audio,
Character/character string storage means for storing at least characters and/or character strings manually input from the operation means as call information;
Reading voice signal generation means for reading a character and / or a character string stored in the character / character string storage means and generating a voice signal relating to a reading voice of the character and / or the character string;
At least, an audio signal storage unit that stores the read audio signal generated by the read audio signal generation unit,
A read audio signal transfer unit that transfers the read audio signal stored in the audio signal storage unit to the audio signal transmission unit,
A communication terminal device with a character call function, comprising:

Operating means;
Display means;
Voice input means for inputting the voice of the caller;
Sound signal transmitting means for transmitting a sound signal relating to the sound input from the sound input means to the other party,
Voice signal receiving means for receiving a voice signal relating to the voice of the other party;
Audio output means for outputting the audio signal received by the audio signal receiving means as audio,
A reception audio signal storage unit that stores the audio signal received by the audio signal reception unit,
Pronunciation corresponding character string generation means for reading the voice signal stored in the received voice signal storage means and generating a pronunciation corresponding character or a pronunciation corresponding character string for the voice related to the voice signal,
Pronunciation corresponding character string transfer means for transferring the pronunciation corresponding character string generated by the pronunciation corresponding character string generation means to the display means,
A communication terminal device with a character call function, comprising:

Operating means;
Display means;
Voice input means for inputting the voice of the caller;
Sound signal transmitting means for transmitting a sound signal relating to the sound input from the sound input means to the other party,
Voice signal receiving means for receiving a voice signal relating to the voice of the other party;
Audio output means for outputting the audio signal received by the audio signal receiving means as audio,
Character/character string storage means for storing at least characters and/or character strings manually input from the operation means as call information;
Reading voice signal generating means for reading a character and / or a character string stored in the character / character string storing means and generating a voice signal relating to a reading voice of the character and / or the character string;
At least, an audio signal storage unit that stores the read audio signal generated by the read audio signal generation unit,
A read audio signal transfer unit that transfers the read audio signal stored in the audio signal storage unit to the audio signal transmission unit,
A reception audio signal storage unit that stores the audio signal received by the audio signal reception unit,
Pronunciation corresponding character string generation means for reading the voice signal stored in the received voice signal storage means and generating a pronunciation corresponding character or a pronunciation corresponding character string for the voice related to the voice signal,
Pronunciation corresponding character string transfer means for transferring the pronunciation corresponding character string generated by the pronunciation corresponding character string generation means to the display means,
A communication terminal device with a character call function, comprising:

The communication terminal device with a character call function according to claim 1 or 3, wherein the character string stored in the character/character string storage means includes a pre-formed fixed character string, and/or the reading voice signal stored in the voice signal storage means includes a voice signal of a pre-formed fixed reading voice.

The communication terminal device with a character call function, wherein the fixed character string stored in the character/character string storage means and/or the fixed reading voice stored in the voice signal storage means is associated with a specific symbol.

The communication terminal device with a character call function according to claim 4 or 5, wherein the fixed character string stored in the character/character string storage means and/or the fixed reading voice stored in the voice signal storage means is stored so as to be readable using the operation means and the display means.