JP2004048277A

JP2004048277A - Communication system

Info

Publication number: JP2004048277A
Application number: JP2002201591A
Authority: JP
Inventors: Shuhei Yasuda; 安田　周平
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2002-07-10
Filing date: 2002-07-10
Publication date: 2004-02-12

Abstract

PROBLEM TO BE SOLVED: To confirm call contents in real time without providing a speech recognition section in a communication terminal 20.

SOLUTION: The communication terminal 20 has a display 21 for displaying text data distributed via a network 10, conducts calls with other communication terminals via the network 10, and transmits voice data of the call contents to a speech recognition section 11 via the network 10 during the call. The speech recognition section 11 performs speech recognition on the voice data received from the communication terminal 20. A text data generation section 12 generates text data from the speech recognition result of the speech recognition section 11 and distributes the text data to the communication terminal 20 via the network 10 in real time.

COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
この発明は、通信端末装置がネットワークを介して他と通信を行う通信システムに係り、特に、通信端末装置における通話内容を確認するための通信システムに関するものである。
【０００２】
【従来の技術】
図５は従来の通信システムの構成を示す図である。
図５において、１１０は従来の通信システムのネットワーク、１２０は従来の通信システムの通信端末装置である。通信端末装置１２０には、メモリ部１２２が内蔵されている。また例えば、特開２００１−１１１６８６公報に開示された通信端末装置１２０には、メモリ部１２２に加えて、音声認識部１２１も設けられている。
【０００３】
次に動作について説明する。
不図示の他の通信端末装置とネットワーク１１０を介して通話中の通信端末装置１２０は、必要に応じて通話内容を音声データとしてメモリ部１２２に保存する。通話終了後に、通信端末装置１２０の通話者は、メモリ部１２２から音声データを読み出して保存した通話内容を確認できる。
【０００４】
また、メモリ部１２２と音声認識部１２１とを備えた通信端末装置１２０の場合には、通話内容の音声データを音声認識部１２１によって文字データとして認識し、メモリ部１２２に文字データを保存する。通信端末装置１２０で送信・受信した音声データを対応する文字データに変換して保存・表示するため、通話者は、通話終了後に通話内容を確認できるばかりでなく、通話中であっても通話内容をリアルタイムで把握可能になっている。さらに、その保存した文字データを不図示の無線基地局へ送信し、無線基地局のサーバに文字データを格納することもできる。
【０００５】
【発明が解決しようとする課題】
従来の通信システムは以上のように構成されているので、通話内容を音声データとしてメモリ部に保存するため、通話内容を通話中に確認できないという課題があった。
【０００６】
また、従来の通信システムは、通話内容の音声データを音声認識して文字データとして保存するため、通信端末装置に音声認識部を設ける必要があり、通信端末装置が大型化してしまうという課題があった。
【０００７】
この発明は上記のような課題を解決するためになされたもので、通信端末装置に音声認識部を設けることなく、通話内容をリアルタイムに確認できる通信システムを提供することを目的とする。
【０００８】
【課題を解決するための手段】
この発明に係る通信システムは、通信端末装置から音声データを受信すると、音声データを音声認識する音声認識部と、音声認識結果からテキストデータを生成して通信端末装置へリアルタイム配信するテキストデータ生成部とをネットワークに備え、テキストデータ生成部から配信されたテキストデータを表示する表示部を通信端末装置が備えるとともに、通話内容の音声データを通信端末装置が音声認識部へ通話中に送信するようにしたものである。
【０００９】
この発明に係る通信システムは、通信端末装置から音声データが送信されると、通信端末装置の通話者が使用する言語を識別する通話者言語識別部と、識別した通話者の言語に対応して選択され、通話者言語識別部からの音声データを音声認識する複数の音声認識部と、複数の音声認識部に対応して設けられ、音声認識部の音声認識結果からテキストデータを生成し、通信端末装置へリアルタイム配信する複数のテキストデータ生成部とをネットワークに備えるようにしたものである。
【００１０】
この発明に係る通信システムは、音声認識部の音声認識結果を指定言語に翻訳してテキストデータ生成部へ与える翻訳部をネットワークに備えるようにしたものである。
【００１１】
この発明に係る通信システムは、通信端末装置からメール配信先のアドレスが指定されるアドレス帳と、通信端末装置から音声データを受信すると、音声データを音声認識する音声認識部と、音声認識結果からテキストデータを生成するテキストデータ生成部と、テキストデータを蓄積するとともに、通話終了後に、アドレス帳に指定されたメール配信先のアドレスへテキストデータのメール配信を行う蓄積部とをネットワークに備え、通信端末装置が、蓄積部から配信されたテキストデータを表示する表示部を備えるとともに、通話内容の音声データを音声認識部へ通話中に送信するようにしたものである。
【００１２】
【発明の実施の形態】
以下、この発明の実施の一形態を説明する。
実施の形態１．
図１はこの発明の実施の形態１による通信システムの構成を示す図である。
図１において、１０は実施の形態１による通信システムのネットワーク、２０は実施の形態１による通信システムの通信端末装置である。ネットワーク１０には音声認識部１１とテキストデータ生成部１２とが設けられており、また通信端末装置２０は表示部２１を備えている。
【００１３】
次に動作について説明する。
図１において、通信端末装置２０と不図示の他の通信端末装置がネットワーク１０を媒介して通話中に、通話内容の音声データ（上り・下りとも）がネットワーク１０を介して音声認識部１１へ送信される。音声認識部１１は、通信端末装置２０や不図示の他の通信端末装置から受信した音声データを音声認識してテキストデータ生成部１２へ送る。テキストデータ生成部１２は、音声認識部１１の音声認識結果から文字データであるテキストデータを生成する。
【００１４】
ネットワーク１０は、音声データの通信チャネルとは別の通信チャネルを利用して、テキストデータ生成部１２が生成したテキストデータを通話者側の通信端末装置２０へリアルタイム配信する。２本の通信チャネルのうちの一方を介して受信したテキストデータは、通信端末装置２０の表示部２１に表示されるため、通信端末装置２０の通話者は、通話内容を通話中にリアルタイムで確認することができる。
【００１５】
ここで、通信端末装置２０は、ネットワーク１０と２本の通信チャネルで通信している。２本の通信チャネルとは、音声データ用の一方の通信チャネルとテキストデータ用の他方の通信チャネルである。例えば、ネットワーク１０が無線通信システムの場合には、Ｗ−ＣＤＭＡシステムのトランスポートチャネルＤＣＨを複数用いて、音声データとテキストデータとをリアルタイムに通信する。また例えば、ネットワーク１０が有線通信システムの場合には、ＩＳＤＮでＢチャネルを通信チャネルとして２本用いて、一方のＢチャネルを用いて音声データを、他方のＢチャネルを用いてテキストデータをリアルタイムに通信する。
【００１６】
このように、通信端末装置２０では、音声データ用の通信チャネルを用いて、他の通信端末装置と音声データの通話が行われると同時に、テキストデータ用の通信チャネルを用いて、通話内容のテキストデータがテキストデータ生成部１２によってネットワーク１０から通話中にリアルタイム配信されるようになっている。通信端末装置２０は受信したテキストデータを表示部２１に表示するので、通信端末装置２０に音声認識部を設けることなく、ユーザは通話中にリアルタイムで通話内容を確認できる。また、リアルタイムで通話内容をテキストデータとして確認できるため、聴覚障害者の利用や騒音環境下での利用も可能になる。
【００１７】
以上のように、この実施の形態１によれば、通信端末装置２０は、ネットワーク１０を介して配信されたテキストデータを表示する表示部２１を備え、他の通信端末装置とネットワーク１０を介して通話するとともに、通話内容の音声データをネットワーク１０を介して音声認識部１１へ通話中に送信し、音声認識部１１は、通信端末装置２０から受信した音声データを音声認識し、テキストデータ生成部１２は、音声認識部１１の音声認識結果からテキストデータを生成し、ネットワーク１０を介して通信端末装置２０へテキストデータをリアルタイム配信するようにしたので、通信端末装置２０に音声認識部を設けることなく、通話者は通話中に通話内容をリアルタイムで確認できるという効果が得られる。加えて、聴覚障害者の利用や騒音環境下での利用が可能になるという効果が得られる。
【００１８】
実施の形態２．
実施の形態１では、通話内容のテキストデータを通話中にリアルタイム配信するようにしたが、このテキストデータを通話者の言語に対応したテキストデータに変換しても良い。
【００１９】
図２はこの発明の実施の形態２による通信システムの構成を示す図である。図１と同一符号は同一または相当する構成を示している。図２において、１３は通話者言語識別部、１１−１〜１１−Ｎはそれぞれ各言語の音声認識部、１２−１〜１２−Ｎはそれぞれ各言語音声認識部１１−１〜１１−Ｎに対応したテキストデータ生成部である。
【００２０】
次に動作について説明する。
通話者言語識別部１３は、ネットワーク１０を介して通信端末装置２０から音声データが送信されると、通信端末装置２０の通話者が使用している言語を識別する。ここで、通話者言語識別部１３は、例えば通話者の通信端末装置２０の電話番号や認証情報などから通話者の言語を識別する。次に、識別した通話者の言語に対応する音声認識部１１−１〜１１−Ｎの一つを通話者言語識別部１３が選択して音声データを与えると、選択された音声認識部１１−１〜１１−Ｎは音声データを音声認識する。
【００２１】
各言語の音声認識部１１−１〜１１−Ｎはそれぞれ各言語に対応したテキストデータ生成部１２−１〜１２−Ｎと接続されており、テキストデータ生成部１２−１〜１２−Ｎは音声認識部１１−１〜１１−Ｎの音声認識結果から該当言語のテキストデータを生成し、通信端末装置２０へリアルタイム配信する。以下、実施の形態１と同様に、通信端末装置２０で受信したテキストデータが表示部２１に表示される。通話者言語識別部１３，各言語の音声認識部１１−１〜１１−Ｎ，テキストデータ生成部１２−１〜１２−Ｎを備えることにより、多様な通話者の言語に対応してテキストデータを生成でき、通話者は通話中に通話内容の確認をリアルタイムに行うことができる。
【００２２】
以上のように、この実施の形態２によれば、ネットワーク１０を介して通信端末装置２０から音声データが送信されると、通信端末装置２０の通話者が使用している言語を識別する通話者言語識別部１３と、識別した通話者の言語に対応して選択され、通話者言語識別部１３からの音声データを音声認識する音声認識部１１−１〜１１−Ｎと、音声認識部１１−１〜１１−Ｎに対応して設けられ、音声認識部１１−１〜１１−Ｎの音声認識結果からテキストデータを生成し、通信端末装置２０へリアルタイム配信するテキストデータ生成部１２−１〜１２−Ｎとを備えるようにしたので、多様な通話者の言語に対応してテキストデータを生成でき、通信端末装置２０に音声認識部を設けることなく、通話者は通話中に通話内容をリアルタイムで確認できるという効果が得られる。
【００２３】
実施の形態３．
実施の形態２では、通話者の使用言語を識別して、その該当言語でテキストデータを生成して配信するようにしたが、テキストデータ生成の際の言語を特定し、その言語に翻訳しても良い。このことにより、通話者言語に依存することなく、異言語同士の通話者間においても、テキストデータの配信が可能になる。
【００２４】
図３はこの発明の実施の形態３による通信システムの構成を示す図である。図１，図２と同一符号は同一または相当する構成を示している。図３において、１４は翻訳部である。
【００２５】
次に動作について説明する。
まず実施の形態２と同様に、通話者言語識別部１３は、ネットワーク１０を介して通信端末装置２０から音声データが送信されると、通信端末装置２０の通話者が使用している言語を識別する。次に、識別した通話者の言語に対応する音声認識部１１−１〜１１−Ｎの一つを通話者言語識別部１３が選択して音声データを与えると、選択された音声認識部１１−１〜１１−Ｎは音声データを音声認識する。
【００２６】
そして、テキストデータ生成部１２−１〜１２−Ｎは、通話者の使用している言語（指定言語）または予め設定した言語（指定言語）に音声認識結果を翻訳する翻訳部１４を通じてテキストデータを生成し、通信端末装置２０へリアルタイム配信する。以下、実施の形態１と同様に、通信端末装置２０で受信したテキストデータが表示部２１に表示される。このように、翻訳部１４で指定言語に翻訳したテキストデータを配信することで、言語の異なる者どうしの通話内容も、通信端末装置２０でテキストデータとしてリアルタイムに確認できる。
【００２７】
なお、指定言語の認証機能を翻訳部１４に設けるようにし、通信端末装置２０毎に用いる指定言語を予め特定しておき、通信開始時に、翻訳部１４の認証機能によってその通信端末装置２０の指定言語を特定するようにしても良く、通話者の指定言語の設定を省くことができる。
【００２８】
以上のように、この実施の形態３によれば、音声認識部１１−１〜１１−Ｎの音声認識結果を指定言語に翻訳してテキストデータ生成部１２−１〜１２−Ｎへ与える翻訳部１４をネットワーク１０に備えるようにしたので、言語の異なる通話者どうしが音声通話を行う場合でも、テキストデータとして通話内容を確認できるという効果が得られる。
【００２９】
実施の形態４．
実施の形態１〜実施の形態３では、通話内容のテキストデータをリアルタイム配信するようにしたが、通話内容をテキストデータとしてネットワークに蓄積しておき、通話終了後に配信しても良い。
【００３０】
図４はこの発明の実施の形態４による通信システムの構成を示す図である。図１と同一符号は同一または相当する構成を示している。図４において、１５はアドレス帳、１６は蓄積部である。
【００３１】
次に動作について説明する。
通話開始時に、通信端末装置２０からネットワーク１０のアドレス帳１５へメール配信先のアドレスを指定しておく。蓄積部１６は、通話中に逐次得られた通話内容のテキストデータを蓄積するものであり、通話終了後に、アドレス帳１５に指定されたメール配信先のアドレスへテキストデータのメール配信を行う。このようにすることで、通話終了後に、メール配信されたテキストデータを読み出して通話内容を確認でき、議事録として利用できる。
なお、この実施の形態４は、実施の形態１〜実施の形態３と組み合わせて通信システムを構築しても良いし、テキストデータをリアルタイム配信しない場合の通信システムに単独で適用しても良く、いずれの場合も、メール配信されたテキストデータは全通話内容の議事録として利用価値がある。
【００３２】
以上のように、この実施の形態４によれば、通信端末装置２０からメール配信先のアドレスが指定されるアドレス帳１５と、音声認識部１１，テキストデータ生成部１２によって通話中に得られた通話内容のテキストデータを蓄積するとともに、通話終了後に、アドレス帳１５に指定されたメール配信先のアドレスへテキストデータのメール配信を行う蓄積部１６とをネットワーク１０に備えるようにしたので、通話終了後に、メール配信されたテキストデータを読み出して通話内容を確認できるという効果が得られ、例えば議事録などに利用可能という効果が得られる。
【００３３】
【発明の効果】
以上のように、この発明によれば、通信端末装置から音声データを受信すると、音声データを音声認識する音声認識部と、音声認識結果からテキストデータを生成して通信端末装置へリアルタイム配信するテキストデータ生成部とをネットワークに備え、テキストデータ生成部から配信されたテキストデータを表示する表示部を通信端末装置が備えるとともに、通話内容の音声データを通信端末装置が音声認識部へ通話中に送信するようにしたので、通信端末装置に音声認識部を設けることなく、通話者は通話中に通話内容をリアルタイムで確認できるという効果が得られる。
【図面の簡単な説明】
【図１】この発明の実施の形態１による通信システムの構成を示す図である。
【図２】この発明の実施の形態２による通信システムの構成を示す図である。
【図３】この発明の実施の形態３による通信システムの構成を示す図である。
【図４】この発明の実施の形態４による通信システムの構成を示す図である。
【図５】従来の通信システムの構成を示す図である。
【符号の説明】
１０　ネットワーク、１１，１１−１〜１１−Ｎ　音声認識部、１２，１２−１〜１２−Ｎ　テキストデータ生成部、１３　通話者言語識別部、１４　翻訳部、１５　アドレス帳、１６　蓄積部、２０　通信端末装置、２１　表示部。
[0001]
[Technical field of the invention]
The present invention relates to a communication system in which a communication terminal device communicates with another via a network, and more particularly to a communication system for confirming the contents of a call in the communication terminal device.
[0002]
[Prior art]
FIG. 5 is a diagram showing a configuration of a conventional communication system.
In FIG. 5, reference numeral 110 denotes the network of a conventional communication system, and 120 denotes a communication terminal device of the conventional communication system. The communication terminal device 120 has a built-in memory unit 122. Further, for example, the communication terminal device 120 disclosed in Japanese Patent Application Laid-Open No. 2001-111686 is provided with a voice recognition unit 121 in addition to the memory unit 122.
[0003]
Next, the operation will be described.
While on a call with another communication terminal device (not shown) via the network 110, the communication terminal device 120 stores the contents of the call as voice data in the memory unit 122 as necessary. After the call ends, the caller of the communication terminal device 120 can read the voice data from the memory unit 122 and check the saved call content.
[0004]
In the case of the communication terminal device 120 including both the memory unit 122 and the voice recognition unit 121, the voice data of the call content is recognized as character data by the voice recognition unit 121, and the character data is stored in the memory unit 122. Since the voice data transmitted and received by the communication terminal device 120 is converted into the corresponding character data and then stored and displayed, the caller can not only confirm the content of the call after the call ends but also grasp the content of the call in real time during the call. Further, the stored character data can be transmitted to a wireless base station (not shown) and stored in a server of the wireless base station.
[0005]
[Problems to be solved by the invention]
Since the conventional communication system is configured as described above, the contents of the call are stored in the memory unit as voice data, so that there is a problem that the contents of the call cannot be confirmed during the call.
[0006]
Further, since the conventional communication system performs voice recognition on the voice data of the call content and stores it as character data, a voice recognition unit must be provided in the communication terminal device, which poses the problem that the communication terminal device becomes large.
[0007]
The present invention has been made to solve the above-described problem, and has as its object to provide a communication system capable of confirming the contents of a call in real time without providing a communication terminal device with a voice recognition unit.
[0008]
[Means for Solving the Problems]
A communication system according to the present invention includes, in the network, a voice recognition unit that, upon receiving voice data from a communication terminal device, performs voice recognition on the voice data, and a text data generation unit that generates text data from the voice recognition result and distributes the text data to the communication terminal device in real time. The communication terminal device includes a display unit that displays the text data distributed from the text data generation unit, and transmits the voice data of the call content to the voice recognition unit during the call.
[0009]
A communication system according to the present invention includes, in the network: a caller language identification unit that, when voice data is transmitted from the communication terminal device, identifies the language used by the caller of the communication terminal device; a plurality of voice recognition units, one of which is selected according to the identified language of the caller and performs voice recognition on the voice data from the caller language identification unit; and a plurality of text data generation units provided in correspondence with the plurality of voice recognition units, each of which generates text data from the voice recognition result of its voice recognition unit and distributes the text data to the communication terminal device in real time.
[0010]
A communication system according to the present invention is provided with a translation unit in a network, which translates a speech recognition result of a speech recognition unit into a designated language and provides the result to a text data generation unit.
[0011]
A communication system according to the present invention includes, in the network: an address book in which the address of a mail delivery destination is specified from the communication terminal device; a voice recognition unit that, upon receiving voice data from the communication terminal device, performs voice recognition on the voice data; a text data generation unit that generates text data from the voice recognition result; and a storage unit that stores the text data and, after the call ends, delivers the text data by mail to the mail delivery destination address specified in the address book. The communication terminal device includes a display unit that displays the text data distributed from the storage unit, and transmits the voice data of the call content to the voice recognition unit during the call.
[0012]
[Embodiments of the invention]
Hereinafter, an embodiment of the present invention will be described.
Embodiment 1.
FIG. 1 is a diagram showing a configuration of a communication system according to Embodiment 1 of the present invention.
In FIG. 1, reference numeral 10 denotes a network of the communication system according to the first embodiment, and reference numeral 20 denotes a communication terminal device of the communication system according to the first embodiment. The network 10 includes a voice recognition unit 11 and a text data generation unit 12, and the communication terminal device 20 includes a display unit 21.
[0013]
Next, the operation will be described.
In FIG. 1, while the communication terminal device 20 and another communication terminal device (not shown) are on a call via the network 10, the voice data of the call content (both uplink and downlink) is transmitted to the voice recognition unit 11 via the network 10. The voice recognition unit 11 performs voice recognition on the voice data received from the communication terminal device 20 or the other communication terminal device (not shown) and sends the result to the text data generation unit 12. The text data generation unit 12 generates text data, that is, character data, from the voice recognition result of the voice recognition unit 11.
[0014]
The network 10 distributes the text data generated by the text data generation unit 12 to the communication terminal device 20 on the caller side in real time, using a communication channel different from the communication channel for the voice data. Because the text data received via one of the two communication channels is displayed on the display unit 21 of the communication terminal device 20, the caller of the communication terminal device 20 can check the contents of the call in real time during the call.
[0015]
Here, the communication terminal device 20 communicates with the network 10 over two communication channels: one for voice data and the other for text data. For example, when the network 10 is a wireless communication system, a plurality of transport channels (DCH) of the W-CDMA system are used to communicate voice data and text data in real time. When the network 10 is a wired communication system, two ISDN B channels are used as the communication channels, with voice data communicated in real time over one B channel and text data over the other.
[0016]
As described above, the communication terminal device 20 uses the communication channel for voice data to conduct a voice call with another communication terminal device and, at the same time, receives the text data of the call content over the communication channel for text data, distributed in real time from the network 10 by the text data generation unit 12 during the call. Since the communication terminal device 20 displays the received text data on the display unit 21, the user can check the contents of the call in real time during the call without a voice recognition unit being provided in the communication terminal device 20. In addition, since the contents of a call can be confirmed as text data in real time, the system can be used by hearing-impaired persons and in noisy environments.
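The flow of Embodiment 1 can be sketched in Python as follows. This is only an illustrative model under stated assumptions, not the implementation of the invention: the class names `Terminal` and `Network`, the method names, and the `fake_recognizer` stand-in (which simply decodes bytes as words) are all hypothetical, and the two communication channels are modeled as two separate method calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Terminal:
    """Communication terminal device 20: it displays text but never recognizes speech."""
    displayed: List[str] = field(default_factory=list)

    def on_text_channel(self, text: str) -> None:
        # Display unit 21: show text data received on the text-data channel.
        self.displayed.append(text)


class Network:
    """Network 10 hosting voice recognition unit 11 and text data generation unit 12."""

    def __init__(self, recognize: Callable[[bytes], List[str]]) -> None:
        # Stand-in for voice recognition unit 11 (assumption: returns a word list).
        self.recognize = recognize

    def generate_text(self, words: List[str]) -> str:
        # Text data generation unit 12: turn the recognition result into text data.
        return " ".join(words)

    def on_voice_channel(self, frame: bytes, listener: Terminal) -> None:
        # Voice data arrives on the voice channel; the resulting text is sent back
        # on the separate text channel while the call is still in progress.
        words = self.recognize(frame)
        if words:
            listener.on_text_channel(self.generate_text(words))


def fake_recognizer(frame: bytes) -> List[str]:
    # Hypothetical recognizer: treats the frame bytes as already-transcribed words.
    return frame.decode("utf-8").split()
```

In this sketch the terminal holds no recognition logic at all, which mirrors the stated aim of keeping the voice recognition unit out of the communication terminal device.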
[0017]
As described above, according to the first embodiment, the communication terminal device 20 includes the display unit 21 that displays the text data distributed via the network 10, conducts a call with another communication terminal device via the network 10, and transmits the voice data of the call content to the voice recognition unit 11 via the network 10 during the call. The voice recognition unit 11 performs voice recognition on the voice data received from the communication terminal device 20, and the text data generation unit 12 generates text data from the voice recognition result of the voice recognition unit 11 and distributes the text data to the communication terminal device 20 in real time via the network 10. Therefore, the caller can check the contents of the call in real time during the call without a voice recognition unit being provided in the communication terminal device 20. In addition, use by hearing-impaired persons and use in noisy environments become possible.
[0018]
Embodiment 2.
In the first embodiment, the text data of the call content is distributed in real time during the call, but the text data may be converted to text data corresponding to the language of the caller.
[0019]
FIG. 2 is a diagram showing a configuration of a communication system according to Embodiment 2 of the present invention. The same reference numerals as those in FIG. 1 indicate the same or corresponding components. In FIG. 2, reference numeral 13 denotes a caller language identification unit, 11-1 to 11-N denote voice recognition units for the respective languages, and 12-1 to 12-N denote text data generation units corresponding to the voice recognition units 11-1 to 11-N, respectively.
[0020]
Next, the operation will be described.
When voice data is transmitted from the communication terminal device 20 via the network 10, the caller language identification unit 13 identifies the language used by the caller of the communication terminal device 20. Here, the caller language identification unit 13 identifies the caller's language from, for example, the telephone number or authentication information of the caller's communication terminal device 20. Next, the caller language identification unit 13 selects the one of the voice recognition units 11-1 to 11-N that corresponds to the identified language and passes the voice data to it, and the selected voice recognition unit performs voice recognition on the voice data.
[0021]
The voice recognition units 11-1 to 11-N for the respective languages are connected to the text data generation units 12-1 to 12-N for the corresponding languages. Each of the text data generation units 12-1 to 12-N generates text data in the corresponding language from the voice recognition result of its voice recognition unit 11-1 to 11-N and distributes the text data to the communication terminal device 20 in real time. Thereafter, as in the first embodiment, the text data received by the communication terminal device 20 is displayed on the display unit 21. Providing the caller language identification unit 13, the voice recognition units 11-1 to 11-N for the respective languages, and the text data generation units 12-1 to 12-N makes it possible to generate text data for a variety of caller languages, so that the caller can confirm the contents of the call in real time during the call.
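The selection step of Embodiment 2 can be illustrated with a small Python sketch. The description only says that the language is identified from, for example, the telephone number or authentication information; the use of a country calling code prefix, the `PREFIX_TO_LANGUAGE` table, the default language, and the toy recognizers below are all assumptions introduced for illustration.

```python
from typing import Callable, Dict

Recognizer = Callable[[bytes], str]

# Hypothetical mapping from a calling-code prefix to a language (assumption).
PREFIX_TO_LANGUAGE: Dict[str, str] = {"+81": "ja", "+1": "en", "+33": "fr"}


def identify_language(phone_number: str, default: str = "en") -> str:
    """Sketch of caller language identification unit 13: longest matching prefix wins."""
    matches = [p for p in PREFIX_TO_LANGUAGE if phone_number.startswith(p)]
    if not matches:
        return default
    return PREFIX_TO_LANGUAGE[max(matches, key=len)]


def make_recognizer(language: str) -> Recognizer:
    # Stand-in for a voice recognition unit 11-k paired with its
    # text data generation unit 12-k; it only tags the decoded bytes.
    def recognize(voice: bytes) -> str:
        return f"[{language}] {voice.decode('utf-8')}"
    return recognize


# One recognizer/generator pair per supported language.
RECOGNIZERS: Dict[str, Recognizer] = {
    lang: make_recognizer(lang) for lang in ("ja", "en", "fr")
}


def transcribe(phone_number: str, voice: bytes) -> str:
    # Identify the caller's language, then dispatch to the matching pair.
    language = identify_language(phone_number)
    return RECOGNIZERS[language](voice)
```

A table lookup keeps the identification step independent of the audio itself, which is consistent with identifying the language from subscriber information rather than from the speech signal.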
[0022]
As described above, the second embodiment provides: the caller language identification unit 13, which identifies the language used by the caller of the communication terminal device 20 when voice data is transmitted from the communication terminal device 20 via the network 10; the voice recognition units 11-1 to 11-N, one of which is selected according to the identified language of the caller and performs voice recognition on the voice data from the caller language identification unit 13; and the text data generation units 12-1 to 12-N, which are provided in correspondence with the voice recognition units 11-1 to 11-N, generate text data from the voice recognition results of the voice recognition units 11-1 to 11-N, and distribute the text data to the communication terminal device 20 in real time. Therefore, text data can be generated for a variety of caller languages, and the caller can check the contents of the call in real time during the call without a voice recognition unit being provided in the communication terminal device 20.
[0023]
Embodiment 3.
In the second embodiment, the language used by the caller is identified, and the text data is generated and distributed in that language. Alternatively, the language in which the text data is to be generated may be specified, and the recognition result translated into that language. This makes it possible to distribute text data even between callers who speak different languages, independently of the caller's language.
[0024]
FIG. 3 is a diagram showing a configuration of a communication system according to Embodiment 3 of the present invention. The same reference numerals as those in FIGS. 1 and 2 indicate the same or corresponding components. In FIG. 3, reference numeral 14 denotes a translation unit.
[0025]
Next, the operation will be described.
First, as in the second embodiment, when voice data is transmitted from the communication terminal device 20 via the network 10, the caller language identification unit 13 identifies the language used by the caller of the communication terminal device 20. Next, the caller language identification unit 13 selects the one of the voice recognition units 11-1 to 11-N that corresponds to the identified language and passes the voice data to it, and the selected voice recognition unit performs voice recognition on the voice data.
[0026]
The text data generation units 12-1 to 12-N then generate text data via the translation unit 14, which translates the voice recognition result into the language used by the caller (designated language) or a preset language (designated language), and distribute the text data to the communication terminal device 20 in real time. Thereafter, as in the first embodiment, the text data received by the communication terminal device 20 is displayed on the display unit 21. By distributing text data translated into the designated language by the translation unit 14 in this way, the contents of a call between persons who speak different languages can also be confirmed in real time as text data on the communication terminal device 20.
[0027]
Note that the translation unit 14 may be provided with a designated-language authentication function: the designated language for each communication terminal device 20 is specified in advance, and at the start of communication the authentication function of the translation unit 14 determines the designated language of that communication terminal device 20. This spares the caller from having to set the designated language.
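The role of the translation unit 14 and the pre-registered designated language can be sketched in Python as follows. This is a toy model under stated assumptions: the terminal identifiers, the `DESIGNATED_LANGUAGE` registry, and the `PHRASES` lookup table that stands in for real translation are all hypothetical, and unknown phrases are simply passed through unchanged.

```python
from typing import Dict, Tuple

# Hypothetical registry: designated language registered in advance per terminal.
DESIGNATED_LANGUAGE: Dict[str, str] = {"terminal-A": "ja", "terminal-B": "en"}

# Toy phrase table standing in for translation unit 14 (assumption).
PHRASES: Dict[Tuple[str, str, str], str] = {
    ("en", "ja", "hello"): "こんにちは",
    ("ja", "en", "こんにちは"): "hello",
}


def translate(text: str, source: str, target: str) -> str:
    # No translation is needed when the spoken and designated languages match.
    if source == target:
        return text
    # Phrases missing from the toy table pass through unchanged in this sketch.
    return PHRASES.get((source, target, text), text)


def text_for_terminal(recognized: str, spoken_language: str, terminal_id: str) -> str:
    # Look up the designated language at the start of communication; fall back
    # to the spoken language when the terminal has no registration.
    target = DESIGNATED_LANGUAGE.get(terminal_id, spoken_language)
    return translate(recognized, spoken_language, target)
```

Because the designated language is looked up from the registry, the caller does not have to choose a language for each call, which corresponds to the convenience described for the authentication function.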
[0028]
As described above, according to the third embodiment, the translation unit 14, which translates the voice recognition results of the voice recognition units 11-1 to 11-N into a designated language and provides the result to the text data generation units 12-1 to 12-N, is provided in the network 10. Therefore, even when callers who speak different languages conduct a voice call, the contents of the call can be confirmed as text data.
[0029]
Embodiment 4.
In the first to third embodiments, the text data of the call content is distributed in real time. However, the call content may be stored in the network as text data and distributed after the call is completed.
[0030]
FIG. 4 is a diagram showing a configuration of a communication system according to Embodiment 4 of the present invention. The same reference numerals as those in FIG. 1 indicate the same or corresponding components. In FIG. 4, 15 is an address book, and 16 is a storage unit.
[0031]
Next, the operation will be described.
At the start of a call, an address of a mail delivery destination is specified from the communication terminal device 20 to the address book 15 of the network 10. The accumulating unit 16 accumulates text data of the contents of a call sequentially obtained during the call, and performs mail distribution of the text data to a mail distribution destination address specified in the address book 15 after the call is completed. By doing so, after the call is over, the text data delivered by e-mail can be read to confirm the contents of the call, and can be used as minutes.
The fourth embodiment may be combined with the first to third embodiments to construct a communication system, or may be independently applied to a communication system in which text data is not distributed in real time. In either case, the text data delivered by e-mail is useful as the minutes of all call contents.
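The accumulate-then-mail behavior of Embodiment 4 can be sketched in Python using the standard library's `email.message.EmailMessage`. The `StorageUnit` class, its method names, the sender address, and the subject line are assumptions made for illustration; the sketch only builds the message and leaves actual delivery (for example over SMTP) out of scope.

```python
from email.message import EmailMessage
from typing import List, Optional


class StorageUnit:
    """Sketch of storage unit 16: accumulates text during a call, mails it afterwards."""

    def __init__(self, sender: str) -> None:
        self.sender = sender              # hypothetical network-side sender address
        self.recipients: List[str] = []   # destinations held in address book 15
        self.lines: List[str] = []        # text data accumulated during the call

    def start_call(self, recipients: List[str]) -> None:
        # The terminal specifies the mail delivery destinations at the start of a call.
        self.recipients = list(recipients)
        self.lines = []

    def append(self, text: str) -> None:
        # Called each time the text data generation unit produces text.
        self.lines.append(text)

    def end_call(self) -> Optional[EmailMessage]:
        # After the call ends, build one mail containing the whole transcript.
        if not self.recipients:
            return None
        message = EmailMessage()
        message["From"] = self.sender
        message["To"] = ", ".join(self.recipients)
        message["Subject"] = "Call transcript"
        message.set_content("\n".join(self.lines))
        return message
```

Keeping the accumulated text on the network side means the transcript can serve as minutes even when real-time distribution to the terminal is not used.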
[0032]
As described above, according to the fourth embodiment, the network 10 includes the address book 15, in which the address of the mail delivery destination is specified from the communication terminal device 20, and the storage unit 16, which stores the text data of the call content obtained during the call by the voice recognition unit 11 and the text data generation unit 12 and, after the call ends, delivers the text data by mail to the mail delivery destination address specified in the address book 15. Therefore, after the call ends, the text data delivered by mail can be read to confirm the contents of the call, and the text data can be used, for example, as minutes.
[0033]
[Effects of the invention]
As described above, according to the present invention, the network includes a voice recognition unit that, upon receiving voice data from a communication terminal device, performs voice recognition on the voice data, and a text data generation unit that generates text data from the voice recognition result and distributes the text data to the communication terminal device in real time. The communication terminal device includes a display unit that displays the text data distributed from the text data generation unit, and transmits the voice data of the call content to the voice recognition unit during the call. Therefore, the caller can check the contents of the call in real time during the call without a voice recognition unit being provided in the communication terminal device.
[Brief description of the drawings]
FIG. 1 is a diagram showing a configuration of a communication system according to a first embodiment of the present invention.
FIG. 2 is a diagram showing a configuration of a communication system according to a second embodiment of the present invention.
FIG. 3 is a diagram showing a configuration of a communication system according to a third embodiment of the present invention.
FIG. 4 is a diagram showing a configuration of a communication system according to a fourth embodiment of the present invention.
FIG. 5 is a diagram showing a configuration of a conventional communication system.
[Explanation of symbols]
10 network; 11, 11-1 to 11-N voice recognition unit; 12, 12-1 to 12-N text data generation unit; 13 caller language identification unit; 14 translation unit; 15 address book; 16 storage unit; 20 communication terminal device; 21 display unit.

Claims

1. A communication system including a network that mediates communication and a communication terminal device that communicates with others via the network, wherein
the network includes a voice recognition unit that, upon receiving voice data from the communication terminal device, performs voice recognition on the voice data, and a text data generation unit that generates text data from the voice recognition result and distributes the text data to the communication terminal device in real time, and
the communication terminal device includes a display unit that displays the text data distributed from the text data generation unit, and transmits the voice data of the call content to the voice recognition unit during a call.

2. The communication system according to claim 1, wherein the network includes: a caller language identification unit that, when voice data is transmitted from the communication terminal device, identifies the language used by the caller of the communication terminal device;
a plurality of voice recognition units, one of which is selected according to the identified language of the caller and performs voice recognition on the voice data from the caller language identification unit; and
a plurality of text data generation units provided in correspondence with the plurality of voice recognition units, each of which generates text data from the voice recognition result of its voice recognition unit and distributes the text data to the communication terminal device in real time.

3. The communication system according to claim 2, wherein the network includes a translation unit that translates the voice recognition result of the voice recognition unit into a designated language and provides the result to the text data generation unit.

4. A communication system including a network that mediates communication and a communication terminal device that communicates with others via the network, wherein
the network includes an address book in which the address of a mail delivery destination is specified from the communication terminal device, a voice recognition unit that, upon receiving voice data from the communication terminal device, performs voice recognition on the voice data, a text data generation unit that generates text data from the voice recognition result, and a storage unit that stores the text data and, after the call ends, delivers the text data by mail to the mail delivery destination address specified in the address book, and
the communication terminal device includes a display unit that displays the text data distributed from the storage unit, and transmits the voice data of the call content to the voice recognition unit during a call.