JP3825589B2

JP3825589B2 - Multimedia terminal equipment

Info

Publication number: JP3825589B2
Application number: JP27244199A
Authority: JP
Inventors: 達雄古賀; 敦郎西垣
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 1999-09-27
Filing date: 1999-09-27
Publication date: 2006-09-27
Anticipated expiration: 2019-09-27
Also published as: JP2001094965A

Description

【０００１】
【発明の属する技術分野】
本発明は、複数種類のコンテンツを受信するマルチメディア端末機器に関するもので、特に、デジタル放送を受信し、又、インターネットへの接続が可能であるマルチメディア端末機器に関する。
【０００２】
【従来の技術】
近年、デジタル技術が発達し、パーソナルコンピュータなどのマルチメディア端末機器において、受信したデジタル放送などの映像を、そのモニタ上に再生することが可能なマルチメディア端末機器が提案されている。このようなマルチメディア端末機器は、一台で、コンピュータ端末という役割とＡＶ（Audio Visual）機器としての役割とを持ちあわせる。そのため、従来のように、インターネットを行うためのコンピュータ端末と映像放送を受信するためのＡＶ機器とを別々に所有する必要がなくなる。
【０００３】
【発明が解決しようとする課題】
しかしながら、このようなマルチメディア端末機器において、音声を出力するスピーカシステムが１つしかない場合、デジタルテレビ番組やホームページなどの音声信号及び映像信号よりなる複数種類のコンテンツを受信した際、それぞれのコンテンツによる音声を同時に出力することができない。又、このような複数のコンテンツから１つの音声を使用者が、そのコンテンツの番組内容より判断し選択する必要がある。さらに、このスピーカシステムより複数のコンテンツの音声を混合して出力した場合、台詞などの話し言葉が聞き取れないという問題がある。
【０００４】
上記のような問題を鑑みて、本発明は、複数種類のコンテンツを受信するとともに、この複数のコンテンツの音声信号から１つを自動的に選択して出力することが可能なマルチメディア端末機器を提供することを目的とする。
【０００５】
【課題を解決するための手段】
上記の目的を達成するため、請求項１に記載のマルチメディア端末機器は、音声信号及び映像信号よりなる複数種類のコンテンツをそれぞれ復号化する複数の復号化手段と、前記各コンテンツの映像信号より映像を表示する表示手段とを有するマルチメディア端末機器において、前記コンテンツの音声信号より音声を出力する音声出力手段と、前記複数種類のコンテンツの音声信号より、前記音声出力手段より出力する音声信号を１つ選択する選択手段と、
前記各コンテンツの番組情報に含まれる番組ジャンルに対応して、これらの各コンテンツの音声信号に優先度を与える番組情報判定手段と、を有し、
前記複数の復号化手段で復号化された前記複数種類のコンテンツの映像信号より複数の映像を前記表示手段で同時に表示するとともに、前記選択手段において、前記複数の復号化手段でそれぞれ復号化された前記各コンテンツの音声信号から、前記番組情報判定手段で与えられた優先度が高い音声信号を選択し、前記音声出力手段で音声として出力することを特徴とする。
【０００６】
このような構成のマルチメディア端末機器において、複数の復号化手段で復号される複数種類のコンテンツが同時に処理される際に、番組情報判定手段において、前記コンテンツの番組情報に含まれる番組ジャンルに応じて、そのコンテンツの音声信号に優先度を与え、選択手段において、前記複数の復号化手段でそれぞれ復号化された前記各コンテンツの音声信号から前記番組情報判定手段で与えられた優先度の高い音声信号を選択して音声出力手段から出力することができる。
【０００７】
又、請求項２に記載するように、番組情報判定手段が、前記コンテンツの番組情報に含まれる番組ジャンルが音楽番組であると判断するとき、そのコンテンツの音声信号に高い優先度を与えるようにすることによって、前記選択手段において、前記複数の復号化手段でそれぞれ復号化された前記各コンテンツの音声信号から番組ジャンルが音楽番組であるコンテンツの音声信号を選択して音声出力手段から出力することができる。
【００１４】
【発明の実施の形態】
本発明の実施形態について、図面を参照して説明する。図１は、本発明のマルチメディア端末機器の内部構造を示すブロック図である。又、図２は、図１のマルチメディア端末機器の動作を示すフローチャートである。
【００１５】
図１に示すマルチメディア端末機器１は、デジタル放送信号を受信するアンテナ２と、アンテナ２から送信される放送信号を復調するとともにこの放送信号を多重分離してユーザーが選択したチャンネルのコンテンツやＥＰＧ（Electronic Program Guide）情報を獲得するチューナ３と、インターネットなどのネットワーク４を介してユーザーの選択したホームページなどのコンテンツを獲得するネットワーク制御手段５とを備えている。
【００１６】
又、マルチメディア端末機器１は、チューナ３で獲得したコンテンツより映像信号と音声信号を復号する復号手段６と、ネットワーク制御手段５で獲得したコンテンツより映像信号と音声信号を復号する復号手段７と、復号手段６，７で復号された映像信号に基づいて映像を表示する表示装置８とを有する。この表示装置８において、各コンテンツを表示するためのウィンドウが開かれ、そのウィンドウ上に各コンテンツの映像を表示することによって、同時に表示をすることができる。
【００１７】
又、マルチメディア端末機器１は、復号手段６，７で復号された音声信号のうち一方の音声信号を音声を再生するためのスピーカ１１に送出するとともに他方の音声信号を音声を文字に変換する音声／文字変換手段１０に送出する音声信号選択手段９と、音声信号を文字信号に変換したテキストファイルを作成するとともにこのテキストファイルを表示装置８に与える音声／文字変換手段１０と、スピーカ１１とを備える。更に、音声信号選択手段９を制御するための制御信号を与える番組ジャンル取得手段１２、音声性質分析手段１３、及びユーザー指示取得手段１４と、ユーザーがマルチメディア端末機器１を操作するためのキーボードやマウスのような入力手段１５とが設けられる。
【００１８】
この番組ジャンル取得手段１２は、チューナ３で得たＥＰＧ情報やネットワーク制御手段５で獲得した番組ガイドサイトよりユーザーの選択したチャンネルのコンテンツの番組ジャンルを取得し、各番組ジャンルの優先度とともに記憶する。又、音声性質分析手段１３は、復号手段６，７で復号した音声信号において、その音声信号に無音部分となる低いレベルの信号が多く含まれているか否かが判断され、この無音部分が多い音声信号は話言葉的な音声信号と判断し、その優先度を下げる。逆に、この無音部分となる途切れが少ない音声信号は音楽的な音声信号と判断して優先度を上げる。更に、ユーザー指示取得手段１４では、ユーザーによって入力手段１５で入力された各コンテンツの出力形式が記憶される。
【００１９】
即ち、番組ジャンル取得手段１２、音声性質分析手段１３、及びユーザー指示取得手段１４において、以下のように、各コンテンツの音声信号の優先度及び出力形式が決定する。番組ジャンルによる優先度が、例えば、図３のように、音楽番組は”３”、バラエティ番組及びドラマは”２”、スポーツ及び天気予報は”１”、ニュースは”０”として、番組ジャンル取得手段１２に記憶される。
【００２０】
尚、この優先度は、数字が大きい番組ジャンルほど、スピーカ１１に優先的に出力されるものである。又、番組ジャンル取得手段１２は、定期的にアンテナ２より受信したＥＰＧ情報やネットワーク４を介して番組ガイドサイトに接続して獲得した番組情報を記憶することによって、各コンテンツの番組ジャンルを予め記憶する。又、ネットワーク４を介して受信したコンテンツに付随した番組情報となるデータ信号よりその番組ジャンルを識別することができる。
【００２１】
又、音声性質分析手段１３では、復号手段６，７から送出される音声信号のレベルが閾値レベルより低い音声信号の割合が計測され、この閾値レベルの低いレベルの音声信号の割合が多いとき、無音部分が多いと判断し、話言葉的音声信号として、この音声信号の優先度を下げる。又、逆に、閾値レベルの低いレベルの音声信号の割合が少ない音声信号は、途切れが少ないと判断し、音楽的音声信号として、この音声信号の優先度を上げる。更に、ユーザーが入力キーなどの入力手段１５を操作することによって、図４のように、例えば、コンテンツＡは“スピーカ出力”、コンテンツＢは“テキスト出力”、コンテンツＣは“自動”として、各コンテンツの出力形式がユーザー指示取得手段１４に記憶される。
【００２２】
以下に、図２のフローチャートを参照して、マルチメディア端末機器１の動作について説明する。今、ネットワーク制御手段５によってネットワーク４から受信したコンテンツａの映像を表示装置８のウィンドウに表示するとともに、音声がスピーカ１１から出力されているときに、アンテナ２を介してチューナ３によってデジタル放送のコンテンツｂを受信を開始したものとする（ＳＴＥＰ１）。まず、ユーザー指示取得手段１４によって、コンテンツａの出力形式が判断される（ＳＴＥＰ２及びＳＴＥＰ３）。このとき、コンテンツａの出力形式が“自動”でない（Ｎｏ）ときはＳＴＥＰ３に移行し、又、コンテンツａの出力形式が“自動”（Ｙｅｓ）のときはＳＴＥＰ４に移行する（ＳＴＥＰ２）。
【００２３】
ＳＴＥＰ３では、コンテンツａの出力形式が、“スピーカ出力”か否かが判別される。このとき、コンテンツａの出力形式が“スピーカ出力”（Ｙｅｓ）のときは、ＳＴＥＰ９でコンテンツａをスピーカ１１より出力するとともに、コンテンツｂを音声／文字変換手段１０で文字変換してテキストファイルとし、このテキストファイルに基づいて表示装置８にその文字列を表示する。又、コンテンツａの出力形式が“テキスト出力”（Ｎｏ）のときは逆に、ＳＴＥＰ１０でコンテンツｂをスピーカ１１より出力するとともに、コンテンツａを音声／文字変換手段１０で文字変換してテキストファイルとし、このテキストファイルに基づいて表示装置８にその文字列を表示する。
【００２４】
次に、ユーザー指示取得手段１４によって、コンテンツｂの出力形式が判断される（ＳＴＥＰ４及びＳＴＥＰ５）。このとき、コンテンツｂの出力形式が“自動”でない（Ｎｏ）ときはＳＴＥＰ５に移行し、又、コンテンツｂの出力形式が“自動”（Ｙｅｓ）の時はＳＴＥＰ６に移行する（ＳＴＥＰ４）。
【００２５】
ＳＴＥＰ５では、コンテンツｂの出力形式が、“スピーカ出力”か否かが判別される。このとき、コンテンツｂの出力形式が“スピーカ出力”（Ｙｅｓ）のときは、ＳＴＥＰ１０でコンテンツｂをスピーカ１１より出力するとともに、コンテンツａを音声／文字変換手段１０で文字変換してテキストファイルとし、このテキストファイルに基づいて表示装置８にその文字列を表示する。又、コンテンツｂの出力形式が“テキスト出力”（Ｎｏ）のときは逆に、ＳＴＥＰ９でコンテンツａをスピーカ１１より出力するとともに、コンテンツｂを音声／文字変換手段１０で文字変換してテキストファイルとし、このテキストファイルに基づいて表示装置８にその文字列を表示する。
【００２６】
ＳＴＥＰ６では、番組ジャンル取得手段１２によって、予め記憶したＥＰＧ情報からコンテンツｂの番組ジャンルを判別するとともに、予め記憶した番組ガイドサイトの番組情報又はコンテンツａに付随したデータ信号からコンテンツａの番組ジャンルを判別する。そして、このコンテンツａ，ｂの番組ジャンルより、予め記憶した番組ジャンルの優先度を照合して、コンテンツａ，ｂの優先度を比較し、等しくない（Ｎｏ）ときはＳＴＥＰ７へ移行し、等しい（Ｙｅｓ）ときはＳＴＥＰ８に移行する。
【００２７】
ＳＴＥＰ７では、番組ジャンル取得手段１２で得たコンテンツａ，ｂの番組ジャンルによる優先度が比較され、コンテンツａの優先度が高い（Ｙｅｓ）とき、ＳＴＥＰ９でコンテンツａをスピーカ１１より出力するとともに、コンテンツｂを音声／文字変換手段１０で文字変換してテキストファイルとし、このテキストファイルに基づいて表示装置８にその文字列を表示する。又、コンテンツｂの優先度が高い（Ｎｏ）ときは逆に、ＳＴＥＰ１０でコンテンツｂをスピーカ１１より出力するとともに、コンテンツａを音声／文字変換手段１０で文字変換してテキストファイルとし、このテキストファイルに基づいて表示装置８にその文字列を表示する。
【００２８】
ＳＴＥＰ８では、音声性質分析手段１３で判定したコンテンツａ，ｂの音声信号の性質より得た優先度が比較され、コンテンツａの優先度が高い、即ち、コンテンツａの無音部分の比率が少ない（Ｙｅｓ）とき、ＳＴＥＰ９でコンテンツａをスピーカ１１より出力するとともに、コンテンツｂを音声／文字変換手段１０で文字変換してテキストファイルとし、このテキストファイルに基づいて表示装置８にその文字列を表示する。又、コンテンツｂの優先度が高い、即ち、コンテンツｂの無音部分の比率が少ない（Ｎｏ）ときは逆に、ＳＴＥＰ１０でコンテンツｂをスピーカ１１より出力するとともに、コンテンツａを音声／文字変換手段１０で文字変換してテキストファイルとし、このテキストファイルに基づいて表示装置８にその文字列を表示する。
【００２９】
ところで、図２のフローチャートにおいて、ＳＴＥＰ１で、アンテナ２を介してチューナによってデジタル放送のコンテンツｂを新たに受信を開始したものとしたが、この場合に限られるものでなく、コンテンツｂの映像及び音声を表示装置８及びスピーカ１１より出力している際に、コンテンツａを受信したときも上記と同様の動作を行う。又、コンテンツａ，ｂの番組ジャンルを獲得することができない場合には、ＳＴＥＰ６からＳＴＥＰ８に移行して、それぞれの音声の部分の比率による優先度より選択されるようにしても構わない。
【００３０】
更に、請求の範囲における無音部比率判定手段及び番組情報判定手段は、それぞれ、本実施形態における音声性質分析手段及び番組ジャンル取得手段に相当する。又、請求の範囲における操作手段は、本実施形態における入力手段及びユーザー指示取得手段に相当する。
【００３１】
尚、本実施形態のマルチメディア端末機器は、復号手段で復号された音声信号を用いて、その信号の信号レベルを検知したが、チューナ又はネットワーク制御手段から送出される信号のうち音声データフレーム内のスケールファクタなどよりその音声データの信号レベルの情報を獲得することによって、各コンテンツの音声信号の性質を音声性質分析手段で判定しても構わない。
【００３２】
又、本実施形態のマルチメディア端末機器は、入力手段によってユーザーが各コンテンツの出力形式を決定し、この指示された各コンテンツの出力形式をユーザー指示取得手段に記憶するようにしたが、例えば、ユーザーがマウスなどの入力手段によって、表示装置上の各コンテンツの映像を表示しているウィンドウを指示し、そして、その出力形式を入力することによって、現在処理されているコンテンツのうち１つのコンテンツの音声信号を選択してスピーカより出力するようにしても構わない。更に、このとき、自動的に、入力手段で指示されていない他のコンテンツの音声信号を上記したように、音声／文字変換手段で変換して、文字列として表示装置に表示するようにできるようにしても構わない。
【００３３】
又、本実施形態において、コンテンツをデジタル放送やインターネットなどから受信したコンテンツとしたが、このようなネットワークを介して受信して得られるコンテンツに限らず、例えば、磁気ディスクや光ディスクなどの記録媒体を再生することによって得ることのできるコンテンツでも構わない。このような記録媒体から得られるコンテンツは、その記録媒体に記録されたデータから音楽的なコンテンツか映画などが記録された台詞などの多い話言葉的な音声信号を持つコンテンツかを番組ジャンル取得手段で判断することができる。又、音声性質分析手段によっても、このコンテンツの音声信号の信号レベルを調べることで、この音声信号が音楽的なものか話言葉的なものか判定することができる。
【００３４】
更に、本実施形態のマルチメディア端末機器は、番組ジャンル取得手段、音声性質分析手段、及びユーザー指示取得手段の３つの手段を有したものとしたが、このようなマルチメディア端末機器に限定されるものでなく、番組ジャンル取得手段、音声性質分析手段、及びユーザー指示取得手段のうちいずれか１つの手段しか有していないようなマルチメディア端末機器でも構わない。又、番組ジャンル取得手段、音声性質分析手段、及びユーザー指示取得手段のうちいずれか２つの手段を有するようなマルチメディア端末機器としても構わない。
【００３５】
【発明の効果】
本発明のマルチメディア端末機器によると、映像信号及び音声信号よりなる複数のコンテンツを処理する際に、復号した音声信号を１つ選択して音声出力手段より出力するので、音声出力手段を複数必要としない。又、復号したコンテンツの数に応じた音声出力手段を有していなくても、その音声信号を混合して出力することなく有効に出力することができる。更に、選択手段で選択されない音声信号を文字信号に変換した後、文字列として表示手段に表示することができるので、音声出力手段より出力されなかった音声信号を使用者が表示手段の文字列より読みとって理解することができる。
【００３６】
又、無音部比率判定手段や番組情報判定手段を設けることによって、音楽的音声の多いコンテンツと話言葉的音声の多いコンテンツとを処理している際、自動的に、文字列として表現するのが困難な音楽的音声の多いコンテンツの音声信号を音声出力手段により出力するとともに、話言葉的音声の多いコンテンツの音声信号を文字列として表示手段に表示することができる。よって、使用者が、各コンテンツの音声信号を選択することから解放され、コンテンツの視聴に集中することができる。
【図面の簡単な説明】
【図１】本発明のマルチメディア端末機器の内部構造を示すブロック図。
【図２】図１のマルチメディア端末機器の動作を示すフローチャート。
【図３】番組ジャンルによる優先度の１例。
【図４】ユーザー指示による出力形式の１例。
【符号の説明】
１マルチメディア端末機器
２アンテナ
３チューナ
４ネットワーク
５ネットワーク制御手段
６，７復号手段
８表示装置
９音声信号選択手段
１０音声／文字変換手段
１１スピーカ
１２番組ジャンル取得手段
１３音声性質分析手段
１４ユーザー指示取得手段
１５入力手段

[0001]
[Technical Field of the Invention]
The present invention relates to a multimedia terminal device that receives a plurality of types of content, and more particularly to a multimedia terminal device that receives digital broadcasts and can be connected to the Internet.
[0002]
[Prior art]
In recent years, digital technology has advanced, and multimedia terminal devices, such as personal computers, that can play back received video such as digital broadcasts on their monitors have been proposed. A single such multimedia terminal device serves both as a computer terminal and as an AV (Audio Visual) device. It is therefore no longer necessary to own a computer terminal for using the Internet and an AV device for receiving video broadcasts separately, as was the case in the past.
[0003]
[Problems to be solved by the invention]
However, if such a multimedia terminal device has only one speaker system for outputting audio, then when it receives a plurality of types of content consisting of audio and video signals, such as digital TV programs and web pages, it cannot output the audio of each content simultaneously. The user must also judge from the program content which single audio to select from among the plurality of contents. Furthermore, if the audio of a plurality of contents is mixed and output from the speaker system, spoken words such as dialogue become unintelligible.
[0004]
In view of the above problems, an object of the present invention is to provide a multimedia terminal device that receives a plurality of types of content and can automatically select and output one of the audio signals of the plurality of contents.
[0005]
[Means for Solving the Problems]
In order to achieve the above object, the multimedia terminal device according to claim 1 is a multimedia terminal device having a plurality of decoding means that respectively decode a plurality of types of content each consisting of an audio signal and a video signal, and display means that displays video from the video signal of each content, the device comprising: audio output means that outputs audio from the audio signal of the content; selection means that selects, from the audio signals of the plurality of types of content, one audio signal to be output from the audio output means; and
program information determination means that gives a priority to the audio signal of each content according to the program genre included in the program information of that content,
wherein a plurality of videos from the video signals of the plurality of types of content decoded by the plurality of decoding means are displayed simultaneously on the display means, and the selection means selects, from the audio signals of the contents respectively decoded by the plurality of decoding means, the audio signal given a high priority by the program information determination means, and the audio output means outputs it as audio.
[0006]
In a multimedia terminal device with this configuration, when a plurality of types of content decoded by the plurality of decoding means are processed simultaneously, the program information determination means gives a priority to the audio signal of each content according to the program genre included in its program information, and the selection means can select, from the audio signals of the contents respectively decoded by the plurality of decoding means, the audio signal given a high priority by the program information determination means and output it from the audio output means.
[0007]
Further, as set forth in claim 2, when the program information determination means determines that the program genre included in the program information of a content is a music program, it gives a high priority to the audio signal of that content, so that the selection means can select, from the audio signals of the contents respectively decoded by the plurality of decoding means, the audio signal of the content whose program genre is a music program and output it from the audio output means.
[0014]
[Embodiments of the Invention]
An embodiment of the present invention will be described with reference to the drawings. FIG. 1 is a block diagram showing the internal structure of the multimedia terminal device of the present invention. FIG. 2 is a flowchart showing the operation of the multimedia terminal device of FIG. 1.
[0015]
The multimedia terminal device 1 shown in FIG. 1 includes an antenna 2 that receives digital broadcast signals; a tuner 3 that demodulates the broadcast signal sent from the antenna 2 and demultiplexes it to acquire the content of the channel selected by the user and EPG (Electronic Program Guide) information; and network control means 5 that acquires content selected by the user, such as a web page, via a network 4 such as the Internet.
[0016]
The multimedia terminal device 1 also includes decoding means 6 that decodes a video signal and an audio signal from the content acquired by the tuner 3, decoding means 7 that decodes a video signal and an audio signal from the content acquired by the network control means 5, and a display device 8 that displays video based on the video signals decoded by the decoding means 6 and 7. The display device 8 opens a window for each content and displays the video of each content in its window, so that the contents can be displayed simultaneously.
[0017]
The multimedia terminal device 1 further includes audio signal selection means 9, which sends one of the audio signals decoded by the decoding means 6 and 7 to a speaker 11 for reproducing audio and sends the other audio signal to voice/character conversion means 10 for converting audio into characters; the voice/character conversion means 10, which creates a text file by converting the audio signal into a character signal and supplies this text file to the display device 8; and the speaker 11. Also provided are program genre acquisition means 12, audio property analysis means 13, and user instruction acquisition means 14, which supply control signals for controlling the audio signal selection means 9, and input means 15, such as a keyboard or mouse, with which the user operates the multimedia terminal device 1.
[0018]
The program genre acquisition means 12 acquires the program genre of the content of the channel selected by the user from the EPG information obtained by the tuner 3 or from a program guide site accessed by the network control means 5, and stores it together with the priority of each program genre. The audio property analysis means 13 determines whether each audio signal decoded by the decoding means 6 and 7 contains many low-level portions, that is, silent parts. An audio signal with many silent parts is judged to be speech-like, and its priority is lowered. Conversely, an audio signal with few silent interruptions is judged to be music-like, and its priority is raised. The user instruction acquisition means 14 stores the output format of each content as entered by the user through the input means 15.
[0019]
That is, the program genre acquisition means 12, the audio property analysis means 13, and the user instruction acquisition means 14 determine the priority and output format of the audio signal of each content as follows. The priority by program genre is stored in the program genre acquisition means 12; for example, as shown in FIG. 3, music programs are “3”, variety programs and dramas are “2”, sports and weather forecasts are “1”, and news is “0”.
[0020]
The larger the number, the more preferentially the audio of that program genre is output to the speaker 11. The program genre acquisition means 12 stores the program genre of each content in advance by periodically storing the EPG information received via the antenna 2 and the program information obtained by connecting to a program guide site via the network 4. The program genre can also be identified from a data signal serving as program information that accompanies content received via the network 4.
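The genre priority lookup described above could be sketched as follows. This is a minimal illustration only: the table values are the example values given for FIG. 3, while the genre keys and the function names `genre_priority` and `pick_by_genre` are hypothetical and do not appear in the patent.

```python
# Hypothetical genre-to-priority table using the example values of FIG. 3.
# A larger number means the audio is preferred for speaker output.
GENRE_PRIORITY = {
    "music": 3,
    "variety": 2,
    "drama": 2,
    "sports": 1,
    "weather": 1,
    "news": 0,
}


def genre_priority(genre):
    """Return the stored priority for a genre, or None if the genre is unknown."""
    return GENRE_PRIORITY.get(genre)


def pick_by_genre(genre_a, genre_b):
    """Return 'a' or 'b' for the higher-priority content.

    Returns None when the priorities are equal or a genre is unknown,
    meaning the genre alone cannot decide.
    """
    pa, pb = genre_priority(genre_a), genre_priority(genre_b)
    if pa is None or pb is None or pa == pb:
        return None
    return "a" if pa > pb else "b"
```

For instance, a music program paired with a news program would resolve to the music program, while a drama paired with a variety program would be left undecided by genre.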
[0021]
The audio property analysis means 13 measures the proportion of each audio signal sent from the decoding means 6 and 7 whose level is below a threshold level. When the proportion of the signal below the threshold level is large, the signal is judged to contain many silent parts, is treated as a speech-like audio signal, and has its priority lowered. Conversely, an audio signal in which that proportion is small is judged to have few interruptions, is treated as a music-like audio signal, and has its priority raised. Further, when the user operates the input means 15, such as input keys, the output format of each content is stored in the user instruction acquisition means 14; for example, as shown in FIG. 4, content A is “speaker output”, content B is “text output”, and content C is “automatic”.
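The silent-part measurement described above can be illustrated with a short sketch. The patent gives no numeric values, so the level threshold (0.05), the speech cut-off ratio (0.3), the sample representation (a list of normalized amplitudes), and the function names are all assumptions made for illustration.

```python
def silence_ratio(samples, threshold):
    """Fraction of samples whose absolute level is below the threshold."""
    if not samples:
        # No data: report no silence rather than dividing by zero.
        return 0.0
    quiet = sum(1 for s in samples if abs(s) < threshold)
    return quiet / len(samples)


def classify(samples, threshold=0.05, speech_ratio=0.3):
    """Label a signal 'speech' if it has many silent parts, else 'music'.

    Both default values are assumed for illustration; the patent only
    states that a threshold level is used.
    """
    if silence_ratio(samples, threshold) >= speech_ratio:
        return "speech"
    return "music"
```

A signal labeled "speech" would have its priority lowered, and a signal labeled "music" would have its priority raised, as the paragraph above describes.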
[0022]
The operation of the multimedia terminal device 1 is described below with reference to the flowchart of FIG. 2. Assume that the video of content a, received from the network 4 by the network control means 5, is being displayed in a window of the display device 8 and its audio is being output from the speaker 11 when the tuner 3 starts receiving digital broadcast content b via the antenna 2 (STEP 1). First, the user instruction acquisition means 14 determines the output format of content a (STEP 2 and STEP 3). If the output format of content a is not “automatic” (No), the process proceeds to STEP 3; if it is “automatic” (Yes), the process proceeds to STEP 4 (STEP 2).
[0023]
In STEP 3, it is determined whether or not the output format of the content a is “speaker output”. At this time, when the output format of the content a is “speaker output” (Yes), the content a is output from the speaker 11 at STEP 9 and the content b is converted into a text file by the voice / character conversion means 10. The character string is displayed on the display device 8 based on the text file. On the contrary, when the output format of the content a is “text output” (No), the content b is output from the speaker 11 in STEP 10 and the content a is converted into characters by the voice / character converting means 10 to be a text file. The character string is displayed on the display device 8 based on the text file.
[0024]
Next, the output format of the content b is determined by the user instruction acquisition unit 14 (STEP 4 and STEP 5). At this time, when the output format of the content b is not “automatic” (No), the process proceeds to STEP5, and when the output format of the content b is “automatic” (Yes), the process proceeds to STEP6 (STEP4).
[0025]
In STEP 5, it is determined whether or not the output format of the content b is “speaker output”. At this time, when the output format of the content b is “speaker output” (Yes), the content b is output from the speaker 11 in STEP 10 and the content a is converted into a text file by the voice / character conversion means 10. The character string is displayed on the display device 8 based on the text file. On the contrary, when the output format of the content b is “text output” (No), the content a is output from the speaker 11 at STEP 9 and the content b is converted into a text file by the voice / character converting means 10. The character string is displayed on the display device 8 based on the text file.
[0026]
In STEP 6, the program genre acquisition means 12 determines the program genre of content b from the EPG information stored in advance, and determines the program genre of content a from the previously stored program information of the program guide site or from the data signal accompanying content a. The program genres of contents a and b are then checked against the stored program genre priorities, and the priorities of contents a and b are compared. If they are not equal (No), the process proceeds to STEP 7; if they are equal (Yes), the process proceeds to STEP 8.
[0027]
In STEP 7, the program genre priorities of contents a and b obtained by the program genre acquisition means 12 are compared. When the priority of content a is higher (Yes), content a is output from the speaker 11 in STEP 9, content b is converted into characters by the voice/character conversion means 10 to create a text file, and the character string is displayed on the display device 8 based on this text file. Conversely, when the priority of content b is higher (No), content b is output from the speaker 11 in STEP 10, content a is converted into characters by the voice/character conversion means 10 to create a text file, and the character string is displayed on the display device 8 based on this text file.
[0028]
In STEP 8, the priorities obtained from the audio signal properties of contents a and b, as determined by the audio property analysis means 13, are compared. When the priority of content a is higher, that is, when the proportion of silent parts in content a is smaller (Yes), content a is output from the speaker 11 in STEP 9, content b is converted into characters by the voice/character conversion means 10 to create a text file, and the character string is displayed on the display device 8 based on this text file. Conversely, when the priority of content b is higher, that is, when the proportion of silent parts in content b is smaller (No), content b is output from the speaker 11 in STEP 10, content a is converted into characters by the voice/character conversion means 10 to create a text file, and the character string is displayed on the display device 8 based on this text file.
[0029]
In the flowchart of FIG. 2, STEP 1 assumes that the tuner newly starts receiving digital broadcast content b via the antenna 2, but the operation is not limited to this case. The same operation is performed when content a is received while the video and audio of content b are being output from the display device 8 and the speaker 11. If the program genres of contents a and b cannot be acquired, the process may proceed from STEP 6 to STEP 8, and the selection may be made according to the priority based on the proportion of silent parts in each audio signal.
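The decision flow of STEP 2 through STEP 10 can be summarized in a short sketch. It is an illustrative reading of the flowchart, not the patented implementation: the function name, the string values for output formats, and the genre keys are hypothetical, and the priority table repeats the example values of FIG. 3. When the silent-part ratios are exactly equal, the sketch falls to the "No" branch of STEP 8 (content b), which is an assumption because the text does not cover that case.

```python
# Example priorities from FIG. 3; genre keys are hypothetical labels.
GENRE_PRIORITY = {"music": 3, "variety": 2, "drama": 2,
                  "sports": 1, "weather": 1, "news": 0}


def select_speaker_content(format_a, format_b, genre_a, genre_b,
                           silence_a, silence_b):
    """Return 'a' or 'b': the content whose audio goes to the speaker.

    The other content's audio is converted to text and displayed.
    Formats are 'auto', 'speaker', or 'text'; silence_a and silence_b
    are the proportions of silent parts in each audio signal.
    """
    # STEP 2 / STEP 3: an explicit user setting for content a decides.
    if format_a != "auto":
        return "a" if format_a == "speaker" else "b"
    # STEP 4 / STEP 5: otherwise an explicit setting for content b decides.
    if format_b != "auto":
        return "b" if format_b == "speaker" else "a"
    # STEP 6: compare genre priorities when both genres are known.
    pa = GENRE_PRIORITY.get(genre_a)
    pb = GENRE_PRIORITY.get(genre_b)
    if pa is not None and pb is not None and pa != pb:
        # STEP 7: the higher genre priority wins.
        return "a" if pa > pb else "b"
    # STEP 8: equal or unavailable genres; fewer silent parts wins.
    return "a" if silence_a < silence_b else "b"
```

A return value of 'a' corresponds to STEP 9 and 'b' to STEP 10.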
[0030]
Furthermore, the silent part ratio determining means and the program information determining means in the claims correspond to the sound property analyzing means and the program genre acquiring means in the present embodiment, respectively. The operation means in the claims corresponds to the input means and the user instruction acquisition means in the present embodiment.
[0031]
In the multimedia terminal device of the present embodiment, the signal level is detected using the audio signal decoded by the decoding means. Alternatively, the audio property analysis means may determine the nature of each content's audio signal by obtaining signal level information for the audio data from, for example, the scale factors in the audio data frames of the signal sent from the tuner or the network control means.
[0032]
In the multimedia terminal device of the present embodiment, the user decides the output format of each content with the input means, and the instructed output format of each content is stored in the user instruction acquisition means. Alternatively, for example, the user may use input means such as a mouse to point to the window on the display device in which a content's video is displayed and then enter its output format, thereby selecting the audio signal of one of the contents currently being processed for output from the speaker. In that case, the audio signals of the other contents not designated with the input means may be automatically converted by the voice/character conversion means, as described above, and displayed as character strings on the display device.
[0033]
In the present embodiment, the content is received from digital broadcasting, the Internet, or the like. However, the content is not limited to content received via such a network; it may be, for example, content obtained by playing back a recording medium such as a magnetic disk or an optical disk. For content obtained from such a recording medium, the program genre acquisition means can determine from the data recorded on the medium whether the content is musical or is content, such as a recorded movie, whose audio signal is speech-like with much dialogue. The audio property analysis means can likewise determine whether the audio signal is music-like or speech-like by examining its signal level.
[0034]
Furthermore, the multimedia terminal device of the present embodiment has all three of the program genre acquisition means, the audio property analysis means, and the user instruction acquisition means, but the invention is not limited to such a device. The multimedia terminal device may have only one of the program genre acquisition means, the audio property analysis means, and the user instruction acquisition means, or any two of them.
[0035]
[Effects of the Invention]
According to the multimedia terminal device of the present invention, when a plurality of contents consisting of video and audio signals are processed, one decoded audio signal is selected and output from the audio output means, so a plurality of audio output means are not required. Even without as many audio output means as there are decoded contents, the audio signals can be output effectively without being mixed. Furthermore, an audio signal not selected by the selection means can be converted into a character signal and displayed on the display means as a character string, so the user can read and understand from the character string the audio that was not output from the audio output means.
[0036]
In addition, by providing the silent part ratio determining means and the program information determining means, when content with much musical audio and content with much spoken audio are being processed, the audio signal of the content with much musical audio, which is difficult to express as a character string, is automatically output by the audio output means, while the audio signal of the content with much spoken audio is displayed on the display means as a character string. The user is thus freed from having to select the audio signal of each content and can concentrate on viewing the content.
[Brief description of the drawings]
FIG. 1 is a block diagram showing an internal structure of a multimedia terminal device according to the present invention.
FIG. 2 is a flowchart showing the operation of the multimedia terminal device of FIG. 1.
FIG. 3 shows an example of priorities according to program genres.
FIG. 4 shows an example of an output format according to a user instruction.
[Explanation of symbols]
1 Multimedia terminal device
2 Antenna
3 Tuner
4 Network
5 Network control means
6, 7 Decoding means
8 Display device
9 Audio signal selection means
10 Voice/character conversion means
11 Speaker
12 Program genre acquisition means
13 Audio property analysis means
14 User instruction acquisition means
15 Input means

Claims

1. A multimedia terminal device having a plurality of decoding means that respectively decode a plurality of types of content each consisting of an audio signal and a video signal, and display means that displays video from the video signal of each content, the device comprising:
audio output means that outputs audio from the audio signal of the content;
selection means that selects, from the audio signals of the plurality of types of content, one audio signal to be output from the audio output means; and
program information determination means that gives a priority to the audio signal of each content according to the program genre included in the program information of that content,
wherein a plurality of videos from the video signals of the plurality of types of content decoded by the plurality of decoding means are displayed simultaneously on the display means, and the selection means selects, from the audio signals of the contents respectively decoded by the plurality of decoding means, the audio signal given a high priority by the program information determination means, and the audio output means outputs it as audio.

2. The multimedia terminal device according to claim 1, wherein, when the program genre included in the program information of a content is a music program, the program information determination means gives a high priority to the audio signal of that content.