JP2003235041A

JP2003235041A - Real time picture encoding apparatus

Info

Publication number: JP2003235041A
Application number: JP2002363351A
Authority: JP
Inventors: Takao Yamaguchi (山口 孝雄); Go Kamogawa (鴨川 郷); Kazuo Nobori (登 一生)
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1996-08-07
Filing date: 2002-12-16
Publication date: 2003-08-22

Abstract

PROBLEM TO BE SOLVED: To control the coding quantity depending on the processing situation at the terminal when decoding or compositing a plurality of pictures or sounds simultaneously.

SOLUTION: The decoding apparatus of this invention comprises: a reception control unit 11 for receiving the information; a separation unit 12 for analyzing and separating the received information; a priority decision unit 14 for determining the priority of processing of the pictures separated in the separation unit 12; a picture expanding unit 18 for expanding the pictures according to the determined priority; a picture compositing unit 19 for compositing the pictures on the basis of the expanded pictures; a composition result accumulating unit 22 for accumulating the composited pictures; a reproduction time control unit 23 for controlling the time for starting reproduction; and an output unit 24 for delivering the result of composition according to the information of the reproduction time control unit 23.

COPYRIGHT: (C)2003,JPO

Description

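[Detailed Description of the Invention]

[0001]
[Technical Field of the Invention]
The present invention relates to a real-time image encoding apparatus.

[0002]
[Prior Art]
Conventionally, systems have aimed at realistic video communication by extracting, for example, a person image from an image of the scenery of the local space and displaying it superimposed on a person image sent from the remote side and a prestored image of a virtual space shown in common with the remote side, thereby giving the sense of presence that the other party is in front of oneself (Japanese Examined Patent Publication No. H4-24914; "Hypermedia System Personal Communication System" (Fukuda, K., Tahara, T., Miyoshi, T.: "Hypermedia Personal Computer Communication System: Fujitsu Habitat", FUJITSU Sci. Tech. J., 26, 3, pp. 197-206 (October 1990)); Nakamura: "Distributed Cooperative Work Support by Network-Compatible Virtual Reality", Information Processing Society of Japan, Special Interest Group on Audio-Visual and Multimedia Information Processing (1993)). In particular, the prior art includes inventions on methods for speeding up image compositing and reducing memory (for example, Japanese Examined Patent Publication No. H5-46592: image compositing apparatus; Japanese Unexamined Patent Publication No. H6-105226: image compositing apparatus).

[0003]
[Problems to Be Solved by the Invention]
However, although the prior art proposed image compositing systems that composite two-dimensional still images and three-dimensional CG data, it did not describe how to realize a system that simultaneously decodes (decompresses), composites, and displays a plurality of moving images and sounds. In particular, for a terminal that can simultaneously decode, composite, and display a plurality of videos and sounds, it did not describe a method of reproducing video and audio that does not break down when the terminal lacks capability or its processing capacity fluctuates. Nor did it describe a method of decoding, compositing, and displaying a plurality of videos according to the billing status.

[0004]
Specifically, the following were not addressed:
(1) A method of managing information on a plurality of images and sounds, information describing the relationships among those images and sounds, and information on processing results.
(2) A method of determining the priority of decoding, compositing, and displaying a plurality of images and sounds when the terminal is overloaded, and methods relating to reproduction and billing.

[0005]
Furthermore, no consideration has been given to a method of controlling the amount of code by changing the image compression method according to the state of the receiving terminal and the priority of decoding, compositing, and display at the receiving terminal, in an environment where a plurality of videos and sounds can be decoded, composited, and displayed simultaneously.

[0006]
[Means for Solving the Problems]
In view of these problems, an object of the present invention is to provide an image/audio decoding apparatus, an image/audio encoding apparatus, and an information transmission system that can control the amount of code according to the processing status of the terminal when a plurality of videos and sounds are decoded and composited simultaneously, and that can control the decoding, compositing, and display of a plurality of videos and sounds according to the billing status.

[0007]
The present invention is not limited to two-dimensional image compositing. A representation that combines two-dimensional and three-dimensional images may be used, and compositing methods that join a plurality of images side by side, as in a wide-field (panoramic) image, may also be included.

[0008]
The communication forms targeted by the present invention are not limited to wired bidirectional CATV and B-ISDN. For example, video and audio may be transmitted from the center-side terminal to the home-side terminal by radio waves (for example, the VHF or UHF band) or satellite broadcasting, while information is sent from the home-side terminal to the center-side terminal over an analog telephone line or N-ISDN (video, audio, and data need not be multiplexed). Wireless communication forms such as IrDA, PHS (Personal Handy-phone), and wireless LAN may also be used.

[0009]
The target terminal may be a portable terminal such as a personal digital assistant, or a desktop terminal such as a set-top box or a personal computer.

[0010]
Specifically, the invention according to claim 1 is a real-time image encoding apparatus comprising: one or more image input means for inputting images; image input management means for managing the control state of the image input means; other-terminal control request management means for managing the reception status of a receiving terminal; encoding process determination means for determining an image encoding method according to at least the managed reception status of the receiving terminal or the control state of the image input means; image encoding means for encoding the input image according to the determination made by the encoding process determination means; and output means for outputting the encoded image.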
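[0011]
The present application contains the technical disclosures of the following items 1 to 19.

[0012]
1. An image decoding/encoding apparatus comprising: an image encoding apparatus having image encoding means for encoding image information and transmission management means for transmitting or recording the various encoded information; and an image decoding apparatus having reception management means for receiving the various encoded information, image decoding means for decoding the received information, image compositing means for compositing one or more of the decoded images, and output means for outputting the composited image.

[0013]
2. An audio decoding/encoding apparatus comprising: an audio encoding apparatus having audio encoding means for encoding audio information and transmission management means for transmitting or recording the various encoded information; and an audio decoding apparatus having reception management means for receiving the various encoded information, audio decoding means for decoding the received information, audio compositing means for compositing one or more of the decoded sounds, and output means for outputting the composited sound.

[0014]
3. An image/audio encoding/decoding apparatus comprising the image decoding/encoding apparatus of item 1 and the audio decoding/encoding apparatus of item 2, wherein the image encoding apparatus and/or the audio encoding apparatus has priority adding means that determines, by a predetermined criterion, the priority with which encoded information is processed under overload and associates the encoded information with the determined priority, and the image decoding apparatus and/or the audio decoding apparatus has priority determining means that determines the processing method according to the overload priority of the various received information.

[0015]
4. The image/audio encoding/decoding apparatus of item 3, wherein the priority adding means and the priority determining means determine how to add a priority that governs the order and the presence or absence of decoding, compositing, and display of the encoded images and sounds, or determine the priority to be processed, based on at least one of: the image encoding scheme, the image size, the contrast, the image compositing ratio, the quantization step, the frame number, the number of frames, the distinction between inter-frame and intra-frame coding, the display position, the display time, and the distinction between voiced and silent intervals.

[0016]
5. The image/audio encoding/decoding apparatus of item 3, wherein the priority adding means and the priority determining means determine the priority to be added to the encoded information, or the priority to be processed at decoding, based on the time taken for decoding during image encoding or the time taken for encoding.

[0017]
6. The image/audio encoding/decoding apparatus of item 3, wherein the priority adding means and the priority determining means define an execution rate that specifies how many times image decoding, compositing, and display are executed, and based on that execution rate determine the priority to be added to the encoded information or the priority to be processed at decoding.

[0018]
7. The image/audio encoding/decoding apparatus of item 4, wherein the overload processing priority is set high for at least an intra-coded frame, the first frame, the last frame, or a scene-change frame.

[0019]
8. The image/audio encoding/decoding apparatus of item 4, wherein inter-coded images are assigned the same priority.

[0020]
9. The image/audio encoding/decoding apparatus of item 4, wherein intra-coded images are assigned a plurality of priority levels.

[0021]
10. The image encoding/decoding apparatus of item 1, wherein the image decoding means decodes images in predetermined units smaller than one frame.

[0022]
11. An image/audio encoding/decoding apparatus comprising the image decoding/encoding apparatus of item 1 and the audio decoding/encoding apparatus of item 2, which determines the order, the presence or absence, and the reproduction method of the images and sounds to be decoded, composited, and displayed based on at least one of: billing information, information indicating the service content, a password, a user code, a country code, information indicating the compositing and display order, information indicating the decoding order, a user instruction, the processing capability of the terminal, and the reproduction time.

[0023]
12. An image/audio encoding/decoding apparatus comprising the image decoding/encoding apparatus of item 1 and the audio decoding/encoding apparatus of item 2, wherein the reception management means handles the information that describes the relationships among image information and among audio information independently, as information separate from the image information and audio information.

[0024]
13. The image/audio encoding/decoding apparatus of item 12, wherein the description method used to describe the relationships among the image information and among the audio information is identified by an identifier.

[0025]
14. An image/audio encoding/decoding apparatus comprising the image decoding/encoding apparatus of item 1 and the audio decoding/encoding apparatus of item 2, wherein the image compositing means or the audio compositing means holds, manages, and uses a decoding result until an instruction to discard it arrives from the transmitting side.

[0026]
15. An image/audio encoding/decoding apparatus comprising the image decoding/encoding apparatus of item 1 and the audio decoding/encoding apparatus of item 2, which, when compositing images and sounds based on the information describing the relationships among image information and among audio information, informs the user that some images or sounds cannot be composited because the required decoded images or sounds are not available.

[0027]
16. A real-time image encoding apparatus comprising: one or more image input means for inputting images; image input management means for managing the control state of the image input means; other-terminal control request management means for managing the reception status of a receiving terminal; encoding process determination means for determining an image encoding method according to at least the managed reception status of the receiving terminal or the control state of the image input means; image encoding means for encoding the input image according to the determination made by the encoding process determination means; and output means for outputting the encoded image.

[0028]
17. The real-time image encoding apparatus of item 16, wherein the encoding process determination means determines, according to the control state of the image input means, at least one of: the encoding priority, the overload processing priority information, the encoding scheme, the quantization step value, the number of frames, the size of the image to be encoded, and whether to encode.

[0029]
18. An information transmission system in which at least one of the image decoding/encoding apparatus of item 1 and the audio decoding/encoding apparatus of item 2 serves as a receiving terminal, at least one of the image decoding/encoding apparatus of item 1, the audio decoding/encoding apparatus of item 2, and the real-time image encoding apparatus of item 16 serves as a transmitting terminal, and the terminals are connected by a communication channel, wherein at least one of the load of the receiving terminal, information on the priority of the encoded information to be processed as determined by the priority determining means of the receiving terminal, and the frame-skip status at the receiving terminal is sent to the transmitting terminal, whereby the transmitting terminal determines any of: whether to encode images or audio, the encoding priority, the encoding scheme, the image size to be encoded, the quantization step value, the number of frames, and the processing priority for when the receiving terminal is overloaded.

[0030]
19. An information transmission system in which the image decoding/encoding apparatus of item 1 and the audio decoding/encoding apparatus of item 2 serve as a receiving terminal, the image decoding/encoding apparatus of item 1, the audio decoding/encoding apparatus of item 2, and the real-time image encoding apparatus of item 16 serve as a transmitting terminal, and the terminals are connected by a communication channel, wherein images are transmitted by a method that performs retransmission and audio is transmitted by a method that does not, and at least one of the image retransmission count, the error rate of the received information, and the discard rate is sent to the transmitting terminal, whereby the encoding process determination means determines at least one of: the encoding scheme, the quantization step value, the number of frames, the size of the image to be encoded, whether to encode, and the processing priority for when the receiving terminal is overloaded.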
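[0031]
[Embodiments of the Invention]
The present invention is described below with reference to the drawings that show its embodiments. The term "image" as used in the present invention covers both still images and moving images. The target images may be a mixture of two-dimensional images such as computer graphics (CG) and three-dimensional image data such as that built from a wireframe model. In this case, the relationship between the images corresponds to the wireframe model. Script languages for describing it include JAVA (registered trademark) and VRML.

[0032]
Figs. 1 and 2 are schematic block diagrams of an image decoding/encoding apparatus according to one embodiment of the present invention. Fig. 1 shows the configuration without an audio reproduction function, and Fig. 2 shows the configuration with image and audio reproduction functions. An audio-only configuration can of course be built in the same way.

[0033]
The apparatus of Fig. 1 or Fig. 2 consists of an encoding apparatus and a decoding apparatus. In Fig. 1, the encoding apparatus consists of a priority adding unit 101 that determines, by a predetermined criterion, the priority with which an encoded image is processed under overload and associates the encoded image with that priority; an image encoding unit 102 that encodes images; a transmission management unit 103 that transmits or records the encoded information with the priority added; and a reception management unit 104 that receives encoded information. In Fig. 2, the encoding apparatus additionally has an audio encoding unit 105 that encodes audio.

[0034]
In the decoding apparatus, the reception management unit 11, which receives information, and the transmission management unit 13, which transmits information, are means for conveying information such as a coaxial cable, CATV, LAN, or modem. Terminal connection forms include bidirectional exchange of video between terminals, as in videophones and videoconferencing systems, and broadcast (one-way) video delivery, as in satellite broadcasting, CATV, and the Internet. The present invention takes these connection forms into account.

[0035]
The separation unit 12 analyzes and separates the encoded (compressed) received information (in a compression apparatus, the reverse operation makes it a multiplexing unit). Examples are MPEG1 and MPEG2; for H.320 terminals (the standard for videophone and conferencing equipment over N-ISDN), H.221 is the standard that multiplexes and demultiplexes video, audio, and data, and for H.324 terminals (the standard for videophone and conferencing equipment over analog telephone lines), H.223 is. The present invention may be realized in a configuration that conforms to a standard or in one that does not. Video and audio may also be transmitted independently in separate streams, as is done in H.323 and on the Internet.

[0036]
The priority determination unit 14 takes the information obtained from the separation unit 12 (for example, video, audio, and management information), determines by the methods below the priority of decoding (hereinafter "decompression") for when the terminal is overloaded, and has the images and audio decompressed accordingly. (The method of determining the processing priority may be agreed in advance in the receiving terminal, or the transmitting terminal (the encoding apparatus) may add priority information determined by the methods below to the recording medium or the transmission packets as part of the transmission or recording format. The priority may be expressed non-numerically, as "high", "medium", and "low", or numerically, as 1, 2, and 3.)

[0037]
By having the transmitting and receiving sides exchange data using an identifier that handles data in units of streams, each consisting of a plurality of image or audio frames, the receiving side can manage its buffers and the transmitting side can schedule its transmissions. That is, the transmitting side can, as needed, announce the identifier of a stream it is about to send in order to check whether the receiving side can accept it, the receiving terminal can be told the identifiers of streams that are not needed, and the receiving side can request the streams it needs.

[0038]
The image encoding apparatus and the audio encoding apparatus are provided with priority adding means that determines the overload processing priority of the encoded information by the criteria described above and associates the encoded information with the determined priority. The priority determining means, which determines the processing method according to the overload priority of the various received information, decides which priority levels of image frames and audio are to be processed, and decoding and compositing are then performed. For image frames, intra-coded frames (I frames) must be inserted periodically so that frames can be skipped.

[0039]
The priority may be added per frame of video or audio (priorities are compared between frames) or per stream consisting of a plurality of frames (priorities are compared between streams).

[0040]
Methods that focus on image features include those based on: the image compression format (for example, between H.263 and run-length coding, run-length is given priority); the image size (for example, between CIF and QCIF, QCIF is given priority); the contrast (for example, brighter contrast is given priority); the image compositing ratio (for example, a higher compositing ratio is given priority); the quantization step (for example, a smaller quantization step is given priority); the distinction between inter-frame and intra-frame coding (for example, intra-frame coding is given priority); the display position (for example, an image displayed at the center is given priority; for a three-dimensional image, the priority is set low when the image is placed at the back and high when it is displayed in front); the frame number (the first and last frames are given high priority, the priority of scene-change frames is raised, and so on); the number of frames (for example, an image with few frames to reproduce is given high priority; in H.263 the frame number corresponds to the temporal reference (TR), so the decision can be based on changes in the TR value); voiced and silent intervals; the display time (PTS); and the decoding time (DTS).

[0041]
In addition, inter-coded P frames and B frames are assigned the same priority. Assigning a plurality of priority levels to intra-coded images makes it possible to control how often frames are skipped.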
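The frame-type rules of paragraphs [0040] and [0041] can be illustrated with a minimal sketch. The names `Frame`, `overload_priority`, and `frames_to_process`, the three numeric levels, and the convention that a smaller number means higher priority are all assumptions made for illustration; the source only states that P and B frames share one priority and that I frames may take several levels.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    number: int
    kind: str                  # "I", "P" or "B"
    scene_change: bool = False

def overload_priority(frame, last_number):
    """Return an overload priority; smaller means more important (assumed convention)."""
    if frame.kind != "I":
        return 3               # every P and B frame shares one priority
    if frame.number in (0, last_number) or frame.scene_change:
        return 1               # first, last and scene-change I frames
    return 2                   # remaining I frames

def frames_to_process(frames, max_priority):
    """Keep the frames whose priority is within the level allowed under the current load."""
    last = frames[-1].number
    return [f.number for f in frames if overload_priority(f, last) <= max_priority]
```

Lowering `max_priority` as the load rises skips the inter-coded frames first and then the less important I frames, which is one way to read the statement that multiple I-frame levels control the skip frequency.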
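[0042]
An example that focuses on the difference between media is to decompress audio in preference to images. This allows audio to be reproduced without interruption.

[0043]
Furthermore, the information (images, audio) to be decompressed may be decided from reproduction permission information managed at the receiving terminal, or selected from reproduction permission information sent by the transmitting side as control information. Concretely, the reproduction permission information includes: billing information (for example, decompression, compositing, and display are not performed unless payment has been made; the billing information may be managed at the receiving terminal or at the transmitting side); information indicating the service content (for example, for an adult-oriented broadcast, decompression, compositing, and display are not performed unless the terminal has permitted reproduction; the permission may be managed at the receiving terminal or at the transmitting terminal); a password (for example, a particular program is not decompressed, composited, or displayed unless a password is entered; the password may be managed at the receiving terminal or at the transmitting terminal); a user code (for example, decompression, compositing, and display are not performed for a user who has not been authorized; the user code may be managed at the receiving terminal or at the transmitting terminal); and a country code (for example, the images and sounds to be decompressed, composited, and displayed, and the reproduction method, are changed by country; the country code may be managed at the transmitting side or at the receiving side; scrambling can be realized by changing the reproduction method according to the country code).

[0044]
Reproduction methods that restrict the reproduction of images and audio based on billing information, service content information, a password, or a user code include deliberately shifting positions or pixels when compositing and displaying images, enlarging or reducing images, changing the image sampling (for example, applying a low-pass filter), inverting pixels, changing the contrast, changing the color palette, and skipping frames. These image reproduction restrictions (on decompression, compositing, and display) may be applied frame by frame. Alternatively, the restrictions may be applied per GOB (Group Of Blocks), a unit defined in H.263, one of the image compression schemes, that is smaller than one frame and can be processed independently. This allows more flexible control than the conventional technique of disturbing the whole screen. In other words, processing per GOB makes it possible to scramble only part of the screen, so that interactive software, such as software that uses image compositing, can be evaluated.

[0045]
Similarly, sound reproduction methods include changing the volume, changing the direction of the sound, changing the frequency of the sound, changing the sampling of the sound, and inserting different images or sounds (each of these can be applied either in advance at the transmitting side or at the receiving side).

[0046]
A reproduction method for images and audio together is to take the images and sound out of synchronization. The priority and the presence or absence of the images and audio to be decompressed may also be determined from: information indicating the compositing and display order (the display order is decided in advance at the receiving terminal, for example by giving priority to CIF or still images; alternatively, the transmitting side adds the display order to the transmitted information as priority information); information indicating the decompression order (the decompression order is decided in advance at the receiving terminal, for example by giving priority to QCIF or intra-coded image data, or by decompressing conversation before background music; likewise, the transmitting side may add the order to the transmitted information); a user instruction (for example, the user selects the images and audio to be decompressed, composited, and displayed, or they are decided from information selected on request); the processing capability of the terminal (for example, the CPU time occupied by processing now or over a past period is measured, and decompression, compositing, and display of images and audio that are likely to take long are suppressed; as a way of estimating the processing time, the time taken for local decoding during compression, or the time taken for compression, is managed in association with the compressed image information, which allows the presence or absence and the priority of decompression, compositing, and display to be decided); the reproduction time (for example, decompression, compositing, and display of image and audio information whose reproduction time has passed are abandoned); and the decoding time.

[0047]
In addition, to prevent only particular images or sounds from being decompressed and displayed preferentially, the order and the presence or absence of the images to be decompressed, composited, and displayed can be decided from information on an execution rate for decompression, compositing, and display. For example, the receiving terminal can be set so that one decompression in ten is of a CIF-size image, or the transmitting side can specify the execution rate for decompression, compositing, and display of images and audio and transmit the image and audio information accordingly. Concretely, the execution rate can be defined by the I-frame (intra-coded frame) insertion interval. This ensures that it is never only particular image or audio objects that are decompressed, composited, and displayed.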
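The execution rate of paragraph [0047] (for example, one decompression in ten goes to the CIF image) can be sketched as a small credit-based scheduler. The function name `build_schedule` and the credit scheme are assumptions chosen for illustration, not the method of the source, and the sketch assumes the rates sum to the number of slots.

```python
def build_schedule(rates, slots):
    """rates maps an object id to its number of executions per `slots` turns.

    Each turn every object earns credit equal to its rate; the object with
    the most credit is served and pays `slots` credit back.
    """
    credit = {obj: 0 for obj in rates}
    schedule = []
    for _ in range(slots):
        for obj, rate in rates.items():
            credit[obj] += rate
        chosen = max(credit, key=credit.get)
        credit[chosen] -= slots
        schedule.append(chosen)
    return schedule
```

With `build_schedule({"qcif": 9, "cif": 1}, 10)` the CIF object is served once in ten turns, so it is never starved even though the QCIF object has the higher rate.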
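[0048]
The priority information that controls decompression, compositing, and display may be added and controlled not only by the transmitting apparatus but also by a relay apparatus. In addition, by sending the priority information determined by the priority determination unit 14 of the decoding apparatus in the receiving terminal to the transmitting party through the transmission management unit 13, images and audio can be transmitted in accordance with the decisions of the priority determination unit 14 (sending the transmitting side the IDs of image objects that are unlikely to be selected avoids wasted transmission). The information indicating the processing priority for when the receiving terminal is overloaded may be agreed in the receiving terminal, may be carried as part of the transmission format, may be recorded on a medium such as a CD-ROM or hard disk in a format that extends the MPEG2 transport stream, or may use a transmission or recording format that disregards standardization. Each medium (video, audio, and the information describing the relationship between video and audio) may also be transmitted or recorded as a separate stream without multiplexing.

[0049]
The image decompression unit 18, which serves as the image decoding means, decompresses images (in the encoding apparatus, the corresponding unit is the encoding means). Image formats handled by the image decompression unit 18 include MPEG1, MPEG2, H.261, and H.263. Images may be decompressed frame by frame or per GOB as defined in H.263. When processing frame by frame with inter-frame coding, the decompression state of the previous frame must be stored in the image decompression unit 18. When images are decompressed per GOB, the order in which they are decompressed no longer matters. Consequently, with per-GOB decompression the receiving apparatus does not need a plurality of image decompression units 18; a single image decompression unit 18 can decompress a plurality of videos. On the other hand, the decompression results must be stored.

[0050]
The audio decompression unit 20 of Fig. 2, which serves as the audio decoding means, decompresses audio. Audio formats handled by the audio decompression unit 20 include G.721 and G.723. The processing may be done in software on a DSP or general-purpose CPU, or by dedicated hardware.

[0051]
In a software implementation, each image or audio decompression task is managed as one process or thread. When several images or sounds must be decompressed at the same time, they are processed by time sharing among as many processes or threads as can be handled.

[0052]
The image decompression management unit 15 manages the state of image decompression, and the audio decompression management unit 16 manages the state of audio decompression. When these management units are implemented in software, for example, they hand the compressed information obtained from the separation unit 12 to the image decompression unit 18 and the audio decompression unit 20 in a fixed order (for example, the audio decompression unit 20 runs first, then the image decompression unit 18) and monitor the state of decompression. When all decompression is complete, they hand the decompressed information to the image compositing unit 19 or the audio compositing unit 21. In software, shared memory and semaphores are used to limit the information handed over and to detect that decompression has finished (details are given later).

[0053]
The time information management unit 17 manages time-related information. For example, when the system is implemented on a personal computer, the time information can be obtained from the timer of the personal computer.

[0054]
The image compositing unit 19 composites images from the decompressed image data. When several images are composited, the compositing uses the compositing ratio (α value) of each image. For example, when two images are composited and the compositing ratio of the foreground image is α, the RGB values of the background image are mixed at a ratio of 1 - α and those of the foreground image at a ratio of α. Managing the images to be decompressed frame by frame simplifies the system configuration and implementation when a plurality of images are composited according to their display times. In addition, if the image compositing unit 19 or the audio compositing unit 21 holds, manages, and uses a decompression result until the transmitting side instructs it to discard that result, the transmitting side does not need to send the same pattern of information repeatedly.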
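The mixing rule of paragraph [0054] amounts to out = α × foreground + (1 - α) × background for each color channel. A minimal per-pixel sketch follows; the function name `blend_pixel` and the 8-bit integer RGB tuples are assumptions made for illustration.

```python
def blend_pixel(foreground, background, alpha):
    """Mix two RGB pixels: alpha weights the foreground, 1 - alpha the background."""
    return tuple(
        round(alpha * f + (1 - alpha) * b)
        for f, b in zip(foreground, background)
    )
```

With `alpha` set to 1 the foreground fully covers the background, and with 0.5 the two images contribute equally.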
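[0055]
When images and sounds are composited based on the information describing the relationships among images and among sounds, reporting that some images or sounds cannot be composited because the required decoded images or sounds are not available lets the user know the state of the compositing. The user can then select the required image quality, or choose in advance the images to be composited, so that compositing proceeds without losing needed information. Decoded image and audio data held in a buffer can be managed either by erasing it oldest first in order of arrival, or by consulting the script that describes the relationships among images and among sounds, checking how the decoded data is used overall, and erasing accordingly.

[0056]
The audio decompression management unit 16 manages the decompression state of the audio decompression unit 20, which decompresses at least one audio stream.

[0057]
The audio compositing unit 21 composites audio from the decompressed information, and the compositing result accumulation unit 22 accumulates the images composited by the image compositing unit 19 and the audio composited by the audio compositing unit 21.

[0058]
The reproduction time management unit 23 reproduces the composited images and audio at the time reproduction is due to start.

[0059]
The output unit 24 outputs the compositing result (for example, a display or printer), and the input unit 25 inputs information (for example, a keyboard, mouse, camera, or video device). The terminal control unit 26 manages these units.

[0060]
Fig. 3 illustrates an example of adding priority information in a communication or recording format.

[0061]
Fig. 3(a) shows an example in which all media (video, audio, and control information) are fully multiplexed. The control information carries the priority that decides processing under overload (the priority referred to in the present invention) and the priority that indicates the display order. The control information may also describe the relationships (temporal and positional) among images, among sounds, and between images and sounds. The arrangement of Fig. 3(a) suits, for example, MPEG1/2 multiplexing and packet multiplexing that mixes control information and data (video, audio), such as H.223. The overload processing priority is added per frame or per stream.

[0062]
Fig. 3(b) shows an example in which information is multiplexed per medium. In this example the control information, image information, and audio information are sent from separate communication ports. The information on the relationships among images, among sounds, and between images and sounds can be sent as control information from a port separate from those for images and audio. This arrangement suits cases where several communication ports can be open at once, as with H.323 and the Internet. Because the multiplexing is simpler than in Fig. 3(a), the load on the terminal is reduced.

[0063]
Description languages such as JAVA and VRML can probably be used to describe the relationships among images and among sounds, but the specification of the script language may not be uniquely fixed. Providing an identifier that identifies the description method used for the information describing those relationships (for example, positional information and temporal information such as the display period) therefore makes it possible to support several description methods. In MPEG2, for example, this identifier can be placed in the program map table that manages the streams of an MPEG2-TS, or in the stream that carries the script. The overload processing priority is added together with the information describing the correspondence between images and audio (the control information). In MPEG2, if a structure information stream that establishes the correspondence between images and audio is defined so that it can be managed by the program map table that associates the video and audio streams of the MPEG2-TS (transport stream), this information can be transmitted independently of the data in MPEG2 as well.

[0064]
Fig. 4 illustrates an example in which the present invention is implemented in software. When the present invention is realized on a multitasking operating system, the processing described with Figs. 1 and 2 is divided into software execution modules such as processes and threads. The processes and threads exchange information through shared memory, and semaphores (the parts drawn with solid lines in Fig. 4) provide exclusive control of the shared information. The function of each process and thread is described below.

[0065]
The DEMUX thread 31 reads multiplexed information (video, audio, and control information) from the network or a disk and separates it into audio, video, and a monitoring table (described in detail later) that describes the correspondence between audio and video and information on the reproduction time. The DEMUX thread 31 corresponds to the separation unit 12 described above. The information separated by the DEMUX thread 31 is sent to the audio ring buffer 32, the video ring buffer 33, and the monitoring ring buffer 34 respectively. Audio information sent to the ring buffer 32 is decompressed by the audio decode thread 35 (which corresponds to the audio decompression unit 20 described above). Video information sent to the ring buffer 33 is decompressed by the decode process 36.

[0066]
The monitoring table is sent to the ring buffer 34 and used by the monitoring thread 37 (which corresponds to the terminal control unit 26, the image decompression management unit 15, and the audio decompression management unit 16 described above) to decide the order in which video is decompressed. The same monitoring table is used by the image compositing thread 39 for image compositing. When all the audio and images for the monitoring table in use by the monitoring thread 37 have been decompressed, the next table is read from the ring buffer 34. The image information decompressed by the decode process 36 (which corresponds to the image decompression unit 18 described above) is sent to the video single buffer 38. Once all the image information has arrived, the image compositing thread 39 (which corresponds to the image compositing unit 19 described above) composites the images using the compositing ratios held in the monitoring table. The result is accumulated in the compositing buffer 41 (which corresponds to the compositing result accumulation unit 22 described above) and waits under the display monitoring thread 42 until its display time (this corresponds to the reproduction time management unit 23 described above).

[0067]
Fig. 5 illustrates the structure of the information used in the configuration of Fig. 4. In the example of Fig. 5, the information received from the disk or network has a fixed length of 188 bytes (B). The audio information separated by the DEMUX thread 31 consists of a packet synchronization code, the reproduction time, a frame length indicating the length of the audio to be reproduced, and the audio data (C). The video information consists of a packet synchronization code, a frame number that identifies the image, a frame length indicating the size of the image information, and the image data (D). The present invention need not process one frame at a time; it may process smaller block units such as macroblocks.

[0068]
The monitoring table consists of the image display time, the number of images to be displayed (composited) in one frame, and, for each image, its ID, frame number, decompression and display priority, frame-type identifier (I picture, P picture, or B picture), horizontal display position, vertical display position, and a layer entry indicating the compositing ratio (E). The image compositing ratio and the audio compositing ratio may be changed in step with each other. For example, when two images correspond to two sounds and the image compositing ratio is α : 1 - α, the compositing ratio of the corresponding sounds may also be set to α : 1 - α. Relationships among sounds (for example, direction and type (background music, conversation)) may be described as well as relationships among image information.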
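The record layouts (C), (D), and (E) of paragraphs [0067] and [0068] can be written down as plain data classes. This is only a sketch: the class and field names, the field types, and the convention that a smaller priority number is more important are assumptions, since the source lists the fields but not their sizes or encodings.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class AudioPacket:              # structure (C)
    sync_code: int
    play_time: int
    frame_length: int           # length of the audio to reproduce
    data: bytes

@dataclass
class VideoPacket:              # structure (D)
    sync_code: int
    frame_number: int
    frame_length: int           # size of the image information
    data: bytes

@dataclass
class ImageEntry:               # one image listed in the monitoring table
    image_id: int
    frame_number: int
    priority: int               # smaller is more important (assumed)
    frame_type: str             # "I", "P" or "B"
    x: int
    y: int
    alpha: float                # compositing ratio

@dataclass
class MonitorTable:             # structure (E)
    display_time: int
    images: List[ImageEntry] = field(default_factory=list)

    @property
    def image_count(self) -> int:
        return len(self.images)

    def by_priority(self):
        """Entries ordered from most to least important."""
        return sorted(self.images, key=lambda e: e.priority)
```

Keeping the per-image entries in one table lets both the monitoring thread (for decode order) and the compositing thread (for position and ratio) read the same record.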
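[0069]
Fig. 6 illustrates the operation of the DEMUX thread 31. Fixed-length data of 188 bytes is read from a file or the network (5-1). The data read is analyzed and set into the audio, video, and monitoring table structures described above (5-2). If the ring buffers can be written to, the audio, video, and monitoring table are written to their respective ring buffers. A correspondence is established between image object IDs and the several image decompression means; in the example, objects are written in ascending order of object ID to the shared memory of the ring buffers in ascending order of ring buffer number (5-3). The write pointer of each buffer written to is updated (5-4). After the video and audio information for one monitoring table has been written, the counter of the semaphore that controls the monitoring thread is advanced (5-5). In this way the DEMUX thread controls the monitoring thread.

[0070]
Fig. 7 illustrates the operation of the monitoring thread 37. The monitoring table is read and the read pointer is advanced (6-1). The overload priorities of the objects are checked to find the high-priority image frames (6-2). The contents of the monitoring table are passed to the compositing thread (6-3). The thread waits for the DEMUX thread to produce the data for one monitoring table (6-4). In descending order of processing priority, the frame numbers of the images to be displayed are written to the decode processes (6-5); the current time is compared with the time at which display is due, and if processing is running late, only P and B frames are skipped while I frames are not (6-6). The corresponding decode process is permitted to run (6-7), and the thread waits until processing is complete (6-8).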
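Steps (6-5) and (6-6) of the monitoring thread can be sketched as a single planning function: order the frames by priority and, when the display deadline has already passed, drop the P and B frames but keep the I frames. The name `plan_decodes`, the tuple layout, and the convention that a smaller priority number comes first are assumptions for illustration.

```python
def plan_decodes(entries, now, display_time):
    """entries: list of (priority, frame_number, frame_type) tuples.

    Returns the frame numbers to hand to the decode processes, most
    important first. When running late, P and B frames are skipped.
    """
    late = now > display_time
    plan = []
    for priority, frame_number, frame_type in sorted(entries):
        if late and frame_type in ("P", "B"):
            continue            # skip inter-coded frames only
        plan.append(frame_number)
    return plan
```

Keeping the I frames even when late preserves the reference pictures that later inter-coded frames depend on.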
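[0071]
Fig. 8 illustrates the operation of the decode process 36. The process waits until the monitoring thread 37 permits it to run (7-1). It checks the state of the input image: the serial number of the image and whether the incoming frame is one to be skipped (7-2). It waits until the image data to be decoded has accumulated in the ring buffer (7-3). If there is no image data corresponding to the image serial number specified by the monitoring thread, it skips decoding and advances the read pointer (7-4). If the input image is not to be skipped, it decodes the image and advances the read pointer (7-5). It outputs the decoding result (7-6) and notifies the monitoring thread 37 that processing has finished (7-7).

[0072]
When the same process (which may be a thread, or a processor in a hardware implementation) is used to decompress different kinds of image objects, managing within the decode process 36 the frame numbers of previously decompressed images in association with the images before decompression removes the need to create and use many processes at once (at a minimum, information on the immediately preceding frame is sufficient; also, when different frame types such as I, P, and B exist, the order in which frames are managed differs from the order in which they must be output, so this kind of management in the decode process 36 is necessary).

[0073]
Fig. 9 illustrates the operation of the image compositing thread 39. The thread waits for a monitoring table from the monitoring thread 37 (8-1). It checks the priorities of the images to be processed (8-2). It waits for the decoded images in descending order of priority (8-3). It composites the images according to their display positions (8-4). It writes the compositing result to the compositing buffer 41 (8-5). The image information to be displayed can be selected by either the image decompression means or the image compositing means. When an image object ID that should not be displayed is skipped, the image compositing means must be notified that no decompression result will be output. For audio as well, the audio information to be reproduced can be selected by either the audio decompression means or the audio compositing means.

[0074]
Fig. 10 illustrates the operation of the display monitoring thread 42. The thread waits for a composited image to be written (9-1). If this is the first display, it obtains the time at which display started (9-2) and manages the correspondence with the times at which display is due. If the display time has not yet been reached, it waits for the remaining time, delaying the display of the composited image (9-3).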
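Steps (9-2) and (9-3) of the display monitoring thread reduce to computing how long to wait before showing a composited frame. In this sketch the function name `wait_before_display`, the use of seconds, and the mapping of stream time onto the local clock through the first displayed frame are assumptions made for illustration.

```python
def wait_before_display(display_time, first_display_time, started_at, now):
    """Return the delay in seconds before a composited frame should be shown.

    display_time and first_display_time are stream timestamps; started_at is
    the local clock value recorded when the first frame was displayed (9-2).
    """
    due = started_at + (display_time - first_display_time)
    return max(0.0, due - now)      # never wait a negative time (9-3)
```

A thread would typically pass the result to `time.sleep` before handing the frame to the output unit; a frame that is already late is shown immediately.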
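[0075]
The user interface of the image compositing apparatus of the present invention is described with reference to Fig. 11.

[0076]
In the example of Fig. 11, a foreground image is composited onto a background image, and a building located in the distance is composited semi-transparently with a compositing ratio of 0.5. As Fig. 11 shows, the images used need not be two-dimensional. A helicopter and a balloon, as three-dimensional foreground images, are composited with a two-dimensional background image. The foreground helicopter and balloon need not always be three-dimensional images. When they are far away they may be represented in two dimensions, and when they are near, in three dimensions ("far" can be defined by the size displayed in two dimensions on the screen; for example, an object can be defined as far away if it is smaller than 20 dots by 20 dots). The image mapped onto the wireframe model of a three-dimensional image may be a moving image as well as a still image. As for image quality, making the central part high quality and the image coarser toward the periphery allows the information the user needs to be selected and transmitted preferentially (changing the image quality according to the position at which an image is composited can be expected to improve responsiveness). For three-dimensional images, images displayed far away can be given low priority and images displayed nearby high priority. Image quality can be controlled by changing the quantization step.

[0077]
Fig. 12 illustrates a method of transmitting images in accordance with fluctuations in the capability of the receiving terminal. The following describes a method of management and control, including the compression apparatus, that prevents the receiving terminal from becoming overloaded as the number of transmitted images grows. For example, in an MPEG2-based video-on-demand system implemented in hardware, the transmitting and receiving terminals confirm each other's capabilities (for example, the supported image compression schemes and sizes, and the communication protocol) before sending and receiving video information. Because the processing capability of the receiving terminal is therefore essentially fixed, the transmitting terminal does not need to monitor the reception and reproduction status of the receiving terminal continuously.

[0078]
When image compression and decompression are implemented in hardware, the number of images a terminal can compress and decompress is fixed. When they are implemented in software, however, that number can be varied dynamically. Moreover, when compression and decompression run in software in a multitasking environment, they are strongly affected by the image size, the quantization parameter used for compression, the target image (intra-coded or inter-coded, and the content of the captured image), and so on, so the image size that the terminal can process (compress or decompress) and the number of images it can process at once change over time. Consequently, unless the transmitting terminal continually reviews the image compression method (the compression scheme, whether to compress, the quantization step, the compression priority, the image size to be compressed, and so on) in accordance with the reception status of the receiving terminal (for example, the receive buffer occupancy, the video reproduction priority, and the acknowledgment response time), and decides the priority for when the receiving terminal is overloaded, the capability of the receiving side will be exceeded and the system will break down.

[0079]
For example, as shown in Fig. 12(b), when the occupancy of the receive buffer of the receiving terminal exceeds 80%, the transmitting side is notified that the receive buffer is about to overflow, and the following are selected and combined as appropriate: limiting the output by changing the compression scheme (for example, switching from MPEG1 to run-length coding to reduce the amount of compressed image data sent), by suspending compression (temporarily stopping image compression and transmission), by changing the compression priority (when there are several processes to compress, lowering the compression priority to reduce the amount of compressed image data sent), by changing the image size (reducing the size to be compressed from CIF to QCIF to reduce the amount of compressed image data sent), or by changing the quantization step (changing the image quality to reduce the amount of compressed image data sent); adjusting the number of frames (reducing the number of frames processed); and deciding the priority for when the receiving terminal is overloaded. This avoids an overflow of the receive buffer of the receiving terminal.

[0080]
Similarly, when the occupancy of the receive buffer on the receiving side falls below 20%, the transmitting terminal is notified that the receive buffer of the receiving terminal is approaching underflow, and the transmitting terminal applies the reverse of the measures above, selecting and combining as appropriate the compression scheme, whether to compress, the compression priority, the image size, the quantization step, and the number of frames. Increasing the output in this way avoids an underflow of the receive buffer of the receiving terminal.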
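The 80% and 20% thresholds of paragraphs [0079] and [0080] can be sketched as a feedback signal from the receiver plus one possible sender reaction, here a change of the quantization step. The function names, the feedback strings, the step size of 2, and the quantizer range of 1 to 31 are assumptions for illustration; the source lists the quantization step as only one of several levers.

```python
def buffer_feedback(used, capacity, high=0.8, low=0.2):
    """Classify receive-buffer occupancy into a request for the sender."""
    fill = used / capacity
    if fill > high:
        return "reduce"         # buffer close to overflow
    if fill < low:
        return "increase"       # buffer close to underflow
    return "hold"

def adjust_qstep(qstep, feedback, step=2, q_min=1, q_max=31):
    """Coarser quantization sends less data; finer quantization sends more."""
    if feedback == "reduce":
        qstep += step
    elif feedback == "increase":
        qstep -= step
    return max(q_min, min(q_max, qstep))
```

The same feedback value could equally drive a change of image size, frame rate, or compression scheme, alone or in combination.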
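[0081]
Besides monitoring the state of the receive buffer, when the reproduction capability of the receiving terminal is limited and there are several images to reproduce, either the user must explicitly decide at the receiving terminal which images to reproduce preferentially, or the terminal must decide automatically (in the latter case the user must register in the receiving terminal, in advance and as rules, which images to reproduce preferentially; for example, that smaller images take priority, or that an image displayed as the background may be refreshed at a slow interval). This can be realized simply by, for example, notifying the transmitting terminal of the load on the receiving terminal (for example, the CPU time needed for reproduction).

[0082]
When the reproduction load of the receiving terminal exceeds 80% of the terminal's processing capability, the transmitting side is notified that the receiving terminal is overloaded. In response, the transmitting side reduces the processing required of the receiving terminal in the same way as above, by selecting or combining as appropriate: changing the compression scheme (for example, switching from MPEG1 to run-length coding to reduce the processing load), suspending compression (temporarily stopping image compression and transmission), changing the compression priority (lowering the compression priority of less important images and compressing and sending more important images first), changing the image size (changing the size to be compressed from CIF to QCIF to reduce the load on the reproducing side), changing the quantization step (changing the image quality to reduce the amount of compressed image data sent), adjusting the number of frames, and processing according to the overload processing priority.

[0083]
Conversely, when the load falls below 20% of the processing capability of the receiving terminal, the receiving terminal is regarded as having spare capacity, and the transmitting terminal applies the reverse of the measures above, selecting and combining as appropriate the compression scheme, whether to compress, the compression priority, the image size, the quantization step, and the number of frames, so that high-quality images with a short frame interval are sent to the receiving terminal. This enables image transmission that makes full use of the capability of the receiving terminal.

[0084]
Finally, the processing status of the receiving terminal can be learned from the acknowledgment response time of the image compositing apparatus on the receiving side. For example, suppose that when the transmitting terminal sends image data, the receiving terminal reports back that it has received the data or has completed decoding, compositing, or display. If the response time is normally within 1 second, an increased load on the receiving terminal lengthens it to, say, 5 seconds (the normal value may be measured once when the terminals connect, measured periodically during communication, or specified by the user; the response time may be measured at regular intervals, or the measurement interval may be varied according to the terminal load or the previous response time). By selecting and combining as appropriate the compression scheme, whether to compress, the compression priority, the image size, and the quantization step in response to this change in response time, the load on the receiving terminal can be reduced and the response time shortened (see Case 1 in Fig. 16). The same processing may be performed using the reproduction time or decoding time received from the receiving terminal.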
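The response-time check of paragraph [0084] can be sketched as a comparison against a baseline, together with one possible way of varying the measurement interval. The names `is_overloaded` and `next_interval`, the factor of 2, and the interval bounds are assumptions chosen for illustration; the source only says that the response time grows under load and that the interval may depend on earlier results.

```python
def is_overloaded(response_time, baseline, factor=2.0):
    """Treat the receiver as overloaded when the acknowledgment takes much
    longer than the normal (baseline) response time."""
    return response_time > baseline * factor

def next_interval(interval, overloaded, shortest=1.0, longest=30.0):
    """Probe more often while overloaded, less often while healthy."""
    interval = interval / 2 if overloaded else interval * 2
    return max(shortest, min(longest, interval))
```

With a baseline of 1 second, a measured 5 seconds is flagged as overload, which would trigger the output-reducing measures described above.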
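[0085]
The methods described above for taking the state of the receiving terminal into account, namely measuring the receive buffer occupancy, the load, and the response time of the receiving terminal, need not be used individually; they may be selected and combined as appropriate (the same methods apply to audio). In addition, by sending to the transmitting party over the communication channel information on the images and audio that the receiving terminal processed according to the priority information (when there are several image and audio streams, which streams the receiving terminal actually processed, and at how many frames per second the image stream was reproduced), the amount of image data sent from the transmitting side can be kept from exceeding what the receiving terminal can process (see Case 2 in Fig. 16; knowing which image data was actually processed makes it possible to adjust the amount of information at the transmitting side through the quantization parameter, the image size, and so on; in this example the feedback is returned per frame, but as noted above it may be returned per independently handled image unit, such as a GOB in H.263). The same methods apply to audio.

[0086]
Fig. 13 illustrates an image compression apparatus according to one embodiment of the present invention. This embodiment is described for images, but it also applies to audio compression. In the example of Fig. 13, the quantization step is varied for each image input means 1207, and when control of the image input means 1207 changes the reception status at the receiving terminal, the quantization step is changed to follow, so as to curb the growth in the amount of compressed image data generated. The image compression apparatus of Fig. 13 consists of: a quantization step management unit 1201 that manages information on the quantization step; an image input management unit 1202 that manages the control state of the image input means 1207; an other-terminal control request management unit 1203 that monitors the state of the receive buffer of the receiving terminal apparatus; an operation management unit 1204 that records and manages how control changes over time; an image compression unit 1205 that compresses images; an output unit 1206 that outputs the compression result to a communication channel or storage device; the image input means 1207 that inputs images; and image processing determination control means 1208 that manages and controls these units.

[0087]
The image compression method may be a standardized scheme such as JPEG, MPEG1/2, H.261, or H.263, or a non-standardized scheme such as wavelet or fractal compression. The image input means 1207 may be a camera or a recording device such as a video recorder or optical disc.

[0088]
As an example of using this image compression apparatus, when the image input means 1207 is a camera and the camera at the transmitting terminal is operated from the receiving terminal or at the transmitting side, the image quality changes greatly and the amount of code sent fluctuates. For example, raising the contrast of the camera makes the image easier to see but increases the amount of code to be sent. To offset the increase that accompanies the higher contrast, the amount of code can be held down by selecting and combining as appropriate, as described above, the compression scheme, whether to compress, the compression priority, the image size, the quantization step, and the number of frames.

[0089]
The camera operations referred to here include the direction in which the camera is moved (pan, tilt, zoom), contrast, focus, and camera position (for example, the camera points downward when capturing a drawing and is held level when capturing a person). One way of changing the compression scheme is as follows: when the camera points downward, it is assumed to be capturing a document image and the image is transmitted with run-length coding; when the camera is level, it is assumed to be capturing a person's face and the image is captured and transmitted with H.261. This reduces the transmission of unnecessary information.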
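The camera-position rule of paragraph [0089] can be expressed as a simple selection function. The function name `choose_codec`, the use of a tilt angle in degrees (negative meaning downward), and the threshold of -45 degrees are assumptions for illustration; the source only distinguishes a downward-pointing camera from a level one.

```python
def choose_codec(tilt_degrees, document_threshold=-45.0):
    """Pick a compression scheme from the camera tilt.

    A camera pointing steeply downward is assumed to be capturing a document,
    so run-length coding is chosen; otherwise a person is assumed and H.261
    is chosen.
    """
    if tilt_degrees <= document_threshold:
        return "run-length"
    return "H.261"
```

In the apparatus of Fig. 13, a decision of this kind would be driven by the control state held for the image input means.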
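[0090]
When there are several cameras whose video must be transmitted and the communication capacity is limited, one approach is to raise the image quality and frame count of the camera the user is attending to, making it easier to see, and to lower the image quality and frame count of the cameras the user is not attending to. Because raising the image quality or frame count of the attended camera increases the amount of information, the video from the unattended cameras must be restricted accordingly to adjust the amount of information generated. The amount of information generated can be adjusted through the image size, the quantization step value, the number of frames, and so on. An example of creating a wide-field image with several cameras is described later with reference to Fig. 15.

[0091]
Fig. 14 shows an example of the information managed by the operation management unit 1204. In the example of Fig. 14, the image size, camera control, control requests from other terminals, the quantization step, and the number of frames (not shown) are managed. By recording and managing the relationship between the quantization step and camera operations as history information, based on this management information, so that the receive buffer of the receiving terminal does not overflow, restrictions on camera operation can be imposed on the user. In addition, automatically changing the quantization step, image size, number of frames, and so on prevents the overflow or underflow of the receive buffer of the receiving terminal that camera operation would otherwise cause.

[0092]
Fig. 15 shows an example in which the above image compression apparatus is applied to creating a wide-field image. In the example of Fig. 15, images from several cameras are acquired by the input unit 1407. When the acquired images are joined (composited) seamlessly at the receiving terminal 1408, the terminal breaks down if the receiving terminal 1408 becomes overloaded. To prevent this, a priority that defines the order in which images are to be processed when the receiving terminal 1408 is overloaded is added to the images. This prevents the receiving terminal 1408 from becoming overloaded.

[0093]
The image compression apparatus shown in Fig. 15 consists of: an input unit 1407 with a plurality of cameras (N cameras); a priority determination control unit 1401 that adds a priority to each image obtained by the input unit 1407; an operation history management unit 1402 that manages the history of the cameras the user has designated and operated (in particular, those the user wanted to look at closely); an image quality control unit 1403 that controls the image quality; an image compositing unit 1404 that composites the images from the cameras according to their priority (low-priority images need not be composited); an output unit 1405 that outputs the compositing result; and a compression control unit 1406 that controls these units. The output unit 1405 is connected to the receiving terminal 1408 through a communication channel.

[0094]
The output of the output unit 1405 may go to a recording device or to a communication channel. The images need not be composited at the transmitting terminal. Images with priorities added may be sent over the communication channel to the receiving terminal and composited there. When the acquired images are composited at the transmitting terminal and reproduced at the receiving terminal, the transmitting side composites them in descending order of the (display) priority required at the receiving terminal and sends the composited image to the receiving terminal apparatus over the transmission path.

[0095]
Priorities can be added so that the image from the camera the user has designated, followed by the images from the cameras that were designated most often in the past, receive in that order higher priority and higher image quality (for example, more frames and higher resolution); a high-priority image need not necessarily be of high quality. In this way the images the user attends to most are displayed preferentially and at high quality. Controlling image transmission from the transmitting terminal, and image decompression and display at the receiving terminal, according to the priority added to the images ensures the responsiveness of the terminal for the user.

[0096]
Starting from the image with the highest priority, highest image quality, or most frames, the priority and image quality of the adjacent joined images are lowered step by step (the priority may be managed at the transmitting terminal or at the receiving terminal). The priority need not be determined from the camera operation history. As described above, it may be determined from the local decoding time taken during compression, or an execution rate that specifies the number of times processing is executed may be defined for the surrounding images, in order from the image with the highest priority, highest image quality, or most frames. For audio as well, providing a microphone for each of the cameras and controlling whether the audio is compressed makes it possible to composite only the audio that corresponds to the image in the direction the user is attending to.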
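The stepwise lowering of priority across adjacent joined images in paragraph [0096] can be sketched by letting the priority value grow with the distance from the attended camera. The function name `camera_priorities`, the linear step of one level per camera, and the convention that 1 is the highest priority are assumptions for illustration.

```python
def camera_priorities(camera_count, focused, top=1):
    """Return one priority per camera in a row of adjacent cameras.

    The focused camera gets the top priority (smallest number); each step
    away from it lowers the priority by one level.
    """
    return [top + abs(index - focused) for index in range(camera_count)]
```

For five cameras with the user attending to the middle one, `camera_priorities(5, 2)` gives the center image the highest priority and the two outermost images the lowest, so the outer images are the first to be dropped under overload.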
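[0097]
As described above, the quantization step and the number of frames may also be decided with reference to the response time between the transmitting and receiving terminals. By sending to the transmitting party over the communication channel information on the images that the receiving terminal processed according to the priority information while overloaded, the amount of image data sent from the transmitting side can be kept from exceeding what the receiving terminal can process. Sending the frame-skip status of the receiving terminal to the transmitting side also allows the amount of data to be adjusted to that status.

[0098]
Furthermore, images are transmitted by a method that performs retransmission and audio by a method that does not, and the receiving terminal sends the transmitting terminal any of the image retransmission count, the error rate of the received audio, and the discard rate. The transmitting terminal then decides any of the image compression scheme, the quantization step value, the number of frames, the size of the image to be compressed, and whether to compress images, which makes it possible to keep the audio transmission delay small without disturbing the images. For example, in communication over TCP/IP this can be realized by transmitting images over TCP and audio over UDP (the video and audio may or may not share the same physical transmission path). The communication method is not limited to TCP/IP. When several videos and sounds are transmitted at once, a discard rate and an error rate may be defined for each sound and used to control the compression and transmission methods of the videos.

[0099]
Finally, large block noise and moiré usually appear in low-bit-rate image transmission over analog telephone lines and when the image content changes greatly. In such cases it is difficult to maintain image quality through compression processing alone. If a filter that passes only low-frequency signals (for example, a low-pass filter implemented by image processing, or a physical polarizing filter) is applied at the monitor on the image output side, the image looks somewhat blurred, but the noise and moiré are no longer noticeable.

[0100]
[Effects of the Invention]
As is clear from the above, the present invention has the advantage that, when a plurality of videos and sounds are decoded and composited simultaneously, the amount of processing can be controlled by priority according to the load on the terminal.

[0101]
The present invention also has the advantage that a plurality of videos and sounds can be composited according to the billing status.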
The present invention relates to an encoding device. 2. Description of the Related Art Heretofore, it has been difficult to determine whether or not an image of
Extract a person image, for example, and
The sent person image and the pre-stored partner
Superimposes the image of the virtual space to be displayed in common
To show that the other party is in front of you.
One that satisfies the presence and aims for realistic video communication
(Japanese Patent Publication No. 4-24914, "Hypermedia
ASystem Personal Communication System "
(Fukuda, K., Tahara, T., Miyoshi, T.: "Hypermedia
Personal Computer Communication System: Fujitsu H
abitat ", FUJITSU Sci. Tech. J., 26, 3, pp.197-206
(October 1990).), Nakamura: "Network-enabled virtual reality
Distributed Collaborative Work Support by Reality ", Information Processing Society of Japan
Research Group on Ovisual Composite Information Processing (1993)). Special
In addition, in the conventional technology, speedup for performing image synthesis,
Inventions have been made on methods for reducing
For example, Japanese Patent Publication No. Hei 5-46592: Image Compositing Apparatus,
105226: image synthesizing device). [0003] However, the conventional
In technology, 2D still images and 3D CG data are synthesized
Was proposed, but multiple video
And audio are simultaneously decoded (decompressed), synthesized and displayed.
It did not describe how to implement the system.
In particular, it can decode, synthesize, and display multiple video and audio simultaneously.
Terminal equipment lacks capacity or changes in processing capacity.
Video and audio playback methods that do not break down
Was not stated. In addition, depending on the billing situation
It describes how to decode, combine and display multiple videos.
Had not been. Specifically, (1) information of a plurality of images and sounds, and
Those who manage the information describing the relationship and the information of the processing result
Law. (2) Multiple images when the processing state of the terminal is overloaded
For determining the priority of decoding, synthesis, and display of
And billing methods. [0005] Further, a plurality of videos and sounds are simultaneously decoded and combined.
In the environment that can be created and displayed, the state of the receiving terminal and the receiving terminal
Image compression method according to decoding, synthesis, and display priority
And consider how to control the amount of coding.
Not. SUMMARY OF THE INVENTION The present invention provides a conventional
To simultaneously decode and combine multiple video and audio
When performing coding, the amount of coding is controlled according to the processing status of the terminal.
Control of multiple video and audio depending on the billing status.
Video and audio decoding device that can control the
Providing an image / speech coding apparatus and an information transmission system
The purpose is. The present invention is limited to only two-dimensional image synthesis.
Not. Table combining two-dimensional and three-dimensional images
The current format may be used, or a wide-field image (panoramic image)
Combining multiple images adjacent to each other
Methods may also be included. The communication form targeted by the present invention is a wired form.
Not only bidirectional CATV and B-ISDN. example
For example, video and audio transmission from the center terminal to the home terminal
Transmission is by radio wave (for example, VHF band, UHF band), satellite broadcasting
Information transmission from the home terminal to the center terminal is
It may be a telephone line or N-ISDN for logs (video,
Voice and data need not necessarily be multiplexed.
No). In addition, IrDA, PHS (Personal Handy
Communication type using wireless such as -phone) and wireless LAN
It may be in a state. The target terminal is a portable information terminal.
Even a portable terminal like this, a set-top box,
A desktop terminal like a personal computer
Is also good. Specifically, the present invention according to claim 1
Akira has one or more image input means for manually inputting images, and
Image input management means for managing the control state of the image input means
And other terminal control request management to manage the receiving status of the receiving terminal
Means and at least the reception status of the managed receiving terminal
Or, according to the control state of the image input means,
Encoding processing determining means for determining an encoding method;
Encoding the input image in accordance with a result determined by the logic determining means
Image encoding means for outputting the encoded image.
This application discloses the technology of the following items 1 to 19.
1. An image decoding and encoding device comprising: an image encoding device having image encoding means for encoding image information and transmission management means for transmitting or recording the encoded information; and an image decoding device having reception management means for receiving the encoded information, decoding means for decoding the received information, image synthesizing means for synthesizing one or more of the decoded images, and output means for outputting the synthesized image.
2. An audio decoding and encoding device comprising: an audio encoding device having audio encoding means for encoding audio information and transmission management means for transmitting or recording the encoded information; and an audio decoding device having reception management means for receiving the encoded information, decoding means for decoding the received information, audio synthesizing means for synthesizing one or more of the decoded audio signals, and output means for outputting the synthesized audio.
3. A video and audio encoding and decoding device comprising the image decoding and encoding device of item 1 and the audio decoding and encoding device of item 2, wherein the image encoding device and/or the audio encoding device has priority adding means for determining, by a predetermined criterion, the priority with which the encoded information is processed under overload and for associating the determined priority with the encoded information, and the image decoding device and/or the audio decoding device has priority determining means for determining how to process the received information according to the overload priority.
[0015] 4. The video and audio encoding and decoding device of item 3, wherein the priority adding means and the priority determining means determine the priority to be added to the encoded information, or the priority to be processed at decoding time, by deciding how to treat the order and presence or absence of the images and audio to be decoded, synthesized, and displayed, based on at least one of: the image coding method, image size, contrast, image composition ratio, quantization step, frame number, number of frames, the difference between inter-frame and intra-frame coding, display position, display time, and the difference between voiced and silent sections.
5. The video and audio encoding and decoding device of item 3, wherein the priority adding means and the priority determining means determine the priority to be added to the encoded information, or the priority to be processed at decoding time, based on the time taken for decoding when the image was encoded or on the time taken for encoding.
6. The video and audio encoding and decoding device of item 3, wherein the priority adding means and the priority determining means define an execution rate that prescribes how often the decoding, synthesis, and display of images are carried out, and, based on that execution rate, determine the priority to be added to the encoded information or the priority to be processed at decoding time.
7. The video and audio encoding and decoding device of item 4, wherein at least intra-frame coded frames, or the first frame, the last frame, or scene-change frames, are given a high priority for processing under overload.
8. The video and audio encoding and decoding device of item 4, wherein inter-frame coded images are assigned the same priority.
9. The video and audio encoding and decoding device of item 4, wherein multiple levels of priority are assigned to intra-frame coded images.
10. The image decoding and encoding device of item 1, wherein the image decoding means performs image decoding in units smaller than one frame.
11. A video and audio encoding and decoding device comprising the image decoding and encoding device of item 1 and the audio decoding and encoding device of item 2, wherein the order, presence or absence, and playback method of the images and audio to be decoded, synthesized, and displayed are determined based on at least one of: information on billing, information indicating the service content, a password, a user code, a country code, information indicating the order of synthesis and display, information indicating the order of decoding, the user's instructions, the processing capability of the terminal, and the playback time.
12. A video and audio encoding and decoding device comprising the image decoding and encoding device of item 1 and the audio decoding and encoding device of item 2, wherein the reception management means handles, among the various information, the information describing the relationship between image information and audio information independently of the image information and the audio information.
13. The video and audio encoding and decoding device of item 12, wherein an identifier is provided for identifying the description method used to describe the relationship between the image information and the audio information, and the description follows the method identified by the identifier.
14. A video and audio encoding and decoding device comprising the image decoding and encoding device of item 1 and the audio decoding and encoding device of item 2, wherein the image synthesizing means or the audio synthesizing means holds, manages, and uses the decoding result until the transmitting side instructs it to discard the decoding result.
15. A video and audio encoding and decoding device comprising the image decoding and encoding device of item 1 and the audio decoding and encoding device of item 2, wherein, when images and audio are synthesized based on the information describing the relationship between image information and audio information and synthesis is impossible because a required decoded image or audio is not available, the existence of the images and audio that cannot be synthesized is presented to the user.
16. A real-time image encoding device comprising: one or more image input means; image input management means for managing the control state of the image input means; other-terminal control request management means for managing the reception status of a receiving terminal; encoding processing determining means for determining an image encoding method according to at least the managed reception status of the receiving terminal or the control state of the image input means; image encoding means for encoding the input image according to the determination result of the encoding processing determining means; and output means for outputting the encoded image.
17. The real-time image encoding device of item 16, wherein the encoding processing determining means determines, according to at least the control state of the image input means, any of: the encoding priority, the priority information for overload processing, the encoding method, the quantization step value, the number of frames, the size of the image to be encoded, and whether or not to encode.
18. An information transmission system in which at least one of the image decoding and encoding device of item 1 and the audio decoding and encoding device of item 2 serves as a receiving terminal, at least one of the image decoding and encoding device of item 1, the audio decoding and encoding device of item 2, and the real-time image encoding device of item 16 serves as a transmitting terminal, and the terminals are connected by a communication path, wherein at least one of the load of the receiving terminal, information on the priority of the encoded information to be processed as determined by the priority determining means of the receiving terminal, and the frame-skip status of the receiving terminal is transmitted to the transmitting terminal, and the transmitting terminal thereby determines any of: whether to encode images or audio, the encoding priority, the encoding method, the size of the image to be encoded, the quantization step value, the number of frames, and the priority of processing when the receiving terminal is overloaded.
19. An information transmission system in which the image decoding and encoding device of item 1 and the audio decoding and encoding device of item 2 serve as a receiving terminal, the image decoding and encoding device of item 1, the audio decoding and encoding device of item 2, and the real-time image encoding device of item 16 serve as a transmitting terminal, and the terminals are connected by a communication path, wherein images are transmitted by a method that performs retransmission and audio is transmitted by a method that does not perform retransmission, and at least one of the number of image retransmissions, the error rate of the received information, and the discard rate is transmitted to the transmitting terminal, whereby the encoding processing determining means determines any of: the encoding method, the quantization step value, the number of frames, the size of the image to be encoded, whether or not to encode, and the priority of processing when the receiving terminal is overloaded.
DESCRIPTION OF THE PREFERRED EMBODIMENTS The present invention will be described below with reference to the drawings showing its embodiments.
"Image" as used in the present invention includes both still images and moving images. The target images may be a mixture of two-dimensional images, such as computer graphics (CG), and three-dimensional image data composed of, for example, wireframe models. In that case, the relationship between the images corresponds to the wireframe model. Script languages for describing it include Java (registered trademark) and VRML.
FIGS. 1 and 2 are schematic configuration diagrams of the image decoding and encoding device in an embodiment of the present invention.
FIG. 1 shows the configuration without an audio playback function, and FIG. 2 shows the configuration with functions for reproducing both images and audio. Needless to say, the same configuration can be applied to the audio-only case.
The device shown in FIG. 1 or FIG. 2 consists of an encoding device and a decoding device. The encoding device of FIG. 1 comprises: a priority adding unit 101 that determines, by a predetermined criterion, the priority with which an encoded image is processed under overload and associates the encoded image with that priority; an image encoding unit 102 that encodes images; a transmission management unit 103 that transmits or records the encoded information with the priority added; and a reception management unit 104 that receives encoded information.
The encoding device of FIG. 2 additionally has an audio encoding unit 105 that encodes audio.
In the decoding device, the reception management unit 11, which receives information, and the transmission management unit 13, which transmits information, are means for transferring information over coaxial cable, CATV, LAN, modem, and the like. Terminal connection forms include bidirectional exchange of video information between terminals, as in videophones and video conference systems, and broadcast-type (one-way) video distribution by satellite broadcasting, CATV, or the Internet. The present invention takes these connection forms into account.
The separation unit 12 is a means for analyzing and separating the encoded (compressed) received information (in the encoding device, the inverse operation makes it a multiplexing unit). Examples include the multiplexing schemes of MPEG1 and MPEG2; H.221, used by H.320 terminals (the standard for videophone and conference equipment using N-ISDN); and H.223, used by H.324 terminals (the standard for videophone and conference equipment using analog telephone lines). These are protocols for multiplexing and demultiplexing video, audio, and data. The present invention may be realized with a configuration that conforms to these standards.
A configuration that does not conform to these standards is equally possible. As with H.323 or the Internet, video and audio may also be transmitted independently in separate streams.
The priority determining unit 14 determines, by the methods below, the priority of decoding (hereinafter "decompression") for the separated information (for example, video, audio, and management information) when the terminal is overloaded, and the images and audio are decompressed accordingly. (The method of determining the processing priority may be decided in advance at the receiving terminal, or the transmitting terminal (encoding device) may add priority information determined by the methods below to the recording medium or the transmitted packets as part of the transmission or recording format. Priority may be expressed non-numerically, such as "high", "medium", and "low", or numerically, such as 1, 2, 3.)
By exchanging data between the transmitting and receiving sides using identifiers that handle data in units of streams composed of multiple image or audio frames, the buffer on the receiving side can be managed and data transmission on the transmitting side can be scheduled. That is, when necessary, the transmitting side can notify the identifier of a stream it sends in order to check the reception status of the receiver, stream identifiers can be notified to the receiving terminal, and the receiving side can request the streams it needs.
The priority adding means, provided in the image encoding device or the audio encoding device, determines the overload processing priority of the encoded information by the criteria described above and associates it with the encoded information. The priority determining means, provided in the image decoding device or the audio decoding device, determines how to process the received information according to the overload priority; the image frames and audio of the priority to be processed are determined, and decoding and synthesis are performed. For image frames, intra-frame coded frames (I frames) must be inserted periodically so that frames can be skipped.
Priority can be added per video or audio frame (comparing priorities between frames) or per stream composed of multiple frames (comparing priorities between streams).
Methods that focus on the characteristics of the image can be based on: the image compression format (for example, H.263 versus run-length coding); the image size (for example, giving QCIF priority over CIF); contrast (for example, giving priority to bright contrast); the image composition ratio; the quantization step (for example, giving priority to small quantization step values); the difference between inter-frame and intra-frame coding (for example, giving priority to intra-frame coding); the display position (for a three-dimensional image, for example, lowering the priority when the image is placed at the back and raising it when it is displayed at the front); the frame number (for example, raising the priority of the first and last frames, or of scene-change frames); the number of frames (for example, raising the priority of images with few frames to be played; in H.263 the frame number corresponds to the temporal reference (TR) and can be judged from changes in the TR value); voiced and silent sections; and the display time (PTS) and decoding time (DTS). In addition, inter-frame coded P frames and B frames are assigned the same priority. Assigning multiple levels of priority to intra-frame coded images makes it possible to control how often frames are skipped.
As an example that focuses on the difference between media, audio decompression can be given priority over image decompression. This allows audio to be played without interruption.
Furthermore, the information (images, audio) to be decompressed may be determined based on playback permission information managed at the receiving terminal, or selected based on playback permission information sent from the transmitting side as control information. The playback permission information concerns, first of all, billing.
The playback permission information includes: billing information (for example, if no payment has been made, decompression, synthesis, and display are not performed; the billing information may be managed at the receiving terminal or on the transmitting side); information indicating the service content (for example, if playback of a broadcast has not been permitted at the terminal, decompression, synthesis, and display are not performed; the playback permission may be managed at the receiving terminal or at the transmitting terminal); a password (for example, for certain programs, decompression, synthesis, and display are not performed unless the password is entered; the password may be managed at the receiving terminal or at the transmitting terminal); a user code (for example, decompression, synthesis, and display are not performed unless the user is a permitted user; the user code may be managed at the receiving terminal or at the transmitting terminal); and a country code (for example, the images and audio to be decompressed, synthesized, and displayed, and the playback method, are changed by country; the country code may be managed on the transmitting side or on the receiving side). Scrambling can be realized by changing the playback method according to the country code.
Playback methods that restrict the playback of images and audio according to billing information, service content information, passwords, user codes, and the like include: deliberately shifting positions or pixels when synthesizing or displaying images, scaling images, changing the image sampling, inverting pixels, changing the contrast, changing the color palette, and skipping frames. These restrictions on image playback (decompression, synthesis, and display) may be applied in units of one frame. Alternatively, the decompression, synthesis, and display methods may be restricted in units of the GOB (Group Of Blocks), the independently processable unit smaller than one frame defined in H.263, one of the image compression standards. This permits more flexible control than disturbing the entire screen. Processing in GOB units makes it possible to scramble only part of the screen.
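As a rough illustration of restricting playback in GOB units rather than over the whole frame, the following Python sketch inverts the pixels of selected GOBs only. It assumes a greyscale frame stored as a list of rows of 8-bit values and treats each GOB as a band of 16 pixel rows; the names `scramble_gobs` and `GOB_ROWS` are hypothetical and are not taken from the source.

```python
GOB_ROWS = 16  # assumed height of one GOB in pixel rows


def scramble_gobs(frame, restricted_gobs):
    """Return a copy of `frame` with the pixels of the listed GOBs inverted.

    frame: list of rows, each a list of 8-bit greyscale values.
    restricted_gobs: set of GOB indices (0 = top band) to scramble.
    """
    out = []
    for y, row in enumerate(frame):
        if y // GOB_ROWS in restricted_gobs:
            out.append([255 - p for p in row])  # pixel inversion
        else:
            out.append(list(row))               # left untouched
    return out


# A 48-row frame has three GOBs; only the middle one is scrambled.
frame = [[10] * 4 for _ in range(48)]
result = scramble_gobs(frame, {1})
```

Pixel inversion is only one of the restrictions listed above; a palette change or a position shift could be substituted in the same loop.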
Partial scrambling of this kind makes it possible to evaluate interactive software.
Similarly, audio playback can be restricted by changing the direction of the sound, changing the frequency of the sound, changing the sampling of the sound, or inserting images or sounds (each of these can be processed in advance on the transmitting side or processed on the receiving side). Image and audio playback can also be restricted by putting the images and audio out of synchronization.
Other bases for the decision include: information indicating the order of synthesis and display (the display order may be decided in advance at the receiving terminal, for example by giving priority to CIF and still images, or the transmitting side may add the display order to the transmitted information as priority-related information); information indicating the order of decompression (the decompression order may be decided in advance at the receiving terminal, for example by giving priority to QCIF or intra-frame coded image data, or by decompressing conversation audio in preference to BGM; similarly, the transmitting side may add the decompression order to the transmitted information); user instructions (for example, the user selects the image and audio information to be decompressed, synthesized, and displayed, or that information is determined from what the user selects on request); the processing capability of the terminal (for example, by measuring the CPU time occupied over a certain period, now or in the past, the decompression, synthesis, and display of images and audio that are likely to take too long are suppressed; as a way to estimate processing time, the time taken for local decoding and the time taken for compression are managed together with the compressed image information at compression time, so that whether to decompress, synthesize, and display, and with what priority, can be decided); and the playback time (for example, decompression, synthesis, and display of image and audio information whose playback time has passed are stopped) or the decoding time. Based on these, the priority and presence or absence of the images and audio to be decompressed may be determined.
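The idea of combining a priority with an estimated processing time can be sketched as follows. This is a minimal illustration, not the method of the source: it assumes each unit (frame or stream) carries a numeric priority where larger means more important, plus an estimated decode time such as one measured during local decoding at the encoder. The names `Unit`, `select_units`, and `budget_ms` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    name: str
    priority: int          # larger means more important (assumed convention)
    est_decode_ms: float   # estimated decode time, e.g. measured at the encoder


def select_units(units, budget_ms):
    """Pick units to decode, highest priority first, within the time budget."""
    chosen, used = [], 0.0
    for u in sorted(units, key=lambda u: u.priority, reverse=True):
        if used + u.est_decode_ms <= budget_ms:
            chosen.append(u.name)
            used += u.est_decode_ms
    return chosen


units = [
    Unit("video_P", 1, 15.0),
    Unit("audio", 3, 2.0),
    Unit("video_I", 2, 20.0),
]
selected = select_units(units, budget_ms=30.0)
```

With a 30 ms budget the audio and the intra-coded frame fit, and the inter-coded frame is skipped. The greedy loop keeps trying lower-priority units after one fails to fit, which is a design choice of this sketch.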
To prevent only particular images and audio from being decompressed and displayed preferentially, the order and presence or absence of the images to be decompressed, synthesized, and displayed can be determined based on information on the execution rate of decompression, synthesis, and display of images and audio. For example, the receiving terminal may be set so that a CIF-size image is decompressed once in every ten decompressions, or the transmitting side may prescribe the execution rate of decompression, synthesis, and display and transmit the image and audio information accordingly. Specifically, the execution rate can be defined by the insertion interval of I frames (intra-frame coded frames). As a result, it no longer happens that only particular image and audio objects are decompressed, synthesized, and displayed.
Priority information for controlling decompression, synthesis, and display may be added and controlled not only by the transmitting device but also by a relaying device. In addition, by transmitting information on the priority determined by the priority determining unit 14 of the decoding device at the receiving terminal to the destination via the transmission management unit 13, images and audio can be transmitted in accordance with the decisions of the priority determining unit 14 (by sending the IDs of image objects that are rarely selected to the transmitting side, wasteful transmission is avoided). Information indicating the priority of processing when the receiving terminal is overloaded may be transmitted as part of the transmission format. As a transport format for recording on media such as CD-ROMs and hard disks, the MPEG2 transport stream may be extended, or a transmission and recording format that does not take standardization into account may be used. Each medium (video, audio, and the like) may also be transmitted and recorded as a separate stream without multiplexing.
The image decompression unit 18, as image decoding means, performs image decompression (in the encoding device it is the encoding means). The image formats handled by the image decompression unit 18 include MPEG1, MPEG2, H.261, and H.263. Image decompression may be performed one frame at a time or in the GOB units specified in H.263. When processing one frame at a time with inter-frame coding, the decompression state of the previous frame must be stored in the image decompression unit 18. When images are decompressed in GOB units, the order of decompression no longer matters. This has consequences when decompression is performed in GOB units.
With GOB-unit decompression, the receiving device does not necessarily need multiple image decompression units 18; a single image decompression unit 18 can decompress multiple videos. On the other hand, the decompression results must be stored.
The audio decompression unit 20 in FIG. 2, as audio decoding means, decompresses audio. The audio formats it handles include G.721 and G.723. Processing may be done in software on a DSP or general-purpose CPU, or by dedicated hardware.
When realized in software, the image and audio decompression processes are each managed in units of one process or thread; when there are multiple images or audio streams to decompress at the same time, they are processed by time-sharing across as many processes or threads as there are streams.
The image decompression management unit 15 is a means for managing the state of image decompression.
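One way to picture a single image decompression unit serving several videos in GOB units, as described above, is a round-robin loop that decodes one GOB from each video in turn and stores the results per video. The Python sketch below is illustrative only: `decode_gob` is a stand-in for a real GOB decoder, and all names are hypothetical.

```python
from itertools import zip_longest


def decode_gob(gob):
    """Stand-in for a real GOB decoder (assumption): just tags the payload."""
    return "decoded:" + gob


def decode_interleaved(streams):
    """Decode GOBs from several videos with one decoder, in round-robin order.

    streams: dict mapping a stream id to its list of GOB payloads.
    Returns a dict mapping each stream id to its stored decoded GOBs.
    """
    results = {sid: [] for sid in streams}
    for batch in zip_longest(*streams.values()):
        for sid, gob in zip(streams, batch):
            if gob is not None:  # None marks a stream that has run out of GOBs
                results[sid].append(decode_gob(gob))
    return results


streams = {"video_a": ["a0", "a1"], "video_b": ["b0"]}
decoded = decode_interleaved(streams)
```

The `results` dictionary plays the role of the storage for decompression results that the text says is required.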
The audio decompression management unit 16 is likewise a means for managing the state of audio decompression. When these management units are realized in software, for example, the compressed information obtained from the separation unit 12 is passed to the image decompression unit 18 and the audio decompression unit 20 in a prescribed order (for example, processing first by the audio decompression unit 20 and then by the image decompression unit 18), and the decompression state is monitored. When all decompression is complete, the decompressed information is handed to the image synthesizing unit 19 or the audio synthesizing unit 21. In software, shared memory and semaphores are used to limit the information that is passed and to detect that decompression has finished (details are described later).
The time information management unit 17 is a means for managing time information. For example, when the system is implemented on a personal computer, time information can be obtained from the personal computer's timer.
The image synthesizing unit 19 performs image synthesis based on the decompressed image data.
When multiple images are combined, synthesis is based on the composition ratio (α value) of each image. For example, when two images are combined and the composition ratio of the foreground image is α, the RGB values of the background image are mixed at a ratio of 1 − α and those of the foreground image at a ratio of α. Managing the images to be decompressed in units of one frame simplifies the system configuration and implementation when multiple images are combined according to their display times. Also, if the image synthesizing unit 19 or the audio synthesizing unit 21 holds, manages, and uses the decompression results until the transmitting side instructs it to discard them, the transmitting side no longer needs to send the same pattern information repeatedly.
When images and audio are synthesized based on the information describing the relationships between images and between audio, and synthesis is impossible because a required decoded image or audio is not available, presenting the existence of the images and audio that cannot be synthesized lets the user know the state of synthesis. The user can then give instructions, such as selecting the required image quality, so that the necessary information can be synthesized.
As methods for storing and managing decoded image and audio data in a buffer, the data may be deleted in order of arrival, or the script describing the relationships between images and between audio may be examined to check the overall usage of the decoded image and audio data before deciding what to delete.
The audio decompression management unit 16 manages the decompression state of the audio decompression unit 20, which decompresses at least one audio stream. The audio synthesizing unit 21 is a means for synthesizing audio based on the decompressed information. The synthesis result accumulation unit 22 accumulates the images synthesized by the image synthesizing unit 19.
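The composition rule given above for the image synthesizing unit 19 (foreground weighted by α, background by 1 − α) can be sketched in a few lines of Python. The sketch assumes 8-bit RGB tuples, images stored as lists of rows, and rounding to the nearest integer; the function names are hypothetical.

```python
def blend_pixel(fg, bg, alpha):
    """Mix one RGB pixel: foreground weighted by alpha, background by 1 - alpha."""
    return tuple(round(alpha * f + (1.0 - alpha) * b) for f, b in zip(fg, bg))


def blend_image(fg_img, bg_img, alpha):
    """Blend two images of equal size, pixel by pixel."""
    return [
        [blend_pixel(f, b, alpha) for f, b in zip(fg_row, bg_row)]
        for fg_row, bg_row in zip(fg_img, bg_img)
    ]


fg = [[(255, 0, 0), (255, 0, 0)]]   # red foreground, one row of two pixels
bg = [[(0, 0, 255), (0, 0, 255)]]   # blue background
mixed = blend_image(fg, bg, 0.25)
```

With α = 0.25 each output pixel is one quarter foreground and three quarters background; α = 1 returns the foreground unchanged and α = 0 returns the background.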
The synthesis result accumulation unit 22 also stores the audio synthesized by the audio synthesizing unit 21.
The reproduction time management unit 23 is a means for reproducing the synthesized images and audio at the time reproduction should start.
The output unit 24 is a means for outputting the synthesis result (for example, a display or a printer), and the input unit 25 is a means for inputting information (for example, a keyboard, mouse, camera, or video). The terminal control unit 26 is a means for managing these units.
FIG. 3 is a diagram explaining examples in which priority information is added to communication and recording formats.
In the example of FIG. 3(a), all media (video, audio, and control information) are multiplexed. The control information indicates the priority used to decide overload processing (the priority referred to in the present invention) and the priority indicating the display order. The control information may also describe the relationships (temporal and positional) between images, between audio, and between images and audio. The example of FIG. 3(a) is suited to applications such as MPEG1/2 multiplexing and packet multiplexing in which control information and data (video, audio) are mixed, as in H.223. The overload processing priority is added per frame or per stream.
In the example of FIG. 3(b), information is multiplexed per medium. In this example, control information, image information, and audio information are sent from separate communication ports. Information on the relationships between images, between audio, and between images and audio may be sent as control information from a communication port separate from those for images and audio. This suits applications in which multiple communication ports can be established simultaneously, as with H.323 or the Internet. Multiplexing is simpler than in FIG. 3(a), so the load on the terminal can be reduced.
Images and audio can be described with description languages such as Java and VRML, but the specification of the script language may not be settled on a single one. Therefore, by providing an identifier that identifies the method used to describe the relationships between images and audio (for example, positional information and temporal information (display period)), multiple description methods can be supported. In MPEG2, for example, the identifier can be provided in the program map table that manages the MPEG2-TS streams, or in the stream that carries the script. The overload processing priority is added together with the information describing the correspondence between images and audio (control information). In MPEG2, if a structure-information stream that associates images with audio is defined and managed so that the related video and audio streams of the MPEG2-TS (transport stream) can be managed with the program map table, the data can be transmitted independently even in MPEG2.
FIG. 4 is a diagram explaining an example of a configuration built in software. When the present invention is realized on an operating system capable of multitasking, the processing described in FIG. 1 and FIG. 2 is divided into software execution module units such as processes and threads, and information is exchanged between the processes and threads through shared memory guarded by semaphores (in the example of FIG. 4, the parts shown by solid lines correspond to semaphores).
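The exchange of information through shared memory under semaphore control can be illustrated with a small ring buffer shared by two threads. This is a minimal sketch using Python's standard `threading` module, not the implementation of the source: the class `RingBuffer`, the thread functions, and the use of `None` as an end-of-input marker are all assumptions made for the example.

```python
import threading


class RingBuffer:
    """Fixed-size buffer shared between a producer and a consumer thread."""

    def __init__(self, size):
        self.slots = [None] * size
        self.head = 0                            # next slot to read
        self.tail = 0                            # next slot to write
        self.free = threading.Semaphore(size)    # counts empty slots
        self.filled = threading.Semaphore(0)     # counts filled slots
        self.lock = threading.Lock()             # exclusive access to the slots

    def put(self, item):
        self.free.acquire()                      # block while the buffer is full
        with self.lock:
            self.slots[self.tail] = item
            self.tail = (self.tail + 1) % len(self.slots)
        self.filled.release()

    def get(self):
        self.filled.acquire()                    # block while the buffer is empty
        with self.lock:
            item = self.slots[self.head]
            self.head = (self.head + 1) % len(self.slots)
        self.free.release()
        return item


buf = RingBuffer(2)
received = []


def demux():
    """Plays the role of a separating thread feeding the buffer."""
    for packet in ["v0", "v1", "v2", "v3", None]:  # None marks end of input
        buf.put(packet)


def decoder():
    """Plays the role of a decoding thread draining the buffer."""
    while (packet := buf.get()) is not None:
        received.append(packet)


threads = [threading.Thread(target=demux), threading.Thread(target=decoder)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The two counting semaphores make the producer wait when the buffer is full and the consumer wait when it is empty, while the lock gives exclusive access to the shared slots.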
Exclusive control of the stored information is performed. The following describes each process and process.
The function of Red will be described. The DEMUX thread 31 is connected to a network or
Information (video, audio, control information) multiplexed from the disc
Information), audio, video and the correspondence between audio and video
Surveillance table describing relationships and information about playback time
(Details will be described later). DEMUX thread
Reference numeral 31 corresponds to the separation unit 12 described above. DEMUX Thread
The information separated in the buffer 31 is stored in the ring buffer 3 for audio.
2. Ring buffer 33 for video, ring buffer for monitoring
To each other. If it is audio information,
The information sent to the ring buffer 32 is
Expanded by a thread 35 (corresponding to the above-described audio expansion unit 20)
Lengthened. If it is video information, the
The transmitted information is decompressed by the decoding process 36.
You. Regarding the monitoring table, the ring buffer
To determine the order for decompressing the video.
Monitoring thread 37 (the terminal control unit 26 described above,
(Corresponding to the image expansion management unit 15 and the audio expansion management unit 16)
Used. Also, the same monitoring table is used for image synthesis.
This is used in the image composition thread 39. Monitoring thread
The surveillance table used at 37 is for all audio and video
When the image has been expanded, the next table is
Read from the file 34. Decoding process 36 (described above)
The image information expanded by the image expansion unit 18)
The data is sent to the video single buffer 38. Sent out
When the image information is complete, the image synthesizing thread 39 (described above)
In the monitoring table)
Image synthesis is performed using the managed image synthesis ratio. Combination
The synthesis result is stored in the synthesis buffer 41 (the synthesis result accumulation unit described above).
22 corresponding to the display monitoring thread 42
Wait until the display time comes and wait for the display
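The exchange of information through shared ring buffers under semaphore control, as described for FIG. 4, can be sketched roughly as follows. This is a minimal illustration in Python, not the implementation of the embodiment: the class name `RingBuffer`, the helper `run_pipeline`, the buffer capacity, and the use of `None` as an end marker are all assumptions introduced here.

```python
import threading


class RingBuffer:
    """Fixed-size ring buffer guarded by semaphores (illustrative names only)."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.capacity = capacity
        self.write_pos = 0
        self.read_pos = 0
        self.free = threading.Semaphore(capacity)  # slots available for writing
        self.filled = threading.Semaphore(0)       # slots available for reading
        self.lock = threading.Lock()               # exclusive access to the pointers

    def put(self, item):
        self.free.acquire()                        # block while the buffer is full
        with self.lock:
            self.slots[self.write_pos] = item
            self.write_pos = (self.write_pos + 1) % self.capacity
        self.filled.release()                      # signal the reader

    def get(self):
        self.filled.acquire()                      # block while the buffer is empty
        with self.lock:
            item = self.slots[self.read_pos]
            self.read_pos = (self.read_pos + 1) % self.capacity
        self.free.release()                        # signal the writer
        return item


def run_pipeline(packets):
    """Pass packets from a writer thread to a reader thread through the buffer."""
    buf = RingBuffer(capacity=4)
    received = []

    def writer():
        for packet in packets:
            buf.put(packet)
        buf.put(None)  # assumed end marker, not part of the original text

    def reader():
        while True:
            packet = buf.get()
            if packet is None:
                break
            received.append(packet)

    threads = [threading.Thread(target=writer), threading.Thread(target=reader)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

Here one thread plays the role of the DEMUX thread and the other the role of a decoding thread. The semaphores make the writer wait when the buffer is full and the reader wait when it is empty.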
The display monitoring thread 42 corresponds to the playback time management unit 23.

FIG. 5 is a diagram explaining the structure of the information used in the configuration of FIG. 4. In the example of FIG. 5, the information received from the disc or the network has a fixed length of 188 bytes (B). The audio information separated by the DEMUX thread 31 consists of a code for packet synchronization, the playback time, a frame length indicating the length of the audio to be played, and the audio data (C). The video information consists of a code for packet synchronization, a frame number identifying the image, a frame length indicating the size of the image information, and the image data (D). In the present invention, processing need not be performed in units of one frame; it may be performed in units of small blocks such as macroblocks.

The monitoring table consists of the display time of the images, the number of images to be displayed (synthesized) in one frame, the ID of each image, the frame number, the priority for decompression and display, an identifier indicating the frame type (I picture, P picture, B picture), the horizontal display position, the vertical display position, and the synthesis ratio (E).
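The structures (C), (D), and (E) can be pictured as simple records. The sketch below is illustrative only: the field names, the Python types, and the convention that a larger `priority` value means higher priority are assumptions, and the byte layout inside the 188-byte packets is not specified in the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AudioPacket:
    """Structure (C): audio information."""
    sync_code: int
    playback_time: int
    frame_length: int   # length of the audio to be played
    data: bytes


@dataclass
class VideoPacket:
    """Structure (D): video information."""
    sync_code: int
    frame_number: int   # identifies the image
    frame_length: int   # size of the image information
    data: bytes


@dataclass
class ImageEntry:
    """Per-image part of the monitoring table."""
    image_id: int
    frame_number: int
    priority: int       # assumed: larger value = higher priority
    frame_type: str     # "I", "P", or "B"
    x: int              # horizontal display position
    y: int              # vertical display position
    ratio: float        # synthesis ratio


@dataclass
class MonitoringTable:
    """Structure (E): one table per displayed frame."""
    display_time: int
    entries: List[ImageEntry] = field(default_factory=list)

    @property
    def image_count(self):
        return len(self.entries)

    def by_priority(self):
        """Entries in descending order of priority."""
        return sorted(self.entries, key=lambda e: e.priority, reverse=True)
```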
Note that the image synthesis ratio and the audio synthesis ratio may be changed in correspondence with each other. For example, when there are two kinds of images and two kinds of audio and the image synthesis ratio is α : 1−α, the corresponding audio synthesis ratio may also be set to α : 1−α. In addition to the relationships among the image information, the relationships among the audio may also be described (for example, direction and type (BGM, conversation sound)).

FIG. 6 is a diagram explaining the operation of the DEMUX thread 31.
Fixed-length data of 188 bytes is read from a file or the network (5-1). The read data is analyzed and set into the structures described above for audio, video, and the monitoring table (5-2). If writing to the ring buffers is possible, the audio, the video, and the monitoring table are each written to their own ring buffer. The image object IDs are associated with the plurality of image expansion means; in this example, objects are assigned in ascending order of object ID to the shared memory of the ring buffers in ascending order of ring buffer number (5-3). The write pointer of the buffer that was written is updated (5-4). After the video and audio information for one monitoring table has been written, the counter of the semaphore that controls the monitoring thread is advanced (5-5).
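Steps (5-1) to (5-5) can be sketched as follows. This is a simplified illustration under stated assumptions: the one-byte type tag at the start of each packet (`A`, `V`, `M`), the helper `make_packet`, and the rule that a monitoring-table packet closes a group of audio and video packets are hypothetical, since the text does not give the packet layout.

```python
import threading
from collections import deque

PACKET_SIZE = 188  # fixed packet length stated in the text


def demux(stream, table_ready):
    """Split a byte stream into audio, video, and monitoring-table queues."""
    buffers = {"A": deque(), "V": deque(), "M": deque()}
    for offset in range(0, len(stream) - PACKET_SIZE + 1, PACKET_SIZE):
        packet = stream[offset:offset + PACKET_SIZE]   # (5-1) read 188 bytes
        kind = chr(packet[0])                          # (5-2) hypothetical type tag
        if kind not in buffers:
            continue                                   # unknown packet: ignore
        buffers[kind].append(packet[1:])               # (5-3)/(5-4) write and advance
        if kind == "M":
            table_ready.release()                      # (5-5) wake the monitoring thread
    return buffers


def make_packet(kind, payload=b""):
    """Build a padded test packet (helper for this sketch only)."""
    return (kind.encode("ascii") + payload).ljust(PACKET_SIZE, b"\x00")
```

A monitoring thread would block on `table_ready.acquire()` and proceed each time one table's worth of data has been written.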
In this way, the monitoring thread is controlled from the DEMUX thread.

FIG. 7 is a diagram explaining the operation of the monitoring thread 37. The monitoring table is read and the read pointer is advanced (6-1). The priority of each object under overload is checked (6-2). The contents of the monitoring table are passed to the image synthesizing thread 39 (6-3). The thread waits for the DEMUX thread to create the data for one monitoring table (6-4). The frame numbers of the images to be displayed are written to the decoding processes in descending order of processing priority (6-5). The current time is compared with the display time, and if decoding would not finish in time, only P and B frames are skipped; I frames are not skipped (6-6). Execution of the corresponding decoding process is permitted (6-7), and the thread waits until the processing is complete (6-8).

FIG. 8 is a diagram explaining the operation of the decoding process 36.
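The skip rule of step (6-6) of the monitoring thread can be illustrated as follows. This is a sketch under assumptions: the function name, the tuple representation of frames, and the estimated decoding time `est_decode_time` are hypothetical, and the text does not say how "in time" is measured.

```python
def select_frames(frames, now, display_time, est_decode_time):
    """Return the frame numbers to decode.

    frames: list of (frame_number, frame_type) with frame_type in {"I", "P", "B"}.
    If decoding is expected to finish by display_time, every frame is decoded;
    otherwise P and B frames are skipped and only I frames are kept.
    """
    in_time = now + est_decode_time <= display_time
    if in_time:
        return [number for number, _ in frames]
    return [number for number, frame_type in frames if frame_type == "I"]
```

I frames are kept because they can be decoded without reference to other frames, whereas skipping one would break the P and B frames that depend on it.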
The process waits for execution permission from the monitoring thread 37 (7-1). It checks the state of the input image: the image serial number and whether the input frame is an image to be skipped (7-2). It waits until the image data to be decoded has accumulated in the ring buffer (7-3). If there is no image data corresponding to the image serial number specified by the monitoring thread, decoding is skipped and the read pointer is advanced (7-4). If the input image is not to be skipped, decoding is executed and the read pointer is advanced (7-5). The decoding result is output (7-6), and the monitoring thread 37 is notified that the processing has finished (7-7).

When image objects of different types are decompressed by the same process (this may be a thread, or a processor in the case of hardware), the decoding process 36 manages the frame numbers of previously decompressed images in association with the images before decompression, which removes the need to create and use many processes at the same time. (At a minimum, only the information about the immediately preceding frame may be kept. When frame images of different types such as I, P, and B exist, the order in which they are managed differs from the order in which they are output, so this kind of management in the decoding process 36 is necessary.)

FIG. 9 is a diagram explaining the operation of the image synthesizing thread 39.
The thread waits for the monitoring table from the monitoring thread 37 (8-1). It checks the priority of the images to be processed (8-2). It waits for the decoded images in descending order of priority (8-3). It synthesizes the images according to their display positions (8-4). It writes the synthesis result to the synthesis buffer 41 (8-5).

The image information to be displayed can be selected by either the image expansion means or the image synthesis means. When an image object ID that should not be displayed is skipped, the image synthesis means must be notified that no decompression result will be output. For audio as well, the audio information to be played can be selected by either the audio expansion means or the audio synthesis means.

FIG. 10 is a diagram explaining the operation of the display monitoring thread 42.
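Steps (8-2) to (8-4) of the image synthesizing thread, combined with the α : 1−α synthesis ratio, can be sketched as below. The representation of images as lists of grayscale rows, the dictionary keys, and the convention that a larger `priority` value is processed first are assumptions made for illustration; the text does not define how overlapping layers are ordered on screen.

```python
def composite(background, layers):
    """Blend layers onto a copy of the background.

    background: list of rows of float pixel values.
    layers: list of dicts with keys 'pixels', 'x', 'y', 'ratio', 'priority'.
    Each layer pixel is mixed as ratio * layer + (1 - ratio) * current value.
    """
    out = [row[:] for row in background]  # leave the input untouched
    height, width = len(out), len(out[0])
    # Process layers in descending order of priority, as in step (8-3).
    for layer in sorted(layers, key=lambda l: l["priority"], reverse=True):
        alpha = layer["ratio"]
        for dy, row in enumerate(layer["pixels"]):
            for dx, value in enumerate(row):
                y, x = layer["y"] + dy, layer["x"] + dx
                if 0 <= y < height and 0 <= x < width:  # clip to the canvas
                    out[y][x] = alpha * value + (1 - alpha) * out[y][x]
    return out
```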
The thread waits for a synthesized image to be written (9-1). If this is the first display, it obtains the time at which display started (9-2) and manages the correspondence with the times at which images should be displayed. If the display time has not been reached, it waits for the remaining time, delaying the display of the synthesized image (9-3).

Referring to FIG. 11, the user interface of the image synthesizing apparatus of the present invention will be described.
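The timing logic of steps (9-2) and (9-3) of the display monitoring thread in FIG. 10 can be illustrated with a small calculation. The function and parameter names are hypothetical, and the sketch assumes that display times in the stream and the local clock use the same unit (seconds).

```python
def display_delay(start_clock, first_display_time, frame_display_time, now):
    """Return how long to wait before showing a synthesized frame.

    start_clock: local clock value when the first frame was displayed (9-2).
    first_display_time: display time carried by the first frame.
    frame_display_time: display time carried by the frame to show now.
    now: current local clock value.
    """
    due = start_clock + (frame_display_time - first_display_time)
    return max(0.0, due - now)  # never negative: late frames are shown at once
```

A display loop would sleep for the returned value before outputting the frame, which corresponds to delaying display in step (9-3).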
In the example of FIG. 11, a foreground image is synthesized onto a background image, and a building located far away is synthesized semi-transparently at a synthesis ratio of 0.5. As shown in FIG. 11, the images used need not be two-dimensional. In the foreground, a helicopter and a balloon, which are three-dimensional images, are synthesized with the background, which is a two-dimensional image. The helicopter and the balloon in the foreground need not always be three-dimensional images. If an object is located far away, it may be expressed in two dimensions, and if it is located nearby, it may be expressed in three dimensions. (Distance can be defined by the size at which the object is displayed two-dimensionally on the screen; for example, an object smaller than 20 dots × 20 dots may be defined as being far away.) The image mapped onto the wireframe model of a three-dimensional image may be a moving image as well as a still image.

As for image quality, by making the quality high in the central part and progressively coarser toward the periphery, the information the user needs can be selected and transmitted with priority (changing the image quality according to the position at which an image is synthesized in this way can be expected to improve responsiveness). In the case of three-dimensional images, the priority of images displayed far away may be set low and the priority of images displayed nearby may be set high. Image quality can be controlled by changing the quantization step.

FIG. 12 is a diagram explaining a method of performing image transmission according to variations in the capability of the receiving terminal.
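For the image quality control described for FIG. 11, one way to make the central part finer and the periphery coarser is to choose a quantization step per block from its distance to the image center. The following is only a sketch of that idea: the linear mapping, the block grid, and the step range 4 to 31 are assumptions, not values from the text.

```python
import math


def quant_step(bx, by, cols, rows, q_min=4, q_max=31):
    """Quantization step for block (bx, by) in a cols x rows block grid.

    Blocks at the center get q_min (finest); blocks at the corners get q_max
    (coarsest); steps in between grow linearly with distance from the center.
    """
    cx, cy = (cols - 1) / 2, (rows - 1) / 2
    max_dist = math.hypot(cx, cy) or 1.0          # avoid division by zero for 1x1
    dist = math.hypot(bx - cx, by - cy) / max_dist
    return round(q_min + dist * (q_max - q_min))
```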
Next, a method of management and control is described that prevents the receiving terminal from becoming overloaded as the number of transmitted images increases. For example, in an MPEG2-based video-on-demand system implemented in hardware, the transmitting terminal and the receiving terminal confirm the performance of the receiving terminal with each other (for example, the image compression method and size, and the communication protocol) before video information is transmitted or received. For the transmitting terminal, the processing capacity of the receiving terminal is therefore almost fixed, and there is no need to monitor the reception status and playback status of the receiving terminal continuously.

When image compression and decompression are implemented in hardware, the number of images that the terminal can compress and decompress is fixed. When they are implemented in software, however, the number of images that the terminal can compress and decompress can change dynamically. In addition, when images are compressed and decompressed in software in a multitasking environment, the processing is strongly affected by the image size, the quantization parameters for image compression, and the target image (intra-frame or inter-frame encoding, and the content being captured), so the image size that the terminal can process (compress or decompress) and the number of images that it can process simultaneously change over time. Accordingly, unless the transmitting terminal takes into account the reception status of the receiving terminal (for example, the capacity of the receive buffer and the priority of video playback), the image compression method (the image compression scheme, whether image compression is performed, the quantization step, and the compression priority), and the determination of priority when the receiving terminal is overloaded, the capability of the receiving terminal will be exceeded and operation will fail.

For example, as shown in the figure, if the capacity used in the receive buffer of the receiving terminal exceeds 80%, the receiving terminal notifies the transmitting side that the receive buffer is about to overflow. The transmitting side then limits the amount of data transmitted by selecting and combining, as appropriate, the following methods: changing the image compression method (for example, to one that reduces the amount of compressed image data sent); switching image compression on or off (suspending image compression and transmission); changing the compression priority (when there are several processes to be compressed, lowering the compression priority to reduce the amount of compressed image data sent); changing the image size (reducing the size to be compressed from CIF to QCIF to reduce the amount of compressed image data sent); changing the quantization step (reducing the amount of compressed image data sent by changing the image quality); adjusting the number of frames (reducing the number of frames processed); and determining the priority to apply when the receiving terminal is overloaded. As a result, overflow of the receive buffer of the receiving terminal is avoided.

Similarly, if the capacity used in the receive buffer of the receiving terminal falls below 20%, the receiving terminal notifies the transmitting terminal that the receive buffer is about to underflow, and the transmitting terminal, in the opposite manner to the above, selects and combines as appropriate the image compression method, whether image compression is performed, the image compression priority, the image size, the quantization step, and the number of frames. By increasing the amount of transmission in this way, underflow of the receive buffer of the receiving terminal can be avoided. The state of the receive buffer is not the only thing that can be monitored.
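The receive-buffer check with the 80% and 20% thresholds can be sketched on the receiving side as follows. The function name and the returned notice strings are hypothetical; how the notice is actually carried to the transmitting terminal is not specified in the text.

```python
def buffer_notice(used_bytes, capacity_bytes, high=0.80, low=0.20):
    """Classify receive-buffer occupancy against the two thresholds.

    Returns "overflow_warning" above the high mark, "underflow_warning"
    below the low mark, and None when no notification is needed.
    """
    fill = used_bytes / capacity_bytes
    if fill > high:
        return "overflow_warning"    # ask the sender to reduce its output
    if fill < low:
        return "underflow_warning"   # ask the sender to increase its output
    return None
```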
When the playback capability of the receiving terminal is limited and there are several images to be played, either the user must explicitly decide which images the receiving terminal should play with priority, or the terminal must decide automatically. (Which images should be played with priority must be registered in the receiving terminal in advance by the user as rules. For example, images of small size, or images displayed as background images, may be played at longer intervals.) This can be realized easily, for example, by notifying the transmitting terminal of the load on the receiving terminal (for example, the CPU time occupied by playback).

If the playback load of the receiving terminal exceeds 80% of the terminal's processing capability, the receiving terminal notifies the transmitting side that it is overloaded. To reduce the load on the receiving terminal, the transmitting side, in the same manner as described above, selects or combines the following methods: changing the image compression method (for example, switching from MPEG1 to run length to reduce the amount of processing); switching image compression on or off (temporarily suspending image compression and transmission); changing the compression priority (lowering the compression priority of images of low importance, and compressing and sending images of high importance with priority); changing the image size (reducing the size to be compressed from CIF to QCIF to reduce the load on the playback side); changing the quantization step (reducing the amount of compressed image data sent by changing the image quality); adjusting the number of frames; and processing according to the priority defined for overload. This reduces the amount of processing at the receiving terminal.

Conversely, if the load falls below 20% of the processing capability of the receiving terminal, the receiving terminal has processing capacity to spare. In that case the transmitting terminal, in the opposite manner to the above, selects and combines the image compression method, whether image compression is performed, the image compression priority, the image size, the quantization step, and the number of frames, and transmits high-quality images with short frame intervals to the receiving terminal. This enables image transmission that makes use of the capability of the receiving terminal.

Finally, a method of knowing the processing status of the receiving terminal is described.
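How a transmitting terminal might react to the reported load, using the 80% and 20% thresholds above, is sketched below. The settings record, the halving and doubling rules, and the limits (quantization step 1 to 31, frame rate 1 to 30) are assumptions chosen for illustration; the text only names which parameters may be changed, not by how much.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EncoderSettings:
    image_size: str = "CIF"   # "CIF" or "QCIF"
    quant_step: int = 8       # larger step = coarser image, less data
    frame_rate: int = 15      # frames per second


def adapt(settings, load, high=0.80, low=0.20):
    """Return new settings for a receiver load given as a fraction of capacity."""
    if load > high:
        # Receiver overloaded: smaller image, coarser quantization, fewer frames.
        return replace(settings,
                       image_size="QCIF",
                       quant_step=min(31, settings.quant_step * 2),
                       frame_rate=max(1, settings.frame_rate // 2))
    if load < low:
        # Receiver has capacity to spare: do the opposite.
        return replace(settings,
                       image_size="CIF",
                       quant_step=max(1, settings.quant_step // 2),
                       frame_rate=min(30, settings.frame_rate * 2))
    return settings
```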
The processing status can be determined from the response time of the reception acknowledgement returned by the image synthesizing apparatus on the receiving side. For example, suppose that when image data is sent from the transmitting terminal to the receiving terminal, the receiving terminal replies to the transmitting terminal that it has received the image data, or that it has finished decoding, synthesizing, and displaying it. If the response time then lengthens from its normal value to, for example, 5 seconds, this indicates that the load on the receiving terminal has increased. (The normal value may be measured once or periodically during communication, or it may be specified by the user. The response time may be measured periodically, or the measurement interval may be changed according to the load on the terminal or the previous response time.) In response to such a change in response time, the transmitting terminal selects and combines, as appropriate, the image compression method, whether image compression is performed, the image compression priority, the image size, and the quantization step described above. This reduces the load on the receiving terminal, so the response time can be shortened (see Case 1 of FIG. 16). The same processing may be performed by receiving the playback time or the decoding time at the receiving terminal.

Several methods that take the state of the receiving terminal into account have been described above.
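The response-time method (Case 1 of FIG. 16) can be illustrated with a small monitor that compares each measured response time with a normal value. In this sketch the normal value is simply the first measurement, and a response is treated as slow when it exceeds three times that value; both choices, and the class name, are assumptions.

```python
class ResponseTimeMonitor:
    """Flag responses that are much slower than the normal value."""

    def __init__(self, factor=3.0):
        self.baseline = None   # normal response time, taken from the first sample
        self.factor = factor   # assumed threshold multiplier

    def observe(self, response_time):
        """Record one measurement; return True if the receiver looks overloaded."""
        if self.baseline is None:
            self.baseline = response_time
            return False
        return response_time > self.baseline * self.factor
```

A transmitting terminal could call `observe` on each acknowledgement and, when it returns True, reduce its output with the methods listed above.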
The methods of measuring the capacity of the receive buffer, the load on the receiving terminal, and the response time of the receiving terminal need not be used independently; they may be combined as appropriate (the same methods are applicable to audio). In addition, information about the images and audio that the receiving terminal actually processed on the basis of the priority information may be transmitted to the transmitting side over the communication path (when several image streams and audio streams exist, this is information such as which image and audio streams the receiving terminal actually processed, and how many frames of each played image stream were played). This allows the transmitting side to prevent, in advance, the amount of image data transmitted to the receiving terminal from exceeding what the receiving terminal can process (see Case 2 of FIG. 16; by knowing which image data was actually processed, the transmitting side can adjust the quantization parameters, the image size, and other factors that determine the amount of information. In this example, feedback on processing is returned frame by frame, but, as described above, it may be returned in image units that can be handled independently, such as a GOB in H.263). The above methods can be applied to audio as well.

FIG. 13 is a diagram explaining an image compression apparatus according to an embodiment of the present invention.
The present embodiment describes an example for images, but it is also applicable to audio compression. In the example of FIG. 13, the quantization step is changed for each image input means 1207, or, when control of the image input means 1207 changes the reception status at the receiving terminal, the quantization step is changed to follow it, with the aim of limiting the increase in the amount of compressed image data generated. The image compression apparatus in FIG. 13 consists of: a quantization step management unit 1201, which manages information about quantization steps; an image input management unit 1202, which manages the control state of the image input means 1207; an other-terminal control request management unit 1203, which monitors the status of the receive buffer of the receiving terminal apparatus; an operation management unit 1204, which records and manages the transition of control over time; an image compression unit 1205, which is the means for performing image compression; an output unit 1206, which outputs the compression result to a communication path or a storage device; the image input means 1207, which inputs images; and an image processing decision control means 1208, which manages and controls each of these units.

The image compression method may be a standardized method such as JPEG, MPEG1/2, H.261, or H.263, or a non-standardized method such as wavelet or fractal coding. The image input means 1207 may be a camera or a recording device such as an optical disc.

The image compression apparatus is used as follows.
When the image input means 1207 is a camera and the camera of the transmitting terminal is operated, the image quality changes greatly and the amount of code transmitted fluctuates. For example, when the contrast of the camera is raised, the image becomes easier to see, but the amount of code to be transmitted increases. To reduce the amount of code while still improving the contrast, the amount of code can be suppressed by selecting, combining, and applying the image compression method, whether image compression is performed, the image compression priority, the image size, the quantization step, and the number of frames described above.

The camera operations referred to here are the direction in which the camera is moved (pan, tilt, zoom), contrast, focus, and camera position (for example, pointing the camera downward when capturing a document and horizontally when capturing a person). As a way of changing the image compression method, when the camera is pointed downward, it is judged that a document image is being captured and the image is transmitted in run length; when the camera is pointed horizontally, it is assumed that a person's face is being captured and the image is transmitted in H.261. This reduces the transmission of unnecessary information.

The case of a plurality of cameras is considered next.
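The selection of a compression method from the camera position can be sketched as follows, together with a tiny run-length encoder to show why that method suits document-like images with long uniform runs. The tilt angle convention (0 degrees horizontal, negative values pointing down), the threshold of −45 degrees, and the function names are assumptions for illustration.

```python
def choose_compression(tilt_degrees, downward_threshold=-45.0):
    """Pick a compression method from the camera tilt.

    A camera pointing downward is taken to be capturing a document, so run
    length is chosen; otherwise a person is assumed and H.261 is chosen.
    """
    if tilt_degrees <= downward_threshold:
        return "run_length"
    return "H.261"


def run_length_encode(pixels):
    """Encode a sequence of pixel values as (value, run length) pairs."""
    runs = []
    for value in pixels:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [(value, count) for value, count in runs]
```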
When the video obtained from a plurality of cameras must be transmitted and the communication capacity is limited, one method is to raise the image quality and the number of frames for the video from the camera that the user is paying attention to, and to lower them for the cameras that the user is not watching. Raising the image quality and the number of frames of the video from the camera of interest increases the amount of information, so the video from the cameras that are not of interest must be limited accordingly to adjust the amount of information generated. Methods of adjusting the amount of information generated include adjusting the image size, the quantization step value, and the number of frames. An example of creating a wide-field image using a plurality of cameras is described later with reference to FIG. 15.

FIG. 14 shows an example of the information managed by the operation management unit 1204.
In the example of FIG. 14, camera control, control requests from other terminals, the quantization step, and the number of frames (not shown) are managed. On the basis of this management information, the relationship between the quantization step and camera operations is recorded and managed as history information so that the receive buffer of the receiving terminal does not overflow, which makes it possible to place restrictions on the user's camera operations. In addition, by automatically changing the quantization step, the image size, the number of frames, and so on, overflow and underflow of the receive buffer of the receiving terminal caused by camera operations can be prevented.

FIG. 15 shows an example in which the image compression apparatus is applied to creating a wide-field image.
In the example of FIG. 15, images from a plurality of cameras are acquired by the input unit 1407. If the receiving terminal 1408 becomes overloaded while the acquired images are being joined (synthesized) seamlessly on the receiving terminal 1408 side, the terminal fails. To prevent this, a priority that defines the order in which images are to be processed when the receiving terminal 1408 is overloaded is added to each image. This prevents the receiving terminal 1408 from becoming overloaded.

The image compression apparatus shown in FIG. 15 consists of: the input unit 1407, which has a plurality of (N) cameras; a priority determination control unit 1401, which adds a priority to each image obtained by the input unit 1407; an operation history management unit 1402, which manages the history of the user's instructions to operate the cameras; an image quality control unit 1403, which controls the image quality of the images; an image synthesizing unit 1404, which synthesizes the images obtained from the cameras on the basis of their priorities (images of low priority need not be synthesized); an output unit 1405, which outputs the synthesis result; and a compression control unit 1406, which controls each unit. The output unit 1405 is connected to the receiving terminal 1408 via a communication path. The output destination of the output unit 1405 may be a recording device.
It may also be a communication path. Image synthesis need not be performed at the transmitting terminal: images to which priorities have been added may be transmitted to the receiving terminal over the communication path and synthesized on the receiving terminal side. When the acquired images are synthesized at the transmitting terminal and played back at the receiving terminal, the transmitting side synthesizes the images needed (for display) at the receiving terminal in descending order of priority and transmits the synthesized image to the receiving terminal apparatus over the transmission path.

As a method of adding priorities, images may be given higher priority and higher image quality (for example, more frames or higher resolution) in order, starting with the image obtained by the camera that the user has designated. (Images of high priority need not necessarily be of high quality.) As a result, the images that the user is paying most attention to are displayed with high quality and with priority. By controlling image transmission from the transmitting terminal according to the priorities attached to the images, and by controlling image decompression and display at the receiving terminal, the responsiveness of the terminal for the user can be ensured.

Starting from the images with high priority, high image quality, and a large number of frames, the priority and image quality of the images are lowered step by step (the priorities may be managed at the transmitting terminal or at the receiving terminal). The method of determining priorities need not be based on the operation history of the cameras. As mentioned earlier, the priority may be determined on the basis of the local decoding time, or the number of times that processing is performed may be defined in order, starting from the images with high priority, high image quality, and a large number of frames. For audio as well, by providing a microphone for each of the cameras and controlling whether audio compression is performed, only the audio corresponding to the images in the direction of the user's attention can be synthesized.

The response time discussed earlier can be applied here as well.
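Adding priorities from the camera operation history and then synthesizing only the highest-priority images under overload can be sketched as below. Counting how often the user designated each camera is just one possible reading of "operation history"; that rule, the function names, and the `capacity` parameter are assumptions.

```python
from collections import Counter


def assign_priorities(camera_ids, operation_history):
    """Give each camera a priority equal to how often the user designated it.

    operation_history: list of camera IDs in the order the user selected them.
    """
    counts = Counter(operation_history)
    return {camera: counts[camera] for camera in camera_ids}


def images_to_synthesize(priorities, capacity):
    """Return the camera IDs to synthesize, highest priority first.

    capacity: how many images the receiving terminal can currently handle.
    """
    ranked = sorted(priorities, key=lambda camera: priorities[camera], reverse=True)
    return ranked[:capacity]
```

With this scheme, images from cameras the user has not been watching are the first to be left out when the receiving terminal cannot process everything.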
The quantization step or the number of frames may be determined with reference to the response time between the transmitting terminal and the receiving terminal. In addition, by transmitting information about the images that were processed on the basis of the priority information when the receiving terminal was overloaded to the transmitting side over the communication path, the amount of image data transmitted from the transmitting side to the receiving terminal can be prevented from exceeding what the receiving terminal can process. Likewise, by reporting the frame skip status at the receiving terminal to the transmitting side, the amount of data can be adjusted according to that status.

Furthermore, images may be transmitted by a transmission method that performs retransmission and audio by a transmission method that does not, with the receiving terminal configured to send the transmitting terminal any of the number of image retransmissions, the error rate of the received audio, and the discard rate. The transmitting terminal then determines any of the image compression method, the quantization step value, the number of frames, the size of the image to be compressed, and whether image compression is performed, which makes it possible to reduce the delay of audio transmission without disturbing the images. For example, in communication using TCP/IP, this can be achieved by using TCP for image transmission and UDP for audio transmission (the video and audio may or may not be on the same physical transmission path). The communication method is not limited to TCP/IP. When several video and audio streams are transmitted simultaneously with this method, a discard rate and an error rate may be defined for each audio stream to control the compression method and the transmission method.

A final point concerns image transmission over an ordinary analog telephone line.
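The TCP/UDP split described above, with images on a transport that retransmits and audio on one that does not, can be sketched with the standard `socket` module. The mapping table and the function name are illustrative assumptions; addresses, ports, and the feedback of retransmission counts and error rates are left out.

```python
import socket

# Images use TCP (retransmission); audio uses UDP (no retransmission).
TRANSPORTS = {
    "video": socket.SOCK_STREAM,
    "audio": socket.SOCK_DGRAM,
}


def open_media_socket(media_type):
    """Create an unconnected IPv4 socket of the type chosen for the media."""
    try:
        sock_type = TRANSPORTS[media_type]
    except KeyError:
        raise ValueError(f"unknown media type: {media_type}") from None
    return socket.socket(socket.AF_INET, sock_type)
```

Lost audio packets are simply dropped rather than resent, which keeps audio delay low, while image data arrives complete at the cost of possible retransmission delay.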
In low-bit-rate image transmission over an analog telephone line, or when the image content changes greatly, large block noise and cracks appear in the image. In such cases it is difficult to maintain the image quality by compression processing alone. If a filter that passes only low-frequency signals (for example, a low-pass filter or a physical polarizing filter) is applied to the monitor on the image output side, the image looks somewhat blurred, but an image in which noise and cracks are not bothersome can be obtained.

As is apparent from the above description, the present invention has the advantage that, when a plurality of video and audio are decoded and synthesized simultaneously, the amount of processing can be controlled on the basis of priority according to the load status of the terminal. The present invention also has the advantage that a plurality of video and audio can be synthesized.

BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a schematic configuration diagram of an image decoding/encoding device according to an embodiment of the present invention.
FIG. 2 is a schematic configuration diagram of a video/audio decoding/encoding device showing another example of the embodiment.
FIG. 3 is a diagram illustrating an example of a case where information regarding priority is added in a communication and recording format.
FIG. 4 is a diagram illustrating an example in which the configuration of the present invention is implemented in software.
FIG. 5 is a diagram illustrating the structure of information.
FIG. 6 is a diagram illustrating the operation of the DEMUX thread.
FIG. 7 is a diagram illustrating the operation of the monitoring thread.
FIG. 8 is a diagram illustrating the operation of the decoding process.
FIG. 9 is a diagram illustrating the operation of the image synthesizing thread.
FIG. 10 is a diagram illustrating the operation of the display monitoring thread.
FIG. 11 is a diagram illustrating the user interface of the image synthesizing apparatus.
FIG. 12 is a diagram illustrating a method of performing image transmission according to variations in the capability of the receiving terminal.
FIG. 13 is a diagram illustrating an image compression apparatus according to an embodiment of the present invention.
FIG. 14 is a diagram illustrating information managed by the operation management unit.
FIG. 15 is a diagram illustrating an image compression apparatus used when a wide-field image is created.
FIG. 16 is a diagram illustrating the response situation between a transmitting terminal and a receiving terminal.

[Description of Signs]
11 Reception management unit
12 Separation unit
13 Transmission management unit
14 Priority determination unit
17 Time information management unit
18 Image expansion unit
19 Image synthesis unit
20 Audio expansion unit
21 Audio synthesis unit
31 DEMUX thread
36 Decoding process
37 Monitoring thread
39 Image synthesizing thread
42 Display monitoring thread
1204 Operation management unit
1205 Image compression unit
1208 Image processing decision control means
1401 Priority determination control unit
1402 Operation history management unit
1404 Image synthesizing unit
1407 Input unit

Continuation of front page
F term (reference) 5C059 KK35 MA00 MA05 MA14 MA24 MA43 MC11 PP01 PP05 PP06 PP07 PP19 SS07 SS08 TA07 TA17 TA46 TC20 TC25 UA02 UA05
5J064 AA00 BA00 BB10 BB12 BC02 BC29 BD02

Claims

1. A real-time image encoding device comprising: image input means for inputting an image; image input management means for managing a control state of the image input means; other-terminal control request management means for managing a reception status of a receiving terminal; encoding processing determining means for determining an image encoding method according to at least the managed reception status of the receiving terminal or the control state of the image input means; image encoding means for encoding the input image according to the result of the encoding processing determining means; and output means for outputting the encoded image.