JP2004166156A

JP2004166156A - Image transmitter, network system, program, and storage medium

Info

Publication number: JP2004166156A
Application number: JP2002332424A
Authority: JP
Inventors: Shogo Oneda; 章吾大根田; Keiichi Suzuki; 啓一鈴木; Yukio Kadowaki; 幸男門脇; Yutaka Sano; 豊佐野; Toru Suino; 水納　　亨; Takanori Yano; 隆則矢野; Minoru Fukuda; 実福田
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2002-11-15
Filing date: 2002-11-15
Publication date: 2004-06-10

Abstract

PROBLEM TO BE SOLVED: To transmit images to the delivery destinations of moving picture data while dynamically changing the scalability according to the traffic congestion on the communication path.

SOLUTION: Each client 2 uses a data read quantity detection means 27 to detect the quantity of code strings it reads from a server 1 per unit time, and transmits that read quantity to the server 1. In the server 1, an integration means 25 integrates the per-unit-time read quantities received from the clients 2, and a parameter creation means 26 calculates an error quantity from the integration result. Each of the code string creation means 231 to 23n selects a quantization table based on the error quantity, creates a new code string according to that quantization table, and transmits it to the client 2. The error quantity is the data error between the code string before creation and the new code string after creation in the code string creation means 231 to 23n. The quantization tables are a plurality of tables that list the number of quantization bits for each wavelet transformation coefficient of the code string and are arranged in order of the degree of visual degradation.

COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
本発明は、画像送信装置、ネットワークシステム、プログラム及び記憶媒体に関する。
【０００２】
【従来の技術】
従来、サーバから動画像データを供給し、クライアントからは動画像データの転送要求をサーバに対して行なって、受け取ったデータをもとにクライアントで動画像を再生する、動画像配信技術が存在する。かかる技術において、クライアント側の再生装置がサーバから動画像データを受信して再生する手法には、大別してダウンロード再生とストリーム再生が知られている。
【０００３】
ダウンロード再生とは、サーバからクライアントのバッファ上にダウンロードされたデータを再生するものであり、このタイプは、一旦バッファにデータをとりこんだ後に再生するため、バッファの記憶容量の制限により動画像データを再生できる時間が短くなるという短所があるが、すべてのデータを受け取ってから再生すれば、サーバ側の処理負担や伝送路の速度や混雑状況によらず、動画像の再生が行なえる長所がある。
【０００４】
一方、ストリーミング再生とは、クライアント側の再生装置がサーバから連続的にデータを要求し、そのデータをバッファ上に取得する作業と並行して動画像の再生処理を行なうものである。このタイプは、クライアントが連続的に動画像データを受け取るため、自己のバッファ上のデータを再生したら放棄する一方で、新たなデータを上書きしていく。従って、バッファの記憶容量の制限を受けることなく長時間の動画像の再生が可能であるという長所がある。
【０００５】
しかし、サーバに同時にアクセスしているクライアント数の増加によるサーバ負荷の増大や、伝送線路の速度に影響を受けやすいという短所がある。よって、サーバの負荷増大や、伝送線路の速度低下により動画像の再生がストップしてしまうというような重大な問題を引き起こす可能性がある。一般に、このような重大な影響を回避する手法として、動画像データの容量を変化させるスケーラビリティという手法が知られている。
【０００６】
また、従来から、画像圧縮伸長アルゴリズムとして、動画像専用のＭＰＥＧ１／ＭＰＥＧ２／ＭＰＥＧ４や、静止画像を連続したフレームとして扱うＭｏｔｉｏｎＪＰＥＧが使用されているが、最近では、後者のＭｏｔｉｏｎ静止画像の符号化については、国際標準としてＭｏｔｉｏｎＪＰＥＧ２０００という新しい方式が規格化されつつある。
【０００７】
【特許文献１】
特開２００１−２７４８６１公報
【０００８】
【発明が解決しようとする課題】
しかし、従来の動画像配信技術では、一般にサーバ上に配信コンテンツをあらかじめ複数のスケーラビリティで保存しておき、通信路の送信能力やクライアントの再生能力に応じて、最もふさわしいスケーラビリティをユーザが選択してストリーミング再生を行なうのが普通である。
【０００９】
この場合は、クライアントを操作するユーザが通信路の制約等を考慮してスケーラビリティを選択するが、通信路の状態が途中で変化すること、あるいは与えられた選択肢の中に最適な条件があるとは限らず、制約に対して最適な画像を得られるとは限らない。これに対応するためには、用意された元画像を元にサーバの負荷等に応じて動的にスケーラビリティを変化させることが考えられるが、データ量だけに着目して制御すると、複雑なシーンでは画質が悪くなり、複雑でないシーンでは必要以上の符号量を使ってしまい、通信路などの利用効率が悪くなってしまう。
【００１０】
この発明の目的は、送信側における処理の負荷に応じて動的にスケーラビリティを変化させた画像の送信を可能とすることである。
【００１１】
この発明の別の目的は、動的にスケーラビリティを変化させつつ画質の劣化を防止することである。
【００１２】
【課題を解決するための手段】
請求項１に記載の発明は、動画像データをフレームごとに１又は複数の小領域に分割してこの小領域ごとに階層的に圧縮符号化した符号列を対象として、その符号列データの構文を解析する構文解析手段と、この解析結果に基づいて前記符号列から新たな符号列を作成する処理を同時並行的に複数実行できる符号列変換手段と、この作成した各符号列をネットワークを介して各送信先に送信する送信手段と、前記各送信先からそれぞれ当該送信先で前記送信した符号列を受信した際における単位時間当たりのデータ読込量の情報を受信する受信手段と、この受信した各データ読込量の情報を統合して前記ネットワークのトラフィックの混雑状況を検出する統合化手段と、前記作成前の符号列と前記新たな符号列とのデータの誤差量を前記統合の結果に応じて指定して、この指定した誤差量となるように前記符号列変換手段に前記作成を行なわせる誤差量指定手段と、を備えている画像送信装置である。
【００１３】
したがって、ネットワークのトラフィックが混雑してきたときは誤差量を大きくして送信する符号列のデータ量を低減することにより、ネットワークのトラフィックに応じて動的にスケーラビリティを変化させて画像の送信を行なうことができる。
【００１４】
請求項２に記載の発明は、請求項１に記載の画像送信装置において、前記誤差量指定手段は、前記誤差量の指定を前記ネットワークを介して前記符号列の送信先から受信する。
【００１５】
したがって、ネットワークのトラフィックに応じて送信側で指定してきた誤差量を用いて、ネットワークのトラフィックに応じて動的にスケーラビリティを変化させて画像の送信を行なうことができる。
【００１６】
請求項３に記載の発明は、請求項１又は２に記載の画像送信装置において、前記符号列変換手段は、前記作成前の符号列に含まれている当該符号列の符号を破棄することによる前記誤差量を表わすデータを読取って前記作成を行なう。
【００１７】
したがって、新たな符号列の作成前の符号列に含まれている当該符号列の符号を破棄することによる誤差量を表わすデータをヘッダ情報などから読取って新たな符号列を作成し、ネットワークのトラフィックに応じて動的にスケーラビリティを変化させて画像の送信を行なうことができる。
【００１８】
請求項４に記載の発明は、請求項１〜３の何れかの一に記載の画像送信装置において、前記符号列変換手段は、前記作成前の符号列の作成にウェーブレット変換が用いられている場合に、当該符号列の各ウェーブレット変換係数についてウェーブレット変換係数ごとの量子化ビット数を前記作成後の符号列の画像についての視覚的劣化度順に並べた複数のテーブルから、前記誤差量指定手段で指定した前記誤差量に応じてテーブルを選択し、このテーブルのデータに基づいて前記作成を行なう。
【００１９】
したがって、ウェーブレット変換係数を調節することで、ネットワークのトラフィックに応じて動的にスケーラビリティを変化させて画像の送信を行なうことができる。
【００２０】
請求項５に記載の発明は、請求項１〜４の何れかの一に記載の画像送信装置において、前記符号列変換手段は、指定された画像のフレームレートに応じて前記作成を行なう。
【００２１】
したがって、ネットワークのトラフィックのみならず、フレームレートに応じて動的にスケーラビリティを変化させるので、画像の劣化を防止することができる。
【００２２】
請求項６に記載の発明は、請求項１〜５の何れかの一に記載の画像送信装置において、前記作成前の符号列について画像の動き量を検出する動き量検出手段を備え、前記符号列変換手段は、前記検出した動き量に応じて前記作成を行なう。
【００２３】
したがって、ネットワークのトラフィックのみならず、画像の動き量に応じて動的にスケーラビリティを変化させるので、画像の劣化を防止することができる。
【００２４】
請求項７に記載の発明は、請求項１〜６の何れかの一に記載の画像送信装置と、この画像送信装置がネットワークを介して送信する前記符号列を受信する複数台の画像受信装置と、を備えているネットワークシステムである。
【００２５】
したがって、請求項１〜６の何れかの一に記載の発明と同様の作用、効果を奏することができる。
【００２６】
請求項８に記載の発明は、請求項１〜６の何れかの一に記載の発明の前記各手段の機能を実行するコンピュータに読取り可能なプログラムである。
【００２７】
したがって、請求項１〜６の何れかの一に記載の発明と同様の作用、効果を奏することができる。
【００２８】
請求項９に記載の発明は、請求項８に記載のプログラムを記憶している記憶媒体である。
【００２９】
したがって、請求項８に記載の発明と同様の作用、効果を奏することができる。
【００３０】
【発明の実施の形態】
［ＪＰＥＧ２０００アルゴリズムの概要］
まず、本発明の実施の形態における前提技術となるＪＰＥＧ２０００アルゴリズムの概要について説明する。
【００３１】
図１は、ＪＰＥＧ２０００アルゴリズムの基本を説明するための説明図である。ＪＰＥＧ２０００のアルゴリズムは、色空間変換・逆変換部１１１、２次元ウェーブレット変換・逆変換部１１２、量子化・逆量子化部１１３、エントロピー符号化・復号化部１１４、タグ処理部１１５で構成されている。
【００３２】
図２に示すように、カラー画像は、一般に、原画像の各コンポーネント（ここではＲＧＢ原色系）が、矩形をした領域（タイル）１２３１２２，１２３によって分割される。そして、個々のタイル、例えば、Ｒ００，Ｒ０１，…，Ｒ１５／Ｇ００，Ｇ０１，…，Ｇ１５／Ｂ００，Ｂ０１，…，Ｂ１５が、圧縮伸長プロセスを実行する際の基本単位となる。従って、圧縮伸長動作は、コンポーネント毎、そしてタイル毎に、独立に行なわれる。
【００３３】
画像データの符号化時には、各コンポーネントの各タイルのデータが、図１の色空間変換・逆変換部１１１に入力され、色空間変換を施されたのち、２次元ウェーブレット変換・逆変換部１１２で２次元ウェーブレット変換（順変換）が適用されて周波数帯に空間分割される。
【００３４】
図３には、デコンポジション・レベル数が３の場合の、各デコンポジション・レベルにおけるサブ・バンドを示している。すなわち、原画像のタイル分割によって得られたタイル原画像（０ＬＬ）（デコンポジション・レベル０（１３１））に対して、２次元ウェーブレット変換を施し、デコンポジション・レベル１（１３２）に示すサブ・バンド（１ＬＬ，１ＨＬ，１ＬＨ，１ＨＨ）を分離する。そして引き続き、この階層における低周波成分１ＬＬに対して、２次元ウェーブレット変換を施し、デコンポジション・レベル２（１３３）に示すサブ・バンド（２ＬＬ，２ＨＬ，２ＬＨ，２ＨＨ）を分離する。順次、同様に、低周波成分２ＬＬに対しても、２次元ウェーブレット変換を施し、デコンポジション・レベル３（１３４）に示すサブ・バンド（３ＬＬ，３ＨＬ，３ＬＨ，３ＨＨ）を分離する。さらに、図３では、各デコンポジション・レベルにおいて符号化の対象となるサブ・バンドを、斜線で表してある。例えば、デコンポジション・レベル数を３とした時、斜線で示したサブ・バンド（３ＨＬ，３ＬＨ，３ＨＨ，２ＨＬ，２ＬＨ，２ＨＨ，１ＨＬ，１ＬＨ，１ＨＨ）が符号化対象となり、３ＬＬサブ・バンドは符号化されない。
【００３５】
次いで、指定した符号化の順番で符号化の対象となるビットが定められ、図１の量子化・逆量子化部１１３で対象ビット周辺のビットからコンテキストが生成される。量子化の処理が終わったウェーブレット係数は、個々のサブバンド毎に、「プレシンクト」と呼ばれる重複しない矩形に分割される。これは、インプリメンテーションでメモリを効率的に使うために導入されたものである。図５に示すように、一つのプレシンクトは、空間的に一致した３つの矩形領域からなっている。更に、個々のプレシンクトは、重複しない矩形の「コード・ブロック」に分けられる。これは、エントロピー・コーディングを行なう際の基本単位となる。
【００３６】
ウェーブレット変換後の係数値は、そのまま量子化し符号化することも可能であるが、ＪＰＥＧ２０００では符号化効率を上げるために、係数値を「ビットプレーン」単位に分解し、画素あるいはコード・ブロック毎に「ビットプレーン」に順位付けを行なうことができる。図６には、その手順を簡単に示した。この例は、原画像（３２×３２画素）を１６×１６画素のタイル４つで分割した場合で、デコンポジション・レベル１のプレシンクトとコード・ブロックの大きさは、各々８×８画素と４×４画素としている。プレシンクトとコード・ブロックの番号は、ラスター順に付けられる。タイル境界外に対する画素拡張にはミラーリング法を使い、可逆（５×３）フィルタでウェーブレット変換を行ない、デコンポジションレベル１のウェーブレット係数値を求めている。また、タイル０／プレシンクト３／コード・ブロック３について、代表的な「レイヤー」についての概念図をも併せて示している。レイヤーの構造は、ウェーブレット係数値を横方向（ビットプレーン方向）から見ると理解し易い。１つのレイヤーは任意の数のビットプレーンから構成される。この例では、レイヤー０，１，２，３は、各々、１，３，１の３つのビットプレーンから成っている。そして、ＬＳＢに近いビットプレーンを含むレイヤー程、先に量子化の対象となり、逆に、ＭＳＢに近いレイヤーは最後まで量子化されずに残ることになる。ＬＳＢに近いレイヤーから破棄する方法はトランケーションと呼ばれ、量子化率を細かく制御することが可能である。
【００３７】
エントロピー符号化・復号化部１１４（図１参照）では、コンテキストと対象ビットから確率推定によって、各コンポーネントのタイルに対する符号化を行なう。こうして、原画像の全てのコンポーネントについて、タイル単位で符号化処理が行われる。最後にタグ処理部１１５は、エントロピコーダ部からの全符号化データを１本のコード・ストリームに結合するとともに、それにタグを付加する処理を行なう。図４には、コード・ストリームの構造を簡単に示した。図４に示すように、コード・ストリームの先頭と各タイルを構成する部分タイルの先頭にはヘッダと呼ばれるタグ情報が付加され、その後に、各タイルの符号化データが続く。そして、コード・ストリームの終端には、再びタグが置かれる。
【００３８】
一方、復号化時には、符号化時とは逆に、各コンポーネントの各タイルのコード・ストリームから画像データを生成する。図１を用いて簡単に説明する。この場合、タグ処理部１１５は、外部より入力したコード・ストリームに付加されたタグ情報を解釈し、コード・ストリームを各コンポーネントの各タイルのコード・ストリームに分解し、その各コンポーネントの各タイルのコード・ストリーム毎に復号化処理が行われる。コード・ストリーム内のタグ情報に基づく順番で復号化の対象となるビットの位置が定められるとともに、量子化・逆量子化部１１３で、その対象ビット位置の周辺ビット（既に復号化を終えている）の並びからコンテキストが生成される。エントロピー符号化・復号化部１１４で、このコンテキストとコード・ストリームから確率推定によって復号化を行ない、対象ビットを生成し、それを対象ビットの位置に書き込む。このようにして復号化されたデータは周波数帯域毎に空間分割されているため、これを２次元ウェーブレット変換・逆変換部１１２で２次元ウェーブレット逆変換を行なうことにより、画像データの各コンポーネントの各タイルが復元される。復元されたデータは色空間変換・逆変換部１１１によって元の表色系のデータに変換される。
【００３９】
［発明の実施の形態］
本発明の一実施の形態について説明する。
【００４０】
図７は、本実施の形態１のネットワークシステム１０を示すブロック図である。図７に示すように、本ネットワークシステム１０は、動画の画像データをＭｏｔｉｏｎＪＰＥＧ２０００等のアルゴリズムで圧縮符号化した符号列をインターネットなどのネットワーク３を介して送信するサーバ１と、このサーバ１から符号列を受信するクライアント２からなる。クライアント２は、同時に複数台接続され得る（図９に示す、クライアント２３ｎ（ｎ＝１，２，…））。
【００４１】
図８は、サーバ１、クライアント２の電気的な接続を示すブロック図である。図８に示すように、サーバ１、クライアント２は、それぞれ本発明の画像送信装置、画像受信装置を実施するもので、各種演算を行ないサーバ１（またはクライアント２）の各部を集中的に制御するＣＰＵ１１と、各種のＲＯＭやＲＡＭからなるメモリ１２とが、バス１３で接続されている。
【００４２】
バス１３には、所定のインターフェイスを介して、記憶装置となるハードディスクなどの磁気記憶装置１４と、マウスやキーボードなどで構成される入力装置１５と、ＬＣＤやＣＲＴなどの表示装置１６と、光ディスクなどの本発明の記憶媒体を実施する記憶媒体１７を読取る記憶媒体読取装置１８と、ネットワーク３と通信を行なう通信装置となる所定の通信インターフェイス１９とが接続されている。なお、記憶媒体１７としては、ＣＤやＤＶＤなどの光ディスク、光磁気ディスク、フレキシブルディスクなどの各種方式のメディアを用いることができる。また、記憶媒体読取装置１８は、具体的には記憶媒体１７の種類に応じて光ディスクドライブ、光磁気ディスクドライブ、フレキシブルディスクドライブなどが用いられる。
【００４３】
磁気記憶装置１４には、本発明のプログラムを実施する画像送信プログラム（または画像受信プログラム）が記憶されている。一般的には、この画像送信プログラム（または画像受信プログラム）は、本発明の記憶媒体を実施する記憶媒体１７から記憶媒体読取装置１８により読取ることでサーバ１（またはクライアント２）にインストールするが、ネットワーク３からダウンロードするなどして、磁気記憶装置１４にインストールしたものである。このインストールによりサーバ１、クライアント２は動作可能な状態となる。この画像送信プログラム、画像受信プログラムは、特定のアプリケーションソフトの一部をなすものであってもよい。また、所定のＯＳ上で動作するものであってもよい。
【００４４】
図９は、ネットワークシステム１０が画像送信プログラム、画像受信プログラム等に基づいて実行する処理を説明する機能ブロック図である。
【００４５】
まず、クライアント２が画像受信プログラム等に基づいて行なう処理について説明する。各クライアント２（クライアント２３１〜２３ｎ）では、送信手段、受信手段である送受信手段３１が通信インターフェイス１９を介して、サーバ１から画像を圧縮符号化した符号列（後述）を受信し、この受け取った符号列をバッファ３２（磁気記憶装置１４）に格納後、この符号列を復号手段３３で復号し、表示手段３４が表示装置１６に表示する。
【００４６】
また、送受信手段３１が受信した符号列を対象として、データ読込量検出手段２７が、単位時間当たりに送受信手段３１で読み込まれた符号列のデータ量を監視し、これにより、ネットワーク３のトラフィックの混み具合を判断する。この情報（監視情報）は、各クライアント２の送受信手段３１からサーバ１に返信される。
【００４７】
次に、サーバ１が画像送信プログラム等に基づいて行なう処理について説明する。サーバ１のバッファ２１（磁気記憶装置１４）には、例えば、ＭｏｔｉｏｎＪＰＥＧ２０００アルゴリズムで動画像データを圧縮符号化した符号列が蓄積されており、従って、この符号列は動画像データをフレームごとに１又は複数のタイルという小領域に分割して、このタイルごとに階層的に圧縮符号化されたものである。サーバ１は、かかる符号列を各クライアント２０１〜２０ｎからの要求に従ってストリーミング配信する。すなわち、バッファ２１に格納されている符号列のヘッダ情報から当該符号列の構文を構文解析手段２２により解析し、この解析の結果に基づいて符号列作成手段２３の個別の各符号列作成手段２３１〜２３ｎが、当該符号列をＭｏｔｉｏｎＪＰＥＧ２０００アルゴリズムによる新たな符号列に変換し（その詳細については後述する）、この変換後の符号列を、送信手段、受信手段である送受信手段２４が、通信インターフェイス１９、ネットワーク３を介して、クライアント２に送信する。
【００４８】
統合化手段２５は、各クライアント２０１〜２０ｎから受信した監視情報を統合化して、ネットワーク３のトラフィックの混雑状況を総合的に判断して、その結果（統合情報）を出力する。この統合化の処理は、例えば次のようにして行なう。
【００４９】
各クライアント２０１〜２０ｎの１秒当たりのデータの読込量をＡｎ（単位：ｂｐｓ）とすると、各クライアント２０１〜２０ｎの数はｎであるから、
通信回線の使用率Ｃ＝ΣＡｎ／Ｂ
となる。ここで、Ｂは全回線のデータ転送能力の合計（単位：ｂｐｓ）、すなわち、サーバ１の最大のデータ配信能力である。この通信回線の使用率Ｃが、統合情報の内容となる。
【００５０】
誤差量指定手段となるパラメータ作成手段２６は、統合化手段２５が作成した統合化情報から、各符号列作成手段２３１〜２３ｎで符号列を作成する際のパラメータを作成する。このパラメータは、具体的には、各符号列作成手段２３１〜２３ｎで符号列を作成する際の作成前の符号列と作成後の符号列とのデータの誤差量を指定するか、あるいは、各符号列作成手段２３１〜２３ｎで作成後の符号列における画像のフレームレートを指定するものである。
【００５１】
このパラメータの決定は、より具体的には、例えば次のように行なう。すなわち、統合化手段２５による統合情報に閾値を設け、統合情報が閾値を下回っているときは、符号列を送信する際の単位時間当たりのデータ量を多くするように符号列を作成する。すなわち、作成後の符号列における画像のフレームレートを大きくし、あるいは、誤差量を小さくして画像の量子化レベルを細かくするように、パラメータを作成する。
【００５２】
また、統合情報が閾値を上回ったときは、符号列作成手段２３の負荷が大きいため、送信する際の単位時間当たりのデータ量を少なくするように符号列を作成する。すなわち、フレームレートを小さくし、あるいは、誤差量を大きくして画像の量子化レベルを粗くするように、パラメータを作成する。
【００５３】
以下では、パラメータとして前記の誤差量を用いる場合について説明する。図１０は、各符号列作成手段２３１〜２３ｎの一構成例を示すブロック図である。量子化テーブル選択手段４１には、パラメータ作成手段２６で作成したパラメータが入力する。バッファ２１に格納された符号列は構文解析手段２２（図９）によりヘッダ情報が解読される。そして、そのヘッダ情報に基づいて、符号列を構成する各符号を部分的に符号破棄したときの元データに対するデータの誤差量が、量子化テーブル選択手段４１に入力される。この量子化テーブル選択手段４１は、入力された誤差量とパラメータとに基づいて、所定のテーブルデータ（量子化テーブル）を選択し、これを、符号列変換手段である量子化手段４２に送る。量子化手段４２は、その量子化テーブルのデータに従って符号列から符号を選択的に破棄し、また、ヘッダを書き換えて、新たな符号列を生成する。このように、各符号列作成手段２３１〜２３ｎにおいて新たな符号列を生成することができるので、新たな符号列を作成する処理は、同時並行的に複数（ｎ個）実行できることになる。
【００５４】
量子化テーブルには、図１１に例示するような各ウェーブレット変換係数について、図１２に示すようにウェーブレット変換係数ごとの量子化ビット数（符号破棄量）が記録されている。この量子化テーブルは、図１３に示すように、複数の量子化テーブル（量子化ビット数）を量子化後の視覚的劣化度順にＩｎｄｅｘと対応付けて並べた量子化テーブル群として保持されている。量子化テーブル選択手段４１は、符号列から符号を部分的に破棄したときの誤差量が、量子化テーブルを組み合わせたときにパラメータで指定されている誤差量となるＩｎｄｅｘの量子化テーブルを選択する。
【００５５】
したがって、パラメータ作成手段２６が作成するパラメータは、統合情報を段階的に（あるいは無段階的に）判断し、段階的に判断するときは、段階に応じた量子化ビット数で誤差量を指定すればよい。例えば、統合情報を５段階で判断し、段階５が最もネットワーク３が混雑している場合であるとすれば、統合情報の段階１〜５を、それぞれ誤差量となる量子化ビット数１〜５に対応させればよい。なお、通常は量子化後の視覚的劣化度合いと前述の誤差量とは単調増加の関係にあるため、パラメータどおりの誤差量になる量子化テーブルを選択することは容易である。この量子化テーブルは、一般的には、視覚的に劣化が目立たないように、高周波成分よりも低周波成分を重要視して量子化するように構成するのが望ましい。
【００５６】
図１４は、各符号列作成手段２３１〜２３ｎの別の構成例についての機能ブロック図である。バッファ２１に格納された符号列は構文解析手段２２（図９）により、ヘッダ情報が解読され、ヘッダ情報中から各符号を部分的に符号破棄したときの元データに対するデータの誤差量が、量子化テーブル選択手段４１に入力される。動き量検出手段４３は、構文解析手段２２で解析された各ウェーブレット変換係数の符号量に基づいて画像の動き量を検出する。量子化テーブル選択手段４１は、入力された誤差量と、クライアント２から指定されるフレームレート（この例では、クライアント２側からフレームレートの指定をサーバ１に送信可能であることを前提としている）、及び、動き量検出手段４３で検出された動き量に基づいて、前述と同様の量子化テーブルを選択し、これを量子化手段４２に送る。量子化手段４２は、その量子化テーブルのデータとクライアント２から指定されたフレームレートとに基づいて、符号列から部分的に符号を破棄し、また、ヘッダを書き換えて、新たな符号列を生成する。
【００５７】
複数の量子化テーブルからのテーブルの選択は、図１１〜図１３を参照して説明した前記の例と同様であるが、量子化テーブル選択手段４１には複数の量子化テーブル群が保持されており、フレームレート及び動き量により所定のテーブル群を選択し、適応している。
【００５８】
動き量検出手段４３は、次のようにして画像の動き量を検出する。図１５は、ＭｏｔｉｏｎＪＰＥＧ２０００方式における画像の動き量の考え方を説明する説明図である。図１５に示すように、インターレース画像において、動きが高速な画像は図１５（ａ）のように長い横エッジが発生する（インターレースのくし型と言う）。それに対し、動きが低速な画像は図１５（ｃ）のように短い横エッジが発生する。図１５（ｂ）は、これらの中間である動きが中速である場合を示している。これらの違いは、高周波成分の横エッジ量をあらわす１ＬＨ成分に大きく現れる。つまり、動き量の大きな画像は１ＬＨ成分の係数の絶対値が大きくなり、その結果、１ＬＨ成分の符号量は大きくなる。但し、１ＬＨ成分の符号量のみで画像の動き量を判定すると、画像によって閾値が変わる可能性があるので、１ＬＨ成分の符号量を１ＨＬ成分の符号量で正規化し、その値を画像の動き量の検出の特徴量としてもよい。
【００５９】
さらに、前記の特徴はビットプレーンを削る（ポスト量子化）前の符号量に大きく現れるので、ビットプレーンを削る前の１ＬＨと１ＨＬ符号量を符号に記述しておいて、その値を用いて動き量を推定するような構成も可能である。これは、特に画像の圧縮率が大きい場合に有効である。
【００６０】
図１６は、この場合における画像の動き量の判定処理の一例を示すフローチャートである。
【００６１】
図１６に示すように、まず、１ＬＨのロスレス符号量の和（ｓｕｍ１ＬＨ）を算出し（ステップＳ１）、また、１ＨＬのロスレス符号量の和（ｓｕｍ１ＨＬ）を算出し（ステップＳ２）、“ｓｕｍ１ＬＨ”を“ｓｕｍ１ＨＬ”で除算して（ステップＳ３）、その結果（ｓｐｅｅｄ）を所定の閾値（ｔｈ１）と比較し、“ｓｐｅｅｄ＞ｔｈ１”のときは（ステップＳ３のＹ）、画像の動き量が大きいと判定する（ステップＳ４）。“ｓｐｅｅｄ≦ｔｈ１”であるときは（ステップＳ３のＮ）、逆に画像の動き量が小さいと判定する（ステップＳ５）。
【００６２】
また、高速な画像の場合、前記のくし型を残すように量子化することで視覚的劣化度合いを抑さえることができる。ただし、フレームを間引いてフレームレートを落としていくと、１フレームが長く表示されることになり、結果的にくし型が目立つことになる。従って、フレームレートを落とす場合には、高速な画像においてもくし型が残らないような量子化をする必要がある。その組み合わせを表１に示す。すなわち、フレームレートが高いときは、動き量が大きいときにくし型を保存し、小さいときには保存しない。フレームレートが低いときは、動き量の大小にかかわらずくし型を保存しない。
【００６３】
【表１】

【００６４】
以上のクライアント２、サーバ１の処理を、図１７、図１８のフローチャートに整理して説明すると次のようになる。まず、図１７に示すように、クライアント２は、サーバ１から符号列の受信があるときは（ステップＳ１１のＹ）、データ読込量検出手段２７で単位時間当たりの符号列の読込量を検出し（ステップＳ１２）、誤差量作成手段２８が、この検出値から誤差量のパラメータを作成し（ステップＳ１３）、サーバ１に送信する（ステップＳ１４）。
【００６５】
図１８に示すように、サーバ１は、送信すべき符号列があるときは（ステップＳ２１のＹ）、各クライアント２から受信した誤差量（図１４の例の場合は、さらに、フレームレート、画像の動き量）を統合して統合情報を作成し（ステップＳ２２）、この統合情報に基づいて前述のように量子化テーブルを選択して（ステップＳ２３）、この量子化テーブルに従って新たな符号列を作成し（ステップＳ２４）、この作成後の符号列を送信する（ステップＳ２５）。
【００６６】
このように、本ネットワークシステム１０によれば、ネットワーク３のトラフィックが混雑してきたときは誤差量を大きくして、送信する符号列のデータ量を低減することにより、ネットワーク３のトラフィックに応じて動的にスケーラビリティを変化させて画像の送信を行なうことができる。
【００６７】
また、図１４以下を参照して説明した構成例によれば、ネットワーク３のトラフィックのみならず、フレームレートや画像の動き量に応じて動的にスケーラビリティを変化させるので、画像の劣化を防止することができる。
【００６８】
【発明の効果】
請求項１に記載の発明は、ネットワークのトラフィックが混雑してきたときは誤差量を大きくして送信する符号列のデータ量を低減することにより、ネットワークのトラフィックに応じて動的にスケーラビリティを変化させて画像の送信を行なうことができる。
【００６９】
請求項２に記載の発明は、請求項１に記載の発明において、ネットワークのトラフィックに応じて送信側で指定してきた誤差量を用いて、ネットワークのトラフィックに応じて動的にスケーラビリティを変化させて画像の送信を行なうことができる。
【００７０】
請求項３に記載の発明は、請求項１又は２に記載の発明において、新たな符号列の作成前の符号列に含まれている当該符号列の符号を破棄することによる誤差量を表わすデータをヘッダ情報などから読取って新たな符号列を作成し、ネットワークのトラフィックに応じて動的にスケーラビリティを変化させて画像の送信を行なうことができる。
【００７１】
請求項４に記載の発明は、請求項１〜３の何れかの一に記載の発明において、ウェーブレット変換係数を調節することで、ネットワークのトラフィックに応じて動的にスケーラビリティを変化させて画像の送信を行なうことができる。
【００７２】
請求項５に記載の発明は、請求項１〜４の何れかの一に記載の発明において、ネットワークのトラフィックのみならず、フレームレートに応じて動的にスケーラビリティを変化させるので、画像の劣化を防止することができる。
【００７３】
請求項６に記載の発明は、請求項１〜４の何れかの一に記載の発明において、ネットワークのトラフィックのみならず、画像の動き量に応じて動的にスケーラビリティを変化させるので、画像の劣化を防止することができる。
【００７４】
請求項７〜９に記載の発明は、請求項１〜６の何れかの一に記載の発明と同様の作用、効果を奏することができる。
【図面の簡単な説明】
【図１】ＪＰＥＧ２０００アルゴリズムの基本を説明するための説明図である。
【図２】カラー画像の各コンポーネントについて説明するための説明図である。
【図３】デコンポジション・レベル数が３の場合の、各デコンポジション・レベルにおけるサブ・バンドを示す説明図である。
【図４】コード・ストリームの構造の説明図である。
【図５】一つのプレシンクトが空間的に一致した３つの矩形領域からなっていることの説明図である。
【図６】係数値をビットプレーン単位に分解し、画素あるいはコード・ブロック毎にビットプレーンに順位付けを行なうことの説明図である。
【図７】本発明の一実施の形態であるネットワークシステムの概略構成のブロック図である。
【図８】サーバ、クライアントの電気的な接続のブロック図である。
【図９】ネットワークシステムの機能ブロック図である。
【図１０】符号列作成手段を説明する機能ブロック図である。
【図１１】各ウェーブレット変換係数の説明図である。
【図１２】ウェーブレット変換係数と量子化ビット数の関係を示す説明図である。
【図１３】量子化テーブルの説明図である。
【図１４】符号列作成手段の他の例を説明する機能ブロック図である。
【図１５】インターレースのくし型の説明図である。
【図１６】画像の動き量を判断する処理のフローチャートである。
【図１７】サーバが行なう処理のフローチャートである。
【図１８】クライアントが行なう処理のフローチャートである。
【符号の説明】
１画像送信装置
２画像受信装置
３ネットワーク
１０ネットワークシステム
１７記憶媒体
２２構文解析手段
２４送信手段、受信手段
２５統合化手段
３１受信手段、送信手段
３５データ読込量検出手段
４１誤差量作成手段
４２符号列変換手段
[0001]
TECHNICAL FIELD OF THE INVENTION
The present invention relates to an image transmission device, a network system, a program, and a storage medium.
[0002]
[Prior art]
Conventional moving image distribution technology supplies moving image data from a server: a client issues a transfer request for moving image data to the server and plays back the moving image from the data it receives. In this technology, the methods by which a playback device on the client side receives and plays moving image data from the server fall broadly into two types, download playback and streaming playback.
[0003]
Download playback plays data that has been downloaded from the server into a buffer on the client. Because the data is loaded into the buffer before playback begins, this type has the disadvantage that the limited storage capacity of the buffer shortens the length of moving image data that can be played. On the other hand, if playback starts only after all the data has been received, the moving image can be played regardless of the processing load on the server or the speed and congestion of the transmission path.
[0004]
In streaming playback, by contrast, the playback device on the client side continuously requests data from the server and plays the moving image in parallel with acquiring that data into the buffer. Because the client receives moving image data continuously, it discards data in its buffer once that data has been played and overwrites it with new data. Streaming playback therefore has the advantage that long moving images can be played without being limited by the storage capacity of the buffer.
[0005]
However, streaming playback has the disadvantage that it is easily affected by increased server load, caused by a growing number of clients accessing the server simultaneously, and by the speed of the transmission line. An increase in server load or a drop in transmission line speed can therefore cause a serious problem, such as playback of the moving image stopping. A technique called scalability, which varies the data volume of the moving image, is generally known as a way to avoid such serious effects.
[0006]
MPEG1/MPEG2/MPEG4, which are dedicated to moving images, and Motion JPEG, which treats still images as consecutive frames, have conventionally been used as image compression and decompression algorithms. Recently, for the latter kind of coding, which handles motion as a sequence of still images, a new method called Motion JPEG2000 is being standardized as an international standard.
[0007]
[Patent Document 1]
JP 2001-274861 A
[0008]
[Problems to be solved by the invention]
However, in conventional moving image distribution technology, the content to be distributed is generally stored on the server in advance at a plurality of scalabilities, and the user selects the scalability best suited to the transmission capability of the communication path and the playback capability of the client before streaming playback is performed.
[0009]
In this case, the user operating the client selects a scalability in consideration of the constraints of the communication path and the like. However, the state of the communication path may change partway through, and the given options do not necessarily include the optimal condition, so the image obtained is not necessarily optimal for the constraints. To deal with this, the scalability could be changed dynamically from a prepared original image according to the server load and the like. If the control considers only the amount of data, however, image quality degrades in complex scenes, while scenes that are not complex consume more code than necessary, and the communication path and other resources are used inefficiently.
[0010]
An object of the present invention is to enable transmission of an image whose scalability is dynamically changed according to a processing load on a transmission side.
[0011]
Another object of the present invention is to prevent image quality deterioration while dynamically changing scalability.
[0012]
[Means for Solving the Problems]
The invention according to claim 1 is an image transmission device comprising: syntax analysis means for analyzing the syntax of code string data, the code string being one in which moving image data is divided into one or more small areas for each frame and hierarchically compression-coded for each small area; code string conversion means capable of executing in parallel a plurality of processes, each of which creates a new code string from the code string based on the analysis result; transmission means for transmitting each created code string to its destination via a network; reception means for receiving, from each destination, information on the amount of data read per unit time when that destination received the transmitted code string; integration means for integrating the received information on the data read amounts and detecting the congestion state of traffic on the network; and error amount designation means for designating, according to the result of the integration, the amount of data error between the code string before creation and the new code string, and causing the code string conversion means to perform the creation so as to yield the designated error amount.
[0013]
Therefore, when network traffic becomes congested, the error amount is increased to reduce the amount of data in the code string to be transmitted, so that images can be transmitted with the scalability changed dynamically according to network traffic.
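The integration step can be illustrated with a short sketch. The description of the embodiment (paragraph 0049) defines the line utilization as C = ΣAn / B, where An is each client's read amount per second and B is the total transfer capability, and paragraph 0055 gives an example of judging the integrated information in five stages. The function names and the equal-width stage boundaries below are assumptions made for illustration and do not come from the specification.

```python
def line_utilization(read_rates_bps, capacity_bps):
    """Line utilization C = sum(A_n) / B from the per-client read rates."""
    if capacity_bps <= 0:
        raise ValueError("capacity_bps must be positive")
    return sum(read_rates_bps) / capacity_bps


def error_stage(utilization, stages=5):
    """Map utilization to a stage 1..stages (higher = more congested).

    Assumption: equal-width bands over [0, 1]; the specification only says
    the integrated information may be judged in stages.
    """
    clamped = min(max(utilization, 0.0), 1.0)
    return min(stages, int(clamped * stages) + 1)


# Two clients reading 2 Mbps and 3 Mbps on a 10 Mbps server link
c = line_utilization([2_000_000, 3_000_000], 10_000_000)
stage = error_stage(c)
```

A higher stage would then correspond to a larger designated error amount, and so to a smaller code string.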
[0014]
According to a second aspect of the present invention, in the image transmission device according to the first aspect, the error amount designating unit receives the designation of the error amount from a transmission destination of the code string via the network.
[0015]
Therefore, it is possible to transmit the image while dynamically changing the scalability according to the network traffic, using the error amount designated on the transmission side according to the network traffic.
[0016]
According to a third aspect of the present invention, in the image transmission device according to the first or second aspect, the code string conversion unit performs the creation by reading data, contained in the code string before creation, that represents the error amount resulting from discarding codes of that code string.
[0017]
Therefore, data representing the error amount that results from discarding codes of the code string is read from the header information or the like of the code string before creation, a new code string is created, and the image can be transmitted with the scalability changed dynamically according to network traffic.
[0018]
According to a fourth aspect of the present invention, in the image transmission device according to any one of the first to third aspects, when a wavelet transform was used to create the code string before creation, the code string conversion unit selects a table according to the error amount designated by the error amount designating unit, from a plurality of tables in which the number of quantization bits for each wavelet transform coefficient of the code string is arranged in order of the degree of visual degradation of the image of the code string after creation, and performs the creation based on the data of the selected table.
[0019]
Therefore, by adjusting the wavelet transform coefficient, it is possible to dynamically change the scalability according to the traffic of the network and transmit the image.
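As a rough illustration of the table selection, the sketch below keeps a group of quantization tables ordered by visual degradation and picks one by the designated error amount. The description notes that the error amount normally increases monotonically with the degree of visual degradation, which is what makes a binary search possible. The table contents, the error values, and the rule of taking the first table whose error reaches the target are all hypothetical.

```python
from bisect import bisect_left

# Hypothetical table group, ordered by visual degradation (index 0 = least).
# "discard" holds the number of bit planes dropped per sub-band; high-frequency
# sub-bands lose bits first so that low-frequency content is favoured.
TABLES = [
    {"discard": {"1HH": 1, "1HL": 0, "1LH": 0, "2HH": 0}, "error": 1.0},
    {"discard": {"1HH": 2, "1HL": 1, "1LH": 1, "2HH": 0}, "error": 2.5},
    {"discard": {"1HH": 3, "1HL": 2, "1LH": 2, "2HH": 1}, "error": 4.0},
    {"discard": {"1HH": 4, "1HL": 3, "1LH": 3, "2HH": 2}, "error": 6.0},
]


def select_table(target_error, tables=TABLES):
    """Return (index, table) of the first table whose error reaches the target.

    Relies on the errors being sorted in increasing order; falls back to the
    most degraded table when the target exceeds every entry.
    """
    errors = [t["error"] for t in tables]
    index = min(bisect_left(errors, target_error), len(tables) - 1)
    return index, tables[index]
```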
[0020]
According to a fifth aspect of the present invention, in the image transmitting apparatus according to any one of the first to fourth aspects, the code string conversion unit performs the creation according to a designated frame rate of the image.
[0021]
Therefore, since the scalability is dynamically changed according to not only the traffic of the network but also the frame rate, it is possible to prevent image deterioration.
[0022]
According to a sixth aspect of the present invention, the image transmitting apparatus according to any one of the first to fifth aspects further includes a motion amount detecting unit that detects the amount of motion of the image in the code string before creation, and the code string conversion unit performs the creation according to the detected amount of motion.
[0023]
Therefore, the scalability is dynamically changed according to not only the traffic of the network but also the amount of motion of the image, so that the deterioration of the image can be prevented.
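The motion judgment in the embodiment (paragraph 0061) divides the summed lossless code amount of the 1LH sub-band by that of the 1HL sub-band and compares the result with a threshold th1. Paragraph 0062 then states that the interlace comb is preserved only when the frame rate is high and the motion is large. A minimal sketch of both rules follows; the function names and the threshold value used in the test are assumptions.

```python
def motion_is_large(code_sizes_1lh, code_sizes_1hl, th1):
    """Judge motion as large when sum(1LH) / sum(1HL) exceeds th1."""
    sum_1lh = sum(code_sizes_1lh)
    sum_1hl = sum(code_sizes_1hl)
    if sum_1hl == 0:
        raise ValueError("1HL code amount is zero; ratio undefined")
    speed = sum_1lh / sum_1hl
    return speed > th1


def preserve_comb(frame_rate_high, motion_large):
    """Keep the interlace comb only at a high frame rate with large motion."""
    return frame_rate_high and motion_large
```

Normalizing by the 1HL code amount follows the description's remark that a threshold on the 1LH code amount alone could vary from image to image.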
[0024]
According to a seventh aspect of the present invention, there is provided a network system comprising the image transmitting apparatus according to any one of the first to sixth aspects and a plurality of image receiving apparatuses that receive the code string transmitted by the image transmitting apparatus via a network.
[0025]
Therefore, the same operation and effect as the invention according to any one of claims 1 to 6 can be obtained.
[0026]
According to an eighth aspect of the present invention, there is provided a computer-readable program for executing the function of each of the means according to any one of the first to sixth aspects.
[0027]
Therefore, the same operation and effect as the invention according to any one of claims 1 to 6 can be obtained.
[0028]
According to a ninth aspect of the present invention, there is provided a storage medium storing the program according to the eighth aspect.
[0029]
Therefore, the same operation and effect as the invention described in claim 8 can be obtained.
[0030]
BEST MODE FOR CARRYING OUT THE INVENTION
[Overview of JPEG2000 algorithm]
First, an outline of a JPEG2000 algorithm which is a prerequisite technique in an embodiment of the present invention will be described.
[0031]
FIG. 1 is an explanatory diagram illustrating the basics of the JPEG2000 algorithm. The JPEG2000 algorithm consists of a color space conversion / inverse conversion unit 111, a two-dimensional wavelet conversion / inverse conversion unit 112, a quantization / inverse quantization unit 113, an entropy encoding / decoding unit 114, and a tag processing unit 115.
[0032]
As shown in FIG. 2, in a color image, each component of the original image (here, the RGB primary color system) is generally divided by rectangular regions (tiles) 123122, 123. The individual tiles, for example R00, R01, ..., R15 / G00, G01, ..., G15 / B00, B01, ..., B15, serve as the basic units when the compression and decompression process is executed. The compression and decompression operations are therefore performed independently for each component and for each tile.
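The tile division can be pictured with a small sketch that cuts one component, held as a list of rows, into tiles in raster order. The function name and the list-of-rows representation are assumptions chosen for brevity; a real codec would work on image buffers.

```python
def split_into_tiles(component, tile_h, tile_w):
    """Split a 2-D component (list of rows) into tiles, in raster order."""
    height = len(component)
    width = len(component[0])
    tiles = []
    for top in range(0, height, tile_h):
        for left in range(0, width, tile_w):
            # Each tile is an independent sub-image of the component
            tile = [row[left:left + tile_w]
                    for row in component[top:top + tile_h]]
            tiles.append(tile)
    return tiles


# An 8 x 8 component cut into sixteen 2 x 2 tiles, like R00 ... R15
component = [[y * 8 + x for x in range(8)] for y in range(8)]
tiles = split_into_tiles(component, 2, 2)
```

Because each tile is a separate sub-image, every later stage can run on it without reference to its neighbours.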
[0033]
When image data is encoded, the data of each tile of each component is input to the color space conversion / inverse conversion unit 111 of FIG. 1 and subjected to color space conversion, after which the two-dimensional wavelet conversion / inverse conversion unit 112 applies a two-dimensional wavelet transform (forward transform) to divide the data spatially into frequency bands.
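For reference, one level of a one-dimensional wavelet transform can be sketched with the reversible 5/3 lifting steps, the filter this description later names together with mirroring at the boundaries. The sketch handles even-length integer signals only and is not a two-dimensional implementation; a two-dimensional transform would apply the same steps along rows and then columns. The function names are assumptions.

```python
def forward_53(x):
    """One level of the reversible 5/3 lifting transform (even-length ints)."""
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("expects an even number of samples (at least 2)")
    half = n // 2
    # Predict step: high-pass values from the odd samples.
    # Mirroring at the right edge maps x[n] to x[n - 2].
    high = []
    for i in range(half):
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        high.append(x[2 * i + 1] - (x[2 * i] + right) // 2)
    # Update step: low-pass values from the even samples.
    # Mirroring at the left edge maps high[-1] to high[0].
    low = []
    for i in range(half):
        left = high[i - 1] if i > 0 else high[0]
        low.append(x[2 * i] + (left + high[i] + 2) // 4)
    return low, high


def inverse_53(low, high):
    """Undo forward_53 exactly by running the lifting steps in reverse."""
    half = len(low)
    n = 2 * half
    x = [0] * n
    for i in range(half):
        left = high[i - 1] if i > 0 else high[0]
        x[2 * i] = low[i] - (left + high[i] + 2) // 4
    for i in range(half):
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        x[2 * i + 1] = high[i] + (x[2 * i] + right) // 2
    return x
```

Since each lifting step is undone exactly in integer arithmetic, `inverse_53(*forward_53(x))` returns `x` unchanged, which is what makes the filter reversible.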
[0034]
FIG. 3 shows the sub-bands at each decomposition level when the number of decomposition levels is three. A two-dimensional wavelet transform is applied to the tile original image (0LL) (decomposition level 0 (131)) obtained by dividing the original image into tiles, separating it into the sub-bands (1LL, 1HL, 1LH, 1HH) shown at decomposition level 1 (132). A two-dimensional wavelet transform is then applied to the low-frequency component 1LL of this level, separating it into the sub-bands (2LL, 2HL, 2LH, 2HH) shown at decomposition level 2 (133). In the same way, a two-dimensional wavelet transform is applied to the low-frequency component 2LL, separating it into the sub-bands (3LL, 3HL, 3LH, 3HH) shown at decomposition level 3 (134). In FIG. 3, the sub-bands to be encoded at each decomposition level are hatched. For example, when the number of decomposition levels is three, the hatched sub-bands (3HL, 3LH, 3HH, 2HL, 2LH, 2HH, 1HL, 1LH, 1HH) are encoded, and the 3LL sub-band is not encoded.
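The recursive splitting of the LL band can be summarized by listing the sub-bands and their sizes for a given tile. This is only a bookkeeping sketch that assumes the tile dimensions are divisible by 2 to the power of the number of levels; the function name is hypothetical.

```python
def subbands(tile_w, tile_h, levels):
    """List (name, width, height) of the sub-bands left after `levels` splits.

    Assumption: tile_w and tile_h are divisible by 2 ** levels.
    """
    bands = []
    w, h = tile_w, tile_h
    for level in range(1, levels + 1):
        # Each split halves both dimensions of the current LL band
        w, h = w // 2, h // 2
        for orientation in ("HL", "LH", "HH"):
            bands.append((f"{level}{orientation}", w, h))
    # Only the LL band of the deepest level remains
    bands.append((f"{levels}LL", w, h))
    return bands


bands = subbands(16, 16, 3)
```

For a 16 × 16 tile and three levels this yields ten sub-bands whose areas add up to the 256 samples of the tile.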
[0035]
Next, bits to be encoded are determined in the specified encoding order, and the quantization / inverse quantization unit 113 in FIG. 1 generates a context from bits around the target bit. The wavelet coefficients after the quantization process are divided into non-overlapping rectangles called “precincts” for each subband. This was introduced to make efficient use of memory in the implementation. As shown in FIG. 5, one precinct is formed of three spatially coincident rectangular areas. Further, each precinct is divided into non-overlapping rectangular "code blocks". This is a basic unit when performing entropy coding.
[0036]
The coefficient values after the wavelet transform can be quantized and encoded as they are, but to improve coding efficiency JPEG2000 can decompose the coefficient values into "bit planes" and rank the bit planes for each pixel or code block. FIG. 6 outlines the procedure. In this example, the original image (32 × 32 pixels) is divided into four tiles of 16 × 16 pixels, and the precincts and code blocks at decomposition level 1 measure 8 × 8 pixels and 4 × 4 pixels, respectively. Precincts and code blocks are numbered in raster order. Pixels outside the tile boundary are extended by the mirroring method, the wavelet transform is performed with the reversible (5 × 3) filter, and the wavelet coefficient values of decomposition level 1 are obtained. FIG. 6 also shows a conceptual diagram of representative "layers" for tile 0 / precinct 3 / code block 3. The layer structure is easy to understand when the wavelet coefficient values are viewed from the side (the bit plane direction). One layer consists of an arbitrary number of bit planes. In this example, layers 0, 1, 2, and 3 consist of 1, 3, 1, and 3 bit planes, respectively. Layers containing bit planes closer to the LSB are quantized first, and conversely, layers closer to the MSB remain unquantized until the end. Discarding layers starting from those closest to the LSB is called truncation and allows the quantization rate to be controlled finely.
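Truncation can be illustrated on coefficient magnitudes alone. The sketch below splits magnitudes into bit planes, groups the planes into layers, and zeroes the planes of the layers nearest the LSB. The grouping of eight planes into layers of 1, 3, 1, and 3 planes is used here as an illustrative assumption, signs and entropy coding are ignored, and the function names are hypothetical.

```python
def to_bit_planes(magnitudes, num_planes):
    """Split magnitudes into bit planes; planes[0] is the MSB plane."""
    return [[(m >> p) & 1 for m in magnitudes]
            for p in range(num_planes - 1, -1, -1)]


def from_bit_planes(planes):
    """Rebuild magnitudes from bit planes ordered MSB first."""
    values = [0] * len(planes[0])
    for plane in planes:
        values = [(v << 1) | bit for v, bit in zip(values, plane)]
    return values


def truncate(magnitudes, layer_sizes, layers_dropped):
    """Discard the `layers_dropped` layers nearest the LSB (zero their planes)."""
    num_planes = sum(layer_sizes)
    dropped = sum(layer_sizes[len(layer_sizes) - layers_dropped:])
    planes = to_bit_planes(magnitudes, num_planes)
    kept = planes[:num_planes - dropped]
    zeros = [[0] * len(magnitudes) for _ in range(dropped)]
    return from_bit_planes(kept + zeros)


LAYERS = (1, 3, 1, 3)           # bit planes per layer, MSB side first
coeffs = [200, 37, 255, 8]      # 8-bit magnitudes
coarse = truncate(coeffs, LAYERS, 1)
```

Dropping more layers removes more low-order planes, so the reconstruction error grows step by step while the amount of code to send shrinks.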
[0037]
The entropy encoding / decoding unit 114 (see FIG. 1) encodes each component tile by probability estimation from the context and the target bit. In this way, the encoding process is performed for all the components of the original image in tile units. Finally, the tag processing unit 115 combines all encoded data from the entropy coder unit into one code stream, and performs a process of adding a tag to the code stream. FIG. 4 briefly shows the structure of the code stream. As shown in FIG. 4, tag information called a header is added to the head of the code stream and the head of the partial tiles constituting each tile, followed by the encoded data of each tile. Then, the tag is placed again at the end of the code stream.
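The layout described here, a leading header, a header at the start of each tile-part followed by its coded data, and a closing tag, can be mimicked with a toy byte stream. The marker codes below (SOC, SOT, SOD, EOC) and the SOT field layout are taken from the JPEG 2000 Part 1 codestream syntax as I understand it; the specification itself does not name them. The sketch deliberately omits the mandatory main-header segments such as SIZ, COD, and QCD, so its output is not a decodable code stream.

```python
import struct

# Marker codes from JPEG 2000 Part 1 (not named in this specification)
SOC, SOT, SOD, EOC = 0xFF4F, 0xFF90, 0xFF93, 0xFFD9


def tile_part(tile_index, data, part_index=0, num_parts=1):
    """Build one tile-part: SOT segment, SOD marker, then the coded data."""
    # Psot counts every byte from the SOT marker to the end of the data:
    # 12 bytes of SOT marker and segment + 2 bytes of SOD + the data itself
    psot = 12 + 2 + len(data)
    header = struct.pack(">HHHIBB", SOT, 10, tile_index, psot,
                         part_index, num_parts)
    return header + struct.pack(">H", SOD) + data


def build_stream(tile_payloads):
    """Toy code stream: SOC, one tile-part per payload, EOC."""
    out = struct.pack(">H", SOC)  # a real main header would follow here
    for index, data in enumerate(tile_payloads):
        out += tile_part(index, data)
    return out + struct.pack(">H", EOC)


def walk(stream):
    """Return [(tile_index, data)] by following the Psot lengths."""
    if struct.unpack_from(">H", stream, 0)[0] != SOC:
        raise ValueError("missing SOC marker")
    pos, parts = 2, []
    while struct.unpack_from(">H", stream, pos)[0] == SOT:
        _, _, tile_index, psot, _, _ = struct.unpack_from(">HHHIBB", stream, pos)
        parts.append((tile_index, stream[pos + 14:pos + psot]))
        pos += psot
    if struct.unpack_from(">H", stream, pos)[0] != EOC:
        raise ValueError("missing EOC marker")
    return parts
```

Because each tile-part header records its own length, a server can skip over or rewrite individual tile-parts without decoding the image data.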
[0038]
At decoding time, conversely, image data is generated from the code stream of each tile of each component. This is briefly described with reference to FIG. 1. The tag processing unit 115 interprets the tag information attached to the code stream input from outside and decomposes the code stream into the code streams of each tile of each component, and decoding is performed on each of those code streams. The positions of the bits to be decoded are determined in the order based on the tag information in the code stream, and the quantization / inverse quantization unit 113 generates a context from the arrangement of the bits surrounding each target bit position (bits that have already been decoded). The entropy encoding / decoding unit 114 decodes by probability estimation from this context and the code stream, generates the target bit, and writes it to the target bit position. Because the data decoded in this way is spatially divided by frequency band, the two-dimensional wavelet conversion / inverse conversion unit 112 applies an inverse two-dimensional wavelet transform to it, restoring each tile of each component of the image data. The restored data is converted back to data of the original color system by the color space conversion / inverse conversion unit 111.
[0039]
[Embodiment of the invention]
An embodiment of the present invention will be described.
[0040]
FIG. 7 is a block diagram illustrating the network system 10 according to the first embodiment. As shown in FIG. 7, the network system 10 consists of a server 1, which transmits a code string obtained by compressing and encoding moving image data using an algorithm such as Motion JPEG2000 via a network 3 such as the Internet, and a client 2, which receives the code string from the server 1. A plurality of clients 2 can be connected simultaneously (clients 23n (n = 1, 2, ...) shown in FIG. 9).
[0041]
FIG. 8 is a block diagram showing the electrical connection of the server 1 and the client 2. As shown in FIG. 8, the server 1 and the client 2 implement the image transmitting apparatus and the image receiving apparatus of the present invention, respectively. In each of them, a CPU 11, which performs various operations and centrally controls each unit of the server 1 (or the client 2), and a memory 12 comprising various ROMs and RAMs are connected by a bus 13.
[0042]
Connected to the bus 13 are a magnetic storage device 14 such as a hard disk serving as a storage device, an input device 15 including a mouse and a keyboard, a display device 16 such as an LCD or a CRT, a storage medium reading device 18 that reads a storage medium 17, such as an optical disk, which implements the storage medium of the present invention, and a predetermined communication interface 19, which is a communication device that communicates with the network 3. As the storage medium 17, various types of media can be used, such as optical disks (CDs, DVDs and the like), magneto-optical disks, and flexible disks. As the storage medium reading device 18, specifically, an optical disk drive, a magneto-optical disk drive, a flexible disk drive, or the like is used according to the type of the storage medium 17.
[0043]
The magnetic storage device 14 stores an image transmission program (or an image reception program) that implements the program of the present invention. In general, the image transmission program (or the image reception program) is installed on the magnetic storage device 14 of the server 1 (or the client 2) either by being read by the storage medium reading device 18 from the storage medium 17 that implements the storage medium of the present invention, or by being downloaded from the network 3 or the like. With this installation, the server 1 and the client 2 become operable. The image transmission program and the image reception program may each be a part of specific application software, and may operate on a predetermined OS.
[0044]
FIG. 9 is a functional block diagram illustrating processing executed by the network system 10 based on an image transmission program, an image reception program, and the like.
[0045]
First, processing performed by the client 2 based on an image receiving program or the like will be described. In each of the clients 2 (clients 231 to 23n), the transmission / reception means 31, which serves as a transmission unit and a reception unit, receives from the server 1 via the communication interface 19 a code string (described later) obtained by compressing and encoding an image. After the code string is stored in the buffer 32 (magnetic storage device 14), it is decoded by the decoding means 33 and displayed on the display device 16 by the display means 34.
[0046]
In addition, the data read amount detection means 27 monitors the amount of data of the code string read by the transmission / reception means 31 per unit time, and thereby determines the degree of congestion of the traffic on the network 3. This information (monitoring information) is returned to the server 1 from the transmission / reception means 31 of each client 2.
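As an illustration only, the per-unit-time read amount could be measured along the following lines. The class name `ReadRateMonitor` and its methods are hypothetical and do not appear in the specification; the sketch assumes the caller supplies the length of the measurement interval.

```python
class ReadRateMonitor:
    """Accumulates the size of received code strings and reports bits per second.

    Hypothetical helper: the specification only states that the amount of
    data read per unit time is monitored, not how.
    """

    def __init__(self):
        self._bits = 0

    def record(self, num_bytes):
        """Add one received chunk of the code string (size in bytes)."""
        if num_bytes < 0:
            raise ValueError("num_bytes must be non-negative")
        self._bits += num_bytes * 8

    def rate_bps(self, elapsed_seconds):
        """Return the read amount per second for the interval, then reset."""
        if elapsed_seconds <= 0:
            raise ValueError("elapsed_seconds must be positive")
        rate = self._bits / elapsed_seconds
        self._bits = 0
        return rate


monitor = ReadRateMonitor()
monitor.record(125_000)  # two chunks of 125,000 bytes each
monitor.record(125_000)
measured_rate = monitor.rate_bps(1.0)
print(measured_rate)
```

The value returned by `rate_bps` would correspond to the monitoring information that each client sends back to the server.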
[0047]
Next, processing performed by the server 1 based on an image transmission program or the like will be described. The buffer 21 (magnetic storage device 14) of the server 1 stores, for example, a code string obtained by compressing and encoding moving image data with the Motion JPEG2000 algorithm. That is, each frame is divided into one or more small areas (tiles) and is compression-encoded hierarchically for each tile. The server 1 performs streaming distribution of the code string in response to a request from each of the clients 201 to 20n. Specifically, the syntax analysis means 22 analyzes the syntax of the code string from the header information of the code string stored in the buffer 21. Based on the result of the analysis, the individual code string creation means 231 to 23n of the code string creation means 23 convert the code string into a new code string according to the Motion JPEG2000 algorithm (the details will be described later). The transmission / reception means 24, which serves as a transmission unit and a reception unit, then sends the converted code string to the client 2 via the communication interface 19 and the network 3.
[0048]
The integrating means 25 integrates the monitoring information received from each of the clients 201 to 20n, comprehensively determines the traffic congestion state of the network 3, and outputs the result (integrated information). This integration process is performed, for example, as follows.
[0049]
Let An (unit: bps) be the amount of data read per second by each of the clients 201 to 20n, where n is the number assigned to each client. The communication line utilization rate C is then
C = ΣAn / B
Here, B is the total (unit: bps) of the data transfer capabilities of all the lines, that is, the maximum data distribution capability of the server 1. The utilization rate C of the communication line is the content of the integrated information.
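The integration above can be sketched in a few lines. The function name `line_utilization` and the sample figures are illustrative assumptions, not values from the specification.

```python
def line_utilization(read_rates_bps, capacity_bps):
    """Return C = (sum of An) / B.

    read_rates_bps: per-client read amounts An in bps.
    capacity_bps:   total transfer capability B of all lines in bps.
    """
    if capacity_bps <= 0:
        raise ValueError("capacity_bps must be positive")
    return sum(read_rates_bps) / capacity_bps


# Three hypothetical clients reading 2, 3 and 1 Mbps from a server
# whose maximum distribution capability is assumed to be 10 Mbps.
rates = [2_000_000, 3_000_000, 1_000_000]
utilization = line_utilization(rates, 10_000_000)
print(utilization)
```

With these assumed figures, the total read amount is 6 Mbps against a capacity of 10 Mbps, so C is 0.6.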
[0050]
The parameter creating means 26, which serves as error amount designating means, creates, from the integrated information created by the integrating means 25, the parameters used when each of the code string creating means 231 to 23n creates a code string. Specifically, the parameter specifies the error amount of data between the code string before creation and the code string after creation when each of the code string creation means 231 to 23n creates a code string. Alternatively, the parameter specifies the frame rate of the image in the code string created by the code string creation means 231 to 23n.
[0051]
More specifically, the parameter is determined, for example, as follows. A threshold value is provided for the integrated information produced by the integrating means 25. When the integrated information is lower than the threshold value, the code string is created so as to increase the data amount per unit time when the code string is transmitted. That is, the parameters are created so that the frame rate of the image in the created code string is increased, or so that the error amount is reduced and the quantization of the image is made finer.
[0052]
When the integrated information exceeds the threshold value, the load on the code string creating unit 23 is large, so that the code string is created so as to reduce the amount of data per unit time at the time of transmission. That is, the parameters are created so that the frame rate is reduced or the error amount is increased to coarsen the quantization level of the image.
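The threshold rule of the two preceding paragraphs might be sketched as follows. The specification does not say how far the error amount is moved or what the threshold is, so the one-stage step, the threshold of 0.8, the stage range 1 to 5, and the name `next_error_stage` are all assumptions made for illustration.

```python
def next_error_stage(utilization, current_stage,
                     threshold=0.8, min_stage=1, max_stage=5):
    """Adjust the error-amount stage from the line utilization rate C.

    Above the threshold the error amount is raised (coarser quantization,
    less data per unit time); below it the error amount is lowered.
    The case of equality is not specified in the text, so the stage is
    left unchanged here.
    """
    if utilization > threshold:
        return min(current_stage + 1, max_stage)
    if utilization < threshold:
        return max(current_stage - 1, min_stage)
    return current_stage


print(next_error_stage(0.9, 3))  # congested: error amount raised
print(next_error_stage(0.5, 3))  # uncongested: error amount lowered
```

The same rule could drive the frame rate instead of the error amount, with the direction reversed (lower frame rate when congested).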
[0053]
Hereinafter, a case where the above-described error amount is used as a parameter will be described. FIG. 10 is a block diagram showing an example of a configuration of each of the code string creation units 231 to 23n. The parameters created by the parameter creation means 26 are input to the quantization table selection means 41. The header information of the code string stored in the buffer 21 is decoded by the syntax analysis means 22 (FIG. 9). Then, based on the header information, the error amount of the data with respect to the original data when each code constituting the code string is partially discarded is input to the quantization table selecting means 41. The quantization table selection unit 41 selects predetermined table data (quantization table) based on the input error amount and the parameter, and sends the selected table data to a quantization unit 42 that is a code string conversion unit. The quantization means 42 selectively discards the code from the code sequence according to the data of the quantization table, and rewrites the header to generate a new code sequence. As described above, since a new code string can be generated in each of the code string generation units 231 to 23n, a plurality of (n) processes for generating a new code string can be executed simultaneously and in parallel.
[0054]
In the quantization table, the number of quantization bits (code discard amount) for each of the wavelet transform coefficients exemplified in FIG. 11 is recorded as shown in FIG. As shown in FIG. 13, the quantization tables are held as a quantization table group in which a plurality of quantization tables (sets of quantization bit numbers), each associated with an Index, are arranged in order of the degree of visual deterioration after quantization. The quantization table selection means 41 selects the quantization table whose Index is such that, when that table is applied, the error amount produced by partially discarding codes from the code string equals the error amount specified by the parameter.
[0055]
Therefore, the parameter creation means 26 may evaluate the integrated information stepwise (or steplessly) and, when evaluating it stepwise, specify the error amount as a number of quantization bits according to the stage. For example, if the integrated information is evaluated in five stages, with stage 5 being the case where the network 3 is most congested, stages 1 to 5 of the integrated information may be made to correspond to quantization bit numbers 1 to 5, respectively, as the error amounts. Since the degree of visual deterioration after quantization and the above-described error amount usually have a monotonically increasing relationship, it is easy to select a quantization table having the error amount specified by the parameter. In general, it is desirable that the quantization table be configured to preserve the low-frequency components in preference to the high-frequency components so that deterioration is not visually noticeable.
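A minimal sketch of the five-stage example follows. It assumes uniform stage boundaries over a utilization rate in the range 0 to 1 and a table group whose error amounts are already sorted in increasing order (the monotonic relationship noted above); `congestion_stage`, `select_table_index`, and the sample error amounts are hypothetical.

```python
from bisect import bisect_left


def congestion_stage(utilization, stages=5):
    """Map a utilization rate to a stage 1..stages (uniform bins, an assumption)."""
    clamped = min(max(utilization, 0.0), 1.0)
    return min(int(clamped * stages) + 1, stages)


def select_table_index(table_errors, target_error):
    """Return the first Index whose error amount reaches target_error.

    table_errors must be sorted in increasing order, which is what lets
    a binary search be used. If no table reaches the target, the last
    (coarsest) table is returned.
    """
    idx = bisect_left(table_errors, target_error)
    return min(idx, len(table_errors) - 1)


# Hypothetical table group: the table at Index i has error amount
# (number of quantization bits) table_errors[i].
table_errors = [1, 2, 3, 4, 5]
stage = congestion_stage(0.65)
index = select_table_index(table_errors, stage)
print(stage, index)
```

Here a utilization rate of 0.65 falls in stage 4, which selects the table at Index 3, whose error amount is 4 quantization bits.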
[0056]
FIG. 14 is a functional block diagram of another configuration example of each of the code string creation units 231 to 23n. The header information of the code string stored in the buffer 21 is decoded by the syntax analysis means 22 (FIG. 9), and the error amount of the data with respect to the original data when each code is partially discarded is obtained from the header information and input to the quantization table selection means 41. The motion amount detection means 43 detects the motion amount of the image based on the code amount of each wavelet transform coefficient analyzed by the syntax analysis means 22. The quantization table selection means 41 selects a quantization table in the same manner as described above, based on the input error amount, the frame rate specified by the client 2 (this example assumes that the client 2 can transmit a frame rate specification to the server 1), and the motion amount detected by the motion amount detecting means 43, and sends it to the quantization means 42. The quantization means 42 generates a new code string by partially discarding codes from the code string and rewriting the header, based on the data of the quantization table and the frame rate specified by the client 2.
[0057]
The selection of a table from the plurality of quantization tables is the same as in the example described with reference to FIGS. 11 to 13, except that the quantization table selection means 41 holds a plurality of quantization table groups and selects and applies a predetermined table group according to the frame rate and the amount of motion.
[0058]
The motion amount detecting means 43 detects the motion amount of the image as follows. FIG. 15 is an explanatory diagram of the concept of the amount of motion of an image in the Motion JPEG2000 system. In an interlaced image, a fast-moving image has long horizontal edges, as shown in FIG. 15A (referred to as the interlace comb shape). A slow-moving image, on the other hand, has short horizontal edges, as also shown in FIG. 15. FIG. 15B shows the intermediate case of medium-speed motion. These differences appear strongly in the 1LH component, which represents the horizontal edge amount of the high-frequency component. That is, in an image with a large amount of motion, the absolute values of the coefficients of the 1LH component increase, and as a result the code amount of the 1LH component increases. However, if the motion amount of the image is determined from the code amount of the 1LH component alone, the appropriate threshold value may vary from image to image. Therefore, the code amount of the 1LH component may be normalized by the code amount of the 1HL component, and the resulting value used as the feature amount for detecting the motion amount.
[0059]
Further, since the above-mentioned feature appears most strongly in the code amount before the bit planes are truncated (post-quantization), a configuration is also possible in which the 1LH and 1HL code amounts before bit-plane truncation are written into the code and the motion amount is estimated using those values. This is particularly effective when the image compression ratio is high.
[0060]
FIG. 16 is a flowchart illustrating an example of a process of determining the amount of motion of an image in this case.
[0061]
As shown in FIG. 16, first, the sum (sum1LH) of the lossless code amounts of 1LH is calculated (step S1), and the sum (sum1HL) of the lossless code amounts of 1HL is calculated (step S2). Then "sum1LH" is divided by "sum1HL" (step S3), and the result (speed) is compared with a predetermined threshold (th1). If "speed > th1" (Y in step S3), it is determined that the amount of motion of the image is large (step S4). If "speed ≦ th1" (N in step S3), it is determined that the amount of motion of the image is small (step S5).
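The flow of FIG. 16 can be sketched as below. The function name `classify_motion`, the list-of-code-amounts input, and the threshold value are assumptions; the text does not say what happens when the 1HL sum is zero, so this sketch simply rejects that case.

```python
def classify_motion(lh1_code_amounts, hl1_code_amounts, th1):
    """Classify the motion amount as "large" or "small".

    lh1_code_amounts / hl1_code_amounts: lossless code amounts of the
    1LH and 1HL sub-bands (for example, one value per code block).
    th1: predetermined threshold.
    """
    sum_1lh = sum(lh1_code_amounts)      # step S1
    sum_1hl = sum(hl1_code_amounts)      # step S2
    if sum_1hl == 0:
        # Not covered by the specification; treated as an error here.
        raise ValueError("1HL code amount is zero; cannot normalize")
    speed = sum_1lh / sum_1hl            # step S3: normalize 1LH by 1HL
    return "large" if speed > th1 else "small"   # step S4 / step S5


# Hypothetical code amounts in bytes and a hypothetical threshold of 1.5.
print(classify_motion([300, 500], [200, 200], th1=1.5))
print(classify_motion([100, 100], [200, 200], th1=1.5))
```

In the first call the ratio is 800 / 400 = 2.0, which exceeds the threshold; in the second it is 200 / 400 = 0.5, which does not.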
[0062]
Further, in the case of a fast-moving image, the degree of visual deterioration can be suppressed by performing quantization so as to leave the above-mentioned comb shape. However, when the frame rate is reduced by thinning out frames, each frame is displayed for longer, and as a result the comb shape becomes conspicuous. Therefore, when the frame rate is reduced, quantization must be performed so that the comb shape does not remain even in a fast-moving image. Table 1 shows the combinations. That is, when the frame rate is high, the comb shape is preserved when the motion amount is large and is not preserved when the motion amount is small. When the frame rate is low, the comb shape is not preserved regardless of the amount of motion.
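The combinations described above reduce to a single condition, which the following sketch spells out. The function name `preserve_comb` and the boolean inputs are illustrative; how "high" and "large" are decided is left to the frame rate specification and the motion amount detection.

```python
def preserve_comb(frame_rate_high, motion_large):
    """Return True when quantization should leave the interlace comb shape.

    The comb shape is kept only when the frame rate is high and the
    motion amount is large; in every other combination it is removed.
    """
    return frame_rate_high and motion_large


# Enumerate all four combinations of frame rate and motion amount.
for rate_high in (True, False):
    for motion in (True, False):
        print(rate_high, motion, preserve_comb(rate_high, motion))
```

A quantization table group could then be chosen per result, one group that keeps the 1LH detail and one that discards it.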
[0063]
[Table 1]

[0064]
The processing of the client 2 and the server 1 described above will now be described with reference to flowcharts. First, as shown in FIG. 17, when a code string is received from the server 1 (Y in step S11), the client 2 detects the amount of the code string read per unit time with the data read amount detecting means 27 (step S12), the error amount creating means 28 creates an error amount parameter from the detected value (step S13), and the parameter is transmitted to the server 1 (step S14).
[0065]
As shown in FIG. 18, when there is a code string to be transmitted (Y in step S21), the server 1 integrates the error amounts received from the clients 2 (step S22), selects a quantization table based on the integrated information as described above (step S23), creates a new code string according to the quantization table (step S24), and transmits the created code string (step S25).
[0066]
As described above, according to the present network system 10, when the traffic of the network 3 becomes congested, the error amount is increased and the data amount of the code string to be transmitted is reduced, so that images can be transmitted while the scalability is dynamically changed according to the traffic of the network 3.
[0067]
Further, according to the configuration example described with reference to FIG. 14 and subsequent figures, the scalability can be dynamically changed according to not only the traffic of the network 3 but also the frame rate and the amount of motion of the image.
[0068]
[Effects of the invention]
According to the first aspect of the invention, when network traffic becomes congested, the error amount is increased to reduce the data amount of the code string to be transmitted, so that images can be transmitted while the scalability is dynamically changed according to the network traffic.
[0069]
According to a second aspect of the present invention, in the first aspect, images can be transmitted while the scalability is dynamically changed according to the network traffic by using the error amount designated by the transmission destination according to the network traffic.
[0070]
According to a third aspect of the present invention, in the first or second aspect, data representing the error amount caused by discarding codes of the code string, which is included in the code string before creation, can be read from header information or the like to create a new code string, so that images can be transmitted while the scalability is dynamically changed according to the network traffic.
[0071]
According to the invention of claim 4, in the invention according to any one of claims 1 to 3, images can be transmitted while the scalability is dynamically changed according to the network traffic by adjusting the wavelet transform coefficients.
[0072]
According to the invention of claim 5, in the invention according to any one of claims 1 to 4, the scalability is dynamically changed according to not only the network traffic but also the frame rate, so that deterioration can be prevented.
[0073]
According to the invention of claim 6, in the invention according to any one of claims 1 to 4, the scalability is dynamically changed according to not only the network traffic but also the amount of motion of the image, so that deterioration can be prevented.
[0074]
The inventions according to claims 7 to 9 can provide the same functions and effects as the inventions according to any one of claims 1 to 6.
[Brief description of the drawings]
FIG. 1 is an explanatory diagram for explaining the basics of a JPEG2000 algorithm.
FIG. 2 is an explanatory diagram for describing each component of a color image.
FIG. 3 is an explanatory diagram showing sub-bands at each decomposition level when the number of decomposition levels is three.
FIG. 4 is an explanatory diagram of a structure of a code stream.
FIG. 5 is an explanatory diagram showing that one precinct is formed of three spatially coincident rectangular regions.
FIG. 6 is an explanatory diagram of decomposing a coefficient value in units of bit planes and ranking the bit planes for each pixel or code block.
FIG. 7 is a block diagram of a schematic configuration of a network system according to an embodiment of the present invention.
FIG. 8 is a block diagram of electrical connection between a server and a client.
FIG. 9 is a functional block diagram of a network system.
FIG. 10 is a functional block diagram illustrating a code string creating unit.
FIG. 11 is an explanatory diagram of each wavelet transform coefficient.
FIG. 12 is an explanatory diagram showing the relationship between wavelet transform coefficients and the number of quantization bits.
FIG. 13 is an explanatory diagram of a quantization table.
FIG. 14 is a functional block diagram illustrating another example of a code string creating unit.
FIG. 15 is an explanatory diagram of an interlace comb type.
FIG. 16 is a flowchart of a process of determining a motion amount of an image.
FIG. 17 is a flowchart of a process performed by a server.
FIG. 18 is a flowchart of a process performed by a client.
[Explanation of symbols]
1 image transmission device
2 image reception device
3 network
10 network system
17 storage medium
22 syntax analysis unit
24 transmission unit, reception unit
25 integration unit
31 reception unit, transmission unit
35 data read amount detection unit
41 error amount creation unit
42 code string conversion means

Claims

An image transmission device comprising:
syntax analysis means for analyzing the syntax of a code string obtained by dividing each frame of moving image data into one or a plurality of small areas and hierarchically compression-coding each of the small areas;
code string conversion means capable of simultaneously executing, in parallel, a plurality of processes for creating a new code string from the code string based on the analysis result;
transmitting means for transmitting each of the created code strings to each destination via a network;
receiving means for receiving, from each of the destinations, information on the amount of data read per unit time when the transmitted code string is received at that destination;
integrating means for integrating the received information on the data read amounts to detect the traffic congestion state of the network; and
error amount designating means for specifying, according to the result of the integration, an error amount of data between the code string before creation and the new code string, and causing the code string conversion means to perform the creation so as to have the specified error amount.

The image transmitting apparatus according to claim 1, wherein the error amount specifying unit receives the specification of the error amount from a transmission destination of the code string via the network.

The image transmission device according to claim 1, wherein the code string conversion means performs the creation by reading data, included in the code string before creation, that represents the error amount caused by discarding codes of the code string.

The image transmission device according to any one of claims 1 to 3, wherein, when a wavelet transform has been used to create the code string before creation, the code string conversion means selects, according to the error amount specified by the error amount designating means, a table from a plurality of tables in which the number of quantization bits for each wavelet transform coefficient of the code string after the creation is arranged in order of the degree of visual deterioration of the image, and performs the creation based on the data of the selected table.

The image transmission device according to claim 1, wherein the code string conversion unit performs the creation according to a frame rate of a designated image.

The image transmission device according to claim 1, further comprising a motion amount detection unit that detects a motion amount of an image from the code string before creation,
wherein the code string conversion unit performs the creation according to the detected amount of motion.

A network system comprising:
the image transmitting device according to any one of claims 1 to 6; and
a plurality of image receiving devices that receive, via a network, the code strings transmitted by the image transmitting device.

A non-transitory computer-readable program that executes a function of each of the means according to any one of claims 1 to 6.

A storage medium storing the program according to claim 8.