JP4735558B2

JP4735558B2 - Information processing apparatus and information processing method

Info

Publication number: JP4735558B2
Application number: JP2007029460A
Authority: JP
Inventors: 幸広荒戸; 智副古屋野; 知子相田; 耕二山宮
Original assignee: サクサ株式会社
Priority date: 2007-02-08
Filing date: 2007-02-08
Publication date: 2011-07-27
Anticipated expiration: 2027-02-08
Also published as: JP2008199109A

Description

The present invention relates to video compression technology, and more particularly to an information processing apparatus and an information processing method for decoding encoded moving image data.

In recent years, systems have been proposed, such as TV conference systems, in which a plurality of terminal devices can mutually view the moving images that each of them captures. In such a system, each terminal device transmits the moving image it has captured to the other terminal devices and displays the plurality of moving images sent from the other terminal devices on the same screen. This allows the user of each terminal device to hold a conversation while watching the moving images of the users of the other terminal devices.

In such a system, the moving images to be transmitted and received are encoded in order to reduce the communication load. For example, when a video compression technique called H.264 or MPEG-4 (see, for example, Non-Patent Document 1) is used, a moving image is encoded into image data (hereinafter referred to as encoded data) consisting of a plurality of consecutive pictures, which comprise I pictures that are spatially encoded and P pictures or B pictures that are encoded with reference to other pictures. The encoded data can be reproduced as a moving image by being decoded in the terminal device on the receiving side.

In the above system, the encoding of moving images and the decoding of encoded data are performed by software installed in each terminal device. For this reason, there is no operational problem when encoded data is exchanged among a small number of terminal devices, for example three or four. When encoded data is exchanged among many terminal devices, for example about ten, each terminal device must simultaneously decode the many streams of encoded data that arrive at the same time, so the load on the CPU increases and, as a result, the moving images may not be reproduced normally.

To prevent such a situation, methods have conventionally been proposed for reducing the load of the decoding process in the terminal device, such as decoding only I pictures or changing the frame rate (see, for example, Patent Document 1).

Patent Document 1: Japanese Patent Laid-Open No. 9-74548
Non-Patent Document 1: International Telecommunication Union, "ITU-T Recommendation H.264 Advanced video coding for generic audiovisual services", [online], [searched January 23, 2007], Internet <http://www.itu.int/rec/T-REC-H.264/en>

However, decoding only I pictures makes the frame rate during reproduction extremely low, so it is difficult to provide high-quality moving images. If the number of I pictures is increased to address this, the decoded image quality becomes low in a low-bit-rate environment. Improving the image quality then requires raising the bit rate, but because an I picture carries more data than a P picture or a B picture, it occupies much of the network bandwidth and may affect other communications.

Changing the frame rate works well when the frame rate is high, but when the frame rate is low it is difficult to reproduce high-quality moving images.

As described above, it has conventionally been difficult to reduce the computational load on the terminal device while also reproducing high-quality moving images.
Accordingly, an object of the present invention is to provide an information processing apparatus and an information processing method capable of reducing the computational load while improving the image quality of moving images.

To solve the problems described above, an information processing apparatus according to the present invention comprises: selection means that receives encoded data including a plurality of consecutive encoded pictures, determines whether an input picture is an I picture, outputs the picture if it is an I picture, and, if the picture is not an I picture, determines whether a predetermined other picture input before the picture has been decoded and does not output the picture if the other picture has been decoded; and decoding means that decodes the picture output from the selection means, wherein the encoded data has a structure in which a picture that is not an I picture refers not to the immediately preceding picture but to a temporally more distant I picture or to a decoded picture that is not an I picture.

Another information processing apparatus according to the present invention comprises: selection means that receives encoded data including a plurality of consecutive encoded pictures, determines whether an input picture is an I picture, outputs the picture if it is an I picture, and, if the picture is not an I picture, determines whether a predetermined other picture input before the picture has been decoded, does not output the picture if the other picture has been decoded, and, if the other picture has not been decoded, determines whether the reference picture to which the picture should refer has been decoded, outputs the picture if the reference picture has been decoded, and does not output the picture if the reference picture has not been decoded; and decoding means that decodes the picture output from the selection means, wherein the encoded data has a structure in which a picture that is not an I picture refers not to the immediately preceding picture but to a temporally more distant I picture or to a decoded picture that is not an I picture.

An information processing method according to the present invention comprises: a selection step of receiving encoded data including a plurality of consecutive encoded pictures, determining whether an input picture is an I picture, outputting the picture if it is an I picture, and, if the picture is not an I picture, determining whether a predetermined other picture input before the picture has been decoded and not outputting the picture if the other picture has been decoded; and a decoding step of decoding the picture output by the selection step, wherein the encoded data has a structure in which a picture that is not an I picture refers not to the immediately preceding picture but to a temporally more distant I picture or to a decoded picture that is not an I picture.

Another information processing method according to the present invention comprises: a selection step of receiving encoded data including a plurality of consecutive encoded pictures, determining whether an input picture is an I picture, outputting the picture if it is an I picture, and, if the picture is not an I picture, determining whether a predetermined other picture input before the picture has been decoded, not outputting the picture if the other picture has been decoded, and, if the other picture has not been decoded, determining whether the reference picture to which the picture should refer has been decoded, outputting the picture if the reference picture has been decoded, and not outputting the picture if the reference picture has not been decoded; and a decoding step of decoding the picture output by the selection step, wherein the encoded data has a structure in which a picture that is not an I picture refers not to the immediately preceding picture but to a temporally more distant I picture or to a decoded picture that is not an I picture.

According to the present invention, when an input picture is not an I picture and a predetermined other picture input before it has been decoded, the input picture is not decoded, so the computational load of decoding can be reduced. In addition, pictures other than I pictures can also be decoded, so the image quality of the moving image can be improved.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings. In this embodiment, a case where the present invention is applied to a terminal device in a TV conference system is described as an example.

[Configuration of TV conference system]
As shown in FIG. 1, the TV conference system according to the present embodiment includes a server 1 and a plurality of terminal devices 2a to 2n, which are connected to one another by a communication line 3 such as a LAN (Local Area Network) or the Internet. Since the terminal devices 2a to 2n have equivalent configurations, they are hereinafter referred to as the terminal device 2 for convenience.

(Server configuration)
The server 1 performs call control of the terminal devices 2a to 2n constituting the TV conference system and establishes communication between the terminal devices 2.

(Configuration of terminal device)
The terminal device 2 is a known computer used by a user of the TV conference system. As shown in FIG. 2, the terminal device 2 includes an external I/F unit 4, an operation input unit 5, an image processing unit 6, an audio processing unit 7, a storage unit 8, and a control unit 9.

The external I/F unit 4 consists of a communication circuit. It performs data communication with the server 1 and the other terminal devices 2 via the communication line 3 and the like, and exchanges various data with each connected device.

The operation input unit 5 consists of a keyboard, a mouse, and the like. It detects user operations and outputs them to the control unit 9.

The image processing unit 6 consists of an image processing circuit. It inputs the moving image captured by the camera 6a to the control unit 9, outputs the moving image received from the control unit 9 on the monitor 6b, and performs gradation control and the like.

The audio processing unit 7 consists of a signal processing circuit. It encodes the sound input from the microphone 7a into audio data and outputs it to the control unit 9, decodes the audio data received from the control unit 9 and outputs it from the speaker 7b, and provides functions such as volume control and echo cancellation.

The storage unit 8 consists of a storage device such as a memory or a hard disk, and stores control data 8a and a program 8b. The control data 8a is information used to control the terminal device 2. The program 8b is the operating program of the terminal device 2 and is stored in the storage unit 8 in advance. The program 8b may also be stored in the storage unit 8 via a recording medium such as a DVD (Digital Versatile Disk) or CD (Compact Disk), or via a network.

The control unit 9 consists of a microprocessor such as a CPU and its peripheral circuits. By reading the program 8b from the storage unit 8 and executing it, the control unit 9 makes the above hardware and the program 8b cooperate to realize the encoding means 11, the decoding means 12, and the TV conference means 13.

The encoding means 11 encodes, based on the H.264 standard, the plurality of consecutive frames (images) generated by the camera 6a and the image processing unit 6, and generates encoded data consisting of a plurality of consecutive pictures (encoded images). As shown in FIG. 3, the encoding means 11 includes a difference calculation unit 111, a DCT unit 112, a quantization unit 113, an inverse quantization unit 114, an IDCT unit 115, an addition calculation unit 116, a deblocking filter 117, a motion compensation prediction unit 118, a frame memory 119, a motion detection unit 120, an intra prediction unit 121, a switch 122, and an encoded data output unit 123. Details of the image encoding operation of the encoding means 11 configured in this way are described later.

The decoding means 12 decodes the encoded data received from another terminal device 2 via the external I/F unit 4 and generates image data. As shown in FIG. 4, the decoding means 12 includes a selection unit 150, a decoding unit 131, an inverse quantization unit 132, an IDCT unit 133, an addition calculation unit 134, a deblocking filter 135, a frame memory 136, a motion compensation prediction unit 137, an intra prediction unit 138, and a switch 139. When encoded data is exchanged with a plurality of terminal devices 2 at the same time, as in a TV conference system, one decoding means 12 is provided for each remote terminal device 2. Details of the decoding operation of the decoding means 12 are described later.

The TV conference means 13 realizes the TV conference system by exchanging encoded data and audio data with the other terminal devices 2 via the communication line 3 and the external I/F unit 4. Specifically, it first sets the number of decoding means 12 according to the number of terminal devices 2 used for the TV conference, as determined by the call control of the server 1. It causes the encoded data generated by the encoding means 11 and the audio data generated by the audio processing unit 7 to be transmitted to each of the other terminal devices 2. It causes the audio processing unit 7 to output the audio data received from the other terminal devices 2 from the speaker 7b. It also causes the encoded data received from each of the other terminal devices 2 to be decoded by the corresponding decoding means 12, and causes the image processing unit 6 to display the generated frames simultaneously on the monitor 6b, each at a predetermined position and size. For example, as shown in FIG. 5, the moving image from the terminal device 2 of the user who is speaking may be displayed large in the center of the screen, as indicated by symbol x, and the moving images from the terminal devices 2 of the other users may be displayed small, as indicated by symbol y. In this way, the user can converse with the users of the other terminal devices 2 while watching the moving image of each of them.

[Encoding processing operation]
Next, the encoding operation of the encoding means 11 is described with reference to FIG. 3. There are two types of encoding: intra prediction encoding and motion compensation prediction encoding. The switch 122 switches between them. Intra prediction encoding is described first.

(Intra prediction encoding)
Intra prediction encoding basically performs compression encoding within a single frame. Specifically, a frame input from the image processing unit 6 is input to the DCT unit 112 and frequency-transformed, then quantized by the quantization unit 113 and variable-length encoded by the encoded data output unit 123. As a result, an I picture is output.

The residual data output from the quantization unit 113 is inverse-quantized and inverse-frequency-transformed in turn by the inverse quantization unit 114 and the IDCT unit 115, processed by the deblocking filter 117 to reduce the block distortion introduced by encoding, and then stored in the frame memory 119 as reconstructed image data.

(Motion compensation prediction encoding)
Motion compensation prediction encoding basically encodes the change from another frame (the reference frame). Specifically, when a frame is input from the image processing unit 6, the motion detection unit 120 detects a motion vector based on the input frame and the reconstructed image data stored in the frame memory 119. The motion compensation prediction unit 118 generates predicted image data based on the motion vector and the reconstructed image data stored in the frame memory 119. The difference calculation unit 111 takes the difference between the input frame and the predicted image data and generates difference image data. The difference image data is input to the DCT unit 112 and frequency-transformed, and then variable-length encoded by the encoded data output unit 123. As a result, a P picture or a B picture is output.

The residual data output from the quantization unit 113 is inverse-quantized and inverse-frequency-transformed in turn by the inverse quantization unit 114 and the IDCT unit 115, the predicted image data is added to it by the addition calculation unit 116, and the result is processed by the deblocking filter 117 to reduce the block distortion introduced by encoding and then stored in the frame memory 119 as reconstructed image data.
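The local reconstruction loop described above can be sketched in a few lines. This is a minimal illustration rather than the encoder of the embodiment: the function names are hypothetical, the DCT/IDCT and deblocking stages are omitted, and a block is reduced to a flat list of sample values so that only the difference, quantization, inverse quantization, and addition steps remain.

```python
def quantize(values, step):
    """Map residual samples to integer levels (stands in for quantization unit 113)."""
    return [round(v / step) for v in values]


def dequantize(levels, step):
    """Recover approximate residual samples (stands in for inverse quantization unit 114)."""
    return [level * step for level in levels]


def encode_block(current, prediction, step):
    """Return (levels, reconstructed) for one block.

    levels        -- what would go on to variable-length encoding
    reconstructed -- what would be stored in the frame memory as reference data
    """
    # Difference calculation (unit 111): residual between input and prediction.
    residual = [c - p for c, p in zip(current, prediction)]
    levels = quantize(residual, step)
    # Local decode: the encoder rebuilds the block from the quantized levels.
    recon_residual = dequantize(levels, step)
    # Addition calculation (unit 116): prediction plus reconstructed residual.
    reconstructed = [p + r for p, r in zip(prediction, recon_residual)]
    return levels, reconstructed


prediction = [98, 100, 100, 96]
current = [101, 104, 95, 97]
levels, reconstructed = encode_block(current, prediction, step=4)
print(levels)
print(reconstructed)
```

Storing the reconstructed block rather than the original input keeps the encoder's reference data identical to what a decoder can rebuild from the same levels, so quantization error does not accumulate differently on the two sides.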

Encoded data generated based on the H.264 standard as described above generally has a structure called a NAL unit. As shown in FIG. 6, the NAL unit consists of a NAL header and an RBSP (Raw Byte Sequence Payload). The NAL header consists of a fixed bit, nal_ref_idc, which is a flag indicating whether the picture is a reference picture, and nal_unit_type, which is an identifier indicating the type of the NAL unit. The RBSP consists of slice data, which is the compressed data, and a slice header, which is header information about the slice data. The slice header makes it possible to identify which picture is the reference picture.
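As a rough illustration of the NAL header layout, the sketch below splits the one-byte header into its three fields. The bit widths (1, 2, and 5 bits) follow the H.264 NAL unit syntax; the `NalHeader` type and `parse_nal_header` function are hypothetical names introduced here, and locating NAL units in a byte stream and parsing the slice header are out of scope.

```python
from typing import NamedTuple


class NalHeader(NamedTuple):
    forbidden_zero_bit: int  # the "fixed bit" of the header
    nal_ref_idc: int         # non-zero when the unit may serve as a reference
    nal_unit_type: int       # identifier for the kind of NAL unit


def parse_nal_header(first_byte: int) -> NalHeader:
    """Split the first byte of a NAL unit into its header fields."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("the NAL header is a single byte")
    return NalHeader(
        forbidden_zero_bit=(first_byte >> 7) & 0x1,
        nal_ref_idc=(first_byte >> 5) & 0x3,
        nal_unit_type=first_byte & 0x1F,
    )


header = parse_nal_header(0x65)
print(header)
```

In H.264, a `nal_unit_type` of 5 marks a coded slice of an IDR picture, which is one way a stream can signal an intra-only picture at the NAL level. How a given implementation maps NAL unit types to I pictures is an assumption that should be checked against the actual stream.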

[Decoding processing operation]
Next, the operation of decoding encoded data by the decoding means 12 is described with reference to FIG. 4.

First, for the input encoded data, the determination means 152 of the selection unit 150 evaluates each picture against predetermined conditions, and based on the result the output means 151 selectively outputs only the pictures to be decoded. Details of the picture selection operation of the selection unit 150 are described later.

There are two types of decoding for the selected pictures: intra prediction decoding and motion compensation prediction decoding. The switch 139 switches between them. Intra prediction decoding is described first.

(Intra prediction decoding)
Intra prediction decoding restores the original frame from an I picture alone. The picture selected by the selection unit 150 is variable-length decoded by the decoding unit 131, inverse-quantized and inverse-frequency-transformed in turn by the inverse quantization unit 132 and the IDCT unit 133, processed by the deblocking filter 135 to reduce the block distortion introduced by decoding, and output as a frame. This frame is also stored in the frame memory 136. In this way, an I picture is decoded without reference to other pictures.

(Motion compensation prediction decoding)
Next, motion compensation prediction decoding is described. Motion compensation prediction decoding restores the original frame from a reference picture and a motion vector. When a picture is input from the selection unit 150, it is variable-length decoded by the decoding unit 131, inverse-quantized and inverse-frequency-transformed in turn by the inverse quantization unit 132 and the IDCT unit 133, and output as residual decoded data. At this time, the motion compensation prediction unit 137 generates predicted image data based on the frame stored in the frame memory 136. The residual decoded data and the predicted image data generated in this way are added by the addition calculation unit 134, processed by the deblocking filter 135 to reduce the block distortion introduced by decoding, and output as a frame. This frame is also stored in the frame memory 136.
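The prediction-plus-residual step of motion compensation prediction decoding can be illustrated with a small sketch. It assumes whole-pixel motion vectors, a reference frame held as a list of rows, and a displaced block that stays inside the frame. The function names are hypothetical, and entropy decoding, the inverse transform, and deblocking are left out.

```python
def predict_block(reference, top, left, mv, size):
    """Copy a size x size block from the reference frame, displaced by mv = (dy, dx).

    Stands in for the motion compensation prediction unit reading the frame memory.
    """
    dy, dx = mv
    return [
        [reference[top + dy + r][left + dx + c] for c in range(size)]
        for r in range(size)
    ]


def reconstruct_block(prediction, residual):
    """Add the decoded residual to the prediction and clip to the 8-bit range."""
    return [
        [max(0, min(255, p + e)) for p, e in zip(p_row, e_row)]
        for p_row, e_row in zip(prediction, residual)
    ]


# A 4x4 reference frame whose sample at (row, col) is row * 10 + col.
reference = [[row * 10 + col for col in range(4)] for row in range(4)]

prediction = predict_block(reference, top=1, left=1, mv=(-1, 1), size=2)
frame_block = reconstruct_block(prediction, residual=[[1, -1], [0, 2]])
print(prediction)
print(frame_block)
```

The sketch also shows why a missing reference matters: without the referenced frame in memory, `predict_block` has nothing valid to copy from, and the residual alone cannot restore the picture.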

[Picture selection operation]
Next, the operation by which the selection unit 150 selects the pictures to be decoded is described with reference to FIGS. 7 to 10.

First, when encoded data is input (step S1), the selection unit 150 uses the determination means 152 to determine, picture by picture, whether each picture is to be decoded, according to the following procedure.

The determination means 152 uses the I picture determination means 152a to determine whether the input picture (hereinafter referred to as the "input picture") is an I picture (step S2). This determination is made by referring to the nal_unit_type of the NAL header of the NAL unit described with reference to FIG. 6.

When the input picture is an I picture (step S2: YES), the I picture determination means 152a selects the input picture as a picture to be decoded and causes the output means 151 to output it (step S6). As a result, all I pictures are decoded in the present embodiment.

As an example, the selection operation performed by the selection unit 150 on encoded data consisting of the consecutive pictures labeled a to l in FIG. 8 is described. In FIG. 8, the letter inside each rectangle indicates the type of the picture, and each arrow indicates the reference picture to which a picture refers. For example, picture b is a P picture that refers to I picture a.

In the encoded data of FIG. 8, picture a and picture g are I pictures. The I picture determination means 152a therefore selects pictures a and g as pictures to be decoded and causes the output means 151 to output them. As a result, frames based on pictures a and g are generated, as shown in FIG. 9. In FIG. 9, the upper rectangles represent the pictures of the encoded data corresponding to FIG. 8, and the lower rectangles, each connected to a picture by an arrow, represent the frames output from the decoding means 12. A frame showing a drawing of a person corresponds to a picture selected for decoding, and a frame marked with "×" corresponds to a picture selected for discarding.

On the other hand, when the input picture is not an I picture (step S2: NO), the determination means 152 uses the predetermined picture determination means 152b to determine whether the picture at a predetermined position before the input picture has been decoded (step S3). The predetermined picture determination means 152b refers to the frame memory 136 and determines whether the picture that was input before the input picture (in the present embodiment, the immediately preceding picture) has been decoded.

When the immediately preceding picture has been decoded (step S3: YES), the predetermined picture determination means 152b does not select the input picture as a picture to be decoded. That is, it selects the input picture as a picture to be discarded and does not cause the output means 151 to output it (step S5). If the immediately preceding picture has been decoded, the human eye is expected to perceive almost no difference even when the picture that follows it is discarded. In the present embodiment, therefore, discarding the input picture when the immediately preceding picture has been decoded reduces the number of pictures to be decoded while limiting the loss of image quality, so the computational load on the terminal device 2 can be reduced.

For example, for pictures b and h in FIG. 8, the immediately preceding picture (picture a and picture g, respectively) has been decoded. The predetermined picture determination means 152b therefore selects pictures b and h as pictures to be discarded and does not cause the output means 151 to output them. As a result, no frames based on pictures b and h are generated, as shown in FIG. 9.

When the immediately preceding picture has not been decoded (step S3: NO), the determination means 152 uses the reference picture determination means 152c to determine whether the picture referred to by the input picture (hereinafter, the "reference picture") has been decoded (step S4). As described above, the slice header in the RBSP of the NAL unit shown in FIG. 6 contains information indicating which picture is the reference picture of the input picture. The reference picture determination means 152c therefore identifies the reference picture of the input picture from the slice header of the RBSP and searches the frame memory 136 for that reference picture, thereby determining whether the reference picture of the input picture has been decoded.

When the reference picture has been decoded (step S4: YES), the reference picture determination means 152c selects the input picture as a picture to be decoded and causes the output means 151 to output it (step S6). When the reference picture has not been decoded (step S4: NO), the reference picture determination means 152c does not select the input picture as a picture to be decoded; that is, it selects the input picture as a picture to be discarded and does not cause the output means 151 to output it (step S5). If the referenced picture has been discarded, the input picture cannot be decoded with sufficient quality, and the computational load increases as well. In this embodiment, therefore, when the referenced picture has been discarded, the input picture is also discarded, which reduces the computational load on the terminal device 2.

For example, for picture d shown in FIG. 8, the immediately preceding picture has not been decoded, but picture a, which picture d refers to, has been decoded. The reference picture determination means 152c therefore selects picture d as a picture to be decoded and causes the output means 151 to output it. For picture c, on the other hand, the immediately preceding picture has not been decoded, and picture b, which picture c refers to, has not been decoded either. The reference picture determination means 152c therefore selects picture c as a picture to be discarded and does not cause the output means 151 to output it. As a result, a frame based on picture d is generated but no frame based on picture c is generated, as shown in FIG. 9.
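The decision made for a single input picture in steps S2 to S5 can be sketched as a small function. This is only an illustrative sketch, not the implementation of the embodiment: the names `Decision` and `select_picture` and the three boolean inputs are hypothetical, and the sketch assumes that the I-picture check, the frame memory lookup for the immediately preceding picture, and the lookup for the reference picture have already been carried out.

```python
from enum import Enum


class Decision(Enum):
    DECODE = "decode"    # picture is output from the output means
    DISCARD = "discard"  # picture is not output


def select_picture(is_i_picture, previous_decoded, reference_decoded):
    """Hypothetical helper mirroring the order of the checks in the text."""
    if is_i_picture:          # step S2: YES -> always decode I pictures
        return Decision.DECODE
    if previous_decoded:      # step S3: YES -> discard (step S5)
        return Decision.DISCARD
    if reference_decoded:     # step S4: YES -> decode
        return Decision.DECODE
    return Decision.DISCARD   # step S4: NO -> discard (step S5)


# Picture d of FIG. 8: not an I picture, previous picture c was discarded,
# reference picture a was decoded.
print(select_picture(False, False, True))   # Decision.DECODE
# Picture c of FIG. 8: previous picture b and reference picture b were discarded.
print(select_picture(False, False, False))  # Decision.DISCARD
```

The order of the checks matters: the reference picture is only consulted when the immediately preceding picture was not decoded, so a picture whose predecessor was decoded is discarded even if its reference is available.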

A picture output from the selection unit 150 by this selection operation is input to the decoding unit 131, decoded by the components of the decoding means 12, and finally output as a frame. The results of the selection performed by the selection unit 150 on the encoded data shown in FIG. 8 are summarized in FIG. 9 and in the following list.

Picture a is determined to be an I picture in step S2, so it is output from the output means 151 and decoded.
Picture b is not output from the output means 151, that is, it is discarded, because step S3 determines that the immediately preceding picture has been decoded.
Picture c is not output from the output means 151, that is, it is discarded, because step S4 determines that its reference picture has not been decoded.
Picture d is output from the output means 151 and decoded, because step S4 determines that its reference picture has been decoded.
Picture e is not output from the output means 151, that is, it is discarded, because step S3 determines that the immediately preceding picture has been decoded.
Picture f is output from the output means 151 and decoded, because step S4 determines that its reference picture has been decoded.
Picture g is determined to be an I picture in step S2, so it is decoded.
Picture h is not output from the output means 151, that is, it is discarded, because step S3 determines that the immediately preceding picture has been decoded.
Picture i is output from the output means 151 and decoded, because step S4 determines that its reference picture has been decoded.
Picture j is not output from the output means 151, that is, it is discarded, because step S3 determines that the immediately preceding picture has been decoded.
Picture k is output from the output means 151 and decoded, because step S4 determines that its reference picture has been decoded.
Picture l is not output from the output means 151, that is, it is discarded, because step S3 determines that the immediately preceding picture has been decoded.

In this way, when the selection unit 150 performs the selection operation on the encoded data consisting of pictures a to l, only pictures a, d, f, g, i, and k are decoded. Because fewer pictures are decoded, the computational load on the terminal device 2 is reduced. In addition, because the P pictures d, f, i, and k are decoded as well as the I pictures, image quality is better than when only I pictures are decoded. Furthermore, because picture c, whose reference picture is not available, is not decoded, a loss of image quality is prevented and the computational load is reduced.
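The selection result for the whole sequence can be reproduced with a short simulation. This is a sketch under stated assumptions: the text only states that pictures a and g are I pictures, that picture c refers to picture b, and that picture d refers to picture a. All other reference assignments in `FIG8_PICTURES` are hypothetical values chosen for illustration, and the function name `select_sequence` is likewise not from the source.

```python
# (label, picture type, reference picture). References other than c -> b and
# d -> a are ASSUMED for illustration; they are not given in the text.
FIG8_PICTURES = [
    ("a", "I", None), ("b", "P", "a"), ("c", "P", "b"), ("d", "P", "a"),
    ("e", "P", "d"), ("f", "P", "d"), ("g", "I", None), ("h", "P", "g"),
    ("i", "P", "g"), ("j", "P", "i"), ("k", "P", "i"), ("l", "P", "k"),
]


def select_sequence(pictures):
    """Return the labels of the pictures selected for decoding, in input order."""
    decoded = set()   # labels of pictures that have been decoded so far
    selected = []
    previous = None   # label of the immediately preceding input picture
    for label, picture_type, reference in pictures:
        if picture_type == "I":
            keep = True                  # I pictures are always decoded
        elif previous in decoded:
            keep = False                 # previous picture decoded -> discard
        else:
            keep = reference in decoded  # decode only if the reference exists
        if keep:
            decoded.add(label)
            selected.append(label)
        previous = label
    return selected


print(select_sequence(FIG8_PICTURES))  # ['a', 'd', 'f', 'g', 'i', 'k']
```

With these assumed references, six of the twelve pictures are decoded, which matches the result described above.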

When the selection unit 150 outputs the selection results shown in FIG. 9 and the decoding means 12 decodes the selected pictures, frames are stored in the frame memory 136 of the decoding means 12 as shown in FIGS. 10(a) to 10(f). In FIG. 10, the picture corresponding to frame g is assumed to be an IDR (Instantaneous Decoding Refresh) picture.

Specifically, as shown in FIGS. 10(a) to 10(c), when the selection unit 150 selects pictures a, d, and f as pictures to be decoded, the other components of the decoding means 12 decode each picture in turn, and the corresponding frames are stored in the frame memory 136.

However, when the selection unit 150 selects picture g as a picture to be decoded, frames a, d, and f are cleared from the frame memory 136 and only frame g is stored, because picture g is an IDR picture.

Next, when the selection unit 150 selects pictures i and k as pictures to be decoded, the other components of the decoding means 12 decode them in turn, and the corresponding frames are stored in the frame memory 136 together with frame g.

Because only the pictures selected by the selection unit 150 are decoded in this way, the number of pictures stored in the frame memory 136 can be reduced. This reduces the work done by the decoding means 12, in particular by the deblocking filter 135 and the motion compensation prediction unit 137, and as a result reduces the computational load on the terminal device 2. In addition, the operations of the predetermined picture determination means 152b and the reference picture determination means 152c of the determination means 152 can be implemented by referring to this frame memory 136.
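The frame memory states of FIGS. 10(a) to 10(f) can be traced with a few lines of code. This is a simplified sketch: the function name `frame_memory_states` is hypothetical, the memory is modeled as a plain list with no capacity limit, and the only behavior taken from the text is that decoding an IDR picture clears the previously stored frames.

```python
def frame_memory_states(selected, idr_labels):
    """Return the frame memory contents after each selected picture is decoded."""
    memory = []
    states = []
    for label in selected:
        if label in idr_labels:
            memory.clear()           # an IDR picture clears earlier frames
        memory.append(label)         # store the newly decoded frame
        states.append(list(memory))  # snapshot of the memory at this point
    return states


# Pictures selected in FIG. 9, with picture g assumed to be an IDR picture.
for state in frame_memory_states(["a", "d", "f", "g", "i", "k"], {"g"}):
    print(state)
# ['a']
# ['a', 'd']
# ['a', 'd', 'f']
# ['g']
# ['g', 'i']
# ['g', 'i', 'k']
```

Because the same memory holds exactly the frames that were decoded, membership checks against it are enough to answer both the "previous picture decoded" and the "reference picture decoded" questions.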

As described above, according to this embodiment, it is determined whether an input picture is an I picture. If the picture is not an I picture, it is determined whether a predetermined other picture input before that picture has been decoded, and if the other picture has been decoded, the picture is not output. This reduces the computational load associated with decoding. In addition, because pictures other than I pictures can also be decoded, image quality is improved as a result.

When encoded data is exchanged with a plurality of terminal devices 2 at the same time, as in a TV conference system, a decoding means 12 is provided for each terminal device 2 that is a communication partner. Because the decoding means 12 is realized by the CPU reading and executing the program 8b, the number of decoding means 12 grows as the number of partner terminal devices 2 grows, and the computational load on the CPU grows with it. In this embodiment, therefore, the selection unit 150 causes only predetermined pictures of the received encoded data to be decoded selectively. This reduces the processing performed by each decoding means 12 and, as a result, reduces the computational load on the CPU of the terminal device 2.

In this embodiment, the encoded data has been described as consisting of I pictures and P pictures, as shown for example in FIGS. 8 and 9, but the encoded data may of course consist of I pictures, P pictures, and B pictures. In that case, for example, whether to decode a B picture is likewise decided based on whether a predetermined picture input earlier and the reference picture have been decoded, which reduces the computational load on the terminal device 2 and improves the image quality of the moving image.

Also, in this embodiment, the predetermined picture determination means 152b determines whether the picture immediately preceding the input picture has been decoded, but the picture to be checked is not limited to this. It can be set freely as appropriate, for example to the picture a predetermined number of pictures earlier, or to the pictures from a predetermined number earlier up to the immediately preceding picture.
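One way to generalize the check on the "predetermined picture" is sketched below. The text does not say how a range of earlier pictures should be combined, so this sketch assumes that the check succeeds if any picture in the range was decoded. The function name `predetermined_decoded` and its parameters `offset` and `window` are hypothetical.

```python
def predetermined_decoded(history, offset=1, window=False):
    """Check whether the predetermined earlier picture(s) were decoded.

    history: list of booleans, oldest first, True if that earlier picture
             was decoded.
    offset:  how many pictures back to look (1 = immediately preceding).
    window:  if True, look at every picture from `offset` back up to the
             immediately preceding one (ASSUMPTION: any decoded picture counts).
    """
    if offset < 1:
        raise ValueError("offset must be at least 1")
    if window:
        return any(history[-offset:])
    return len(history) >= offset and history[-offset]


history = [True, False]  # two pictures back: decoded; previous: discarded
print(predetermined_decoded(history))                         # False
print(predetermined_decoded(history, offset=2))               # True
print(predetermined_decoded(history, offset=2, window=True))  # True
```

A larger offset or a window makes the check succeed more often, so more pictures are discarded and the load drops further, at the cost of a lower effective frame rate.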

Also, for the encoded data from the terminal device 2 whose moving image is reproduced with a large display area, indicated by symbol x in FIG. 5, all pictures may be decoded. If the pictures to be decoded are selected under the same conditions for encoded data whose decoded frames are displayed in a large area and for encoded data whose decoded frames are displayed in a small area, the image with the larger display area may be perceived as lower in quality than the image with the smaller display area. For this reason, all pictures are decoded for encoded data with a large display area, which improves the quality experienced by the user. For encoded data with a small display area, the pictures to be decoded are selected by the method described above, which reduces the computational load on the terminal device 2.
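A per-stream policy of this kind could be sketched as follows. The text does not define what counts as a "large" display area, so this sketch assumes that the stream with the largest area on screen is decoded in full and every other stream uses selective decoding. The function name `decode_policy` and the policy labels are hypothetical.

```python
def decode_policy(display_areas):
    """Map each stream to 'all' (decode every picture) or 'selective'.

    display_areas: dict of stream id -> display area in pixels.
    ASSUMPTION: the stream(s) with the largest area are treated as 'large'.
    """
    if not display_areas:
        return {}
    largest = max(display_areas.values())
    return {
        stream: ("all" if area == largest else "selective")
        for stream, area in display_areas.items()
    }


areas = {"x": 640 * 480, "y": 160 * 120, "z": 160 * 120}
print(decode_policy(areas))
# {'x': 'all', 'y': 'selective', 'z': 'selective'}
```

If all streams are shown at the same size, this rule decodes every stream in full, so a fixed area threshold may be a better fit when the load has to stay bounded.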

Also, the moving image based on the frames output from the decoding means 12 may be reproduced at the same frame rate as a moving image based on image data obtained by decoding all pictures of the encoded data. In that case, in the frame slot corresponding to a discarded picture, the most recent normally reproduced frame is simply kept on the display. This allows the moving image to be reproduced smoothly.
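Holding the last reproduced frame in the slots of discarded pictures can be sketched as below. The function name `hold_last_frame` is hypothetical, and the sketch works on picture labels only, standing in for actual frame buffers.

```python
def hold_last_frame(labels, decoded):
    """Return the frame shown in each display slot at the original frame rate.

    labels:  picture labels in input order (one display slot per picture).
    decoded: set of labels that were actually decoded.
    A slot whose picture was discarded repeats the last decoded frame; a slot
    before any decoded frame shows None.
    """
    shown = []
    last = None
    for label in labels:
        if label in decoded:
            last = label      # a newly decoded frame replaces the held frame
        shown.append(last)    # discarded slots keep showing the held frame
    return shown


slots = hold_last_frame(list("abcdefghijkl"), set("adfgik"))
print("".join(slots))  # aaaddfggiikk
```

The output sequence has one entry per input picture, so the playback timing is unchanged even though only half of the pictures were decoded.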

The present invention can be applied to various apparatuses that decode encoded data.

FIG. 1 is a diagram schematically showing the configuration of a TV conference system according to the present invention.
FIG. 2 is a block diagram showing the configuration of a terminal device.
FIG. 3 is a block diagram showing the configuration of the encoding means.
FIG. 4 is a block diagram showing the configuration of the decoding means.
FIG. 5 is a diagram showing an example of the display on the monitor of a terminal device in the TV conference system.
FIG. 6 is a diagram schematically showing the structure of a NAL unit.
FIG. 7 is a flowchart showing the operation of selecting pictures to be decoded.
FIG. 8 is a diagram schematically showing an example of encoded data.
FIG. 9 is a diagram schematically showing the result of the operation of selecting pictures to be decoded for the encoded data of FIG. 8.
FIGS. 10(a) to 10(f) are diagrams schematically showing the state of the frame memory storing the pictures decoded according to the selection result of FIG. 9.

Explanation of symbols

1…server, 2, 2a, 2b, 2n…terminal device, 3…communication line, 4…external I/F unit, 5…operation input unit, 6…image processing unit, 6a…camera, 6b…monitor, 7…audio processing unit, 7a…microphone, 7b…speaker, 8…storage unit, 8a…control data, 8b…program, 9…control unit, 11…encoding means, 12…decoding means, 13…TV conference means, 111…difference calculation unit, 112…DCT unit, 113…quantization unit, 114…inverse quantization unit, 115…IDCT unit, 116…addition unit, 117…deblocking filter, 118…motion compensation prediction unit, 119…frame memory, 120…motion detection unit, 121…intra prediction unit, 122…switch, 123…encoded data output unit, 131…decoding unit, 132…inverse quantization unit, 133…IDCT unit, 134…addition unit, 135…deblocking filter, 136…frame memory, 137…motion compensation prediction unit, 138…intra prediction unit, 139…switch, 150…selection unit, 151…output means, 152…determination means, 152a…I picture determination means, 152b…predetermined picture determination means, 152c…reference picture determination means.

Claims

An information processing apparatus comprising:
selection means that receives encoded data comprising a plurality of consecutive encoded pictures, determines whether an input picture is an I picture, outputs the picture if it is an I picture, and, if the picture is not an I picture, determines whether a predetermined other picture input before the picture has been decoded and does not output the picture if the other picture has been decoded; and
decoding means for decoding the picture output from the selection means,
wherein the encoded data has a structure in which a picture that is not an I picture refers to a picture that is not the immediately preceding picture and is further away in time, or to a decoded I picture that is not the immediately preceding picture.

An information processing apparatus comprising:
selection means that receives encoded data comprising a plurality of consecutive encoded pictures, determines whether an input picture is an I picture, outputs the picture if it is an I picture, and, if the picture is not an I picture, determines whether a predetermined other picture input before the picture has been decoded, does not output the picture if the other picture has been decoded, and, if the other picture has not been decoded, determines whether a reference picture referred to by the picture has been decoded, outputs the picture if the reference picture has been decoded, and does not output the picture if the reference picture has not been decoded; and
decoding means for decoding the picture output from the selection means,
wherein the encoded data has a structure in which a picture that is not an I picture refers to a picture that is not the immediately preceding picture and is further away in time, or to a decoded I picture that is not the immediately preceding picture.

An information processing method comprising:
a selection step of receiving encoded data comprising a plurality of consecutive encoded pictures, determining whether an input picture is an I picture, outputting the picture if it is an I picture, and, if the picture is not an I picture, determining whether a predetermined other picture input before the picture has been decoded and not outputting the picture if the other picture has been decoded; and
a decoding step of decoding the picture output by the selection step,
wherein the encoded data has a structure in which a picture that is not an I picture refers to a picture that is not the immediately preceding picture and is further away in time, or to a decoded I picture that is not the immediately preceding picture.

An information processing method comprising:
a selection step of receiving encoded data comprising a plurality of consecutive encoded pictures, determining whether an input picture is an I picture, outputting the picture if it is an I picture, and, if the picture is not an I picture, determining whether a predetermined other picture input before the picture has been decoded, not outputting the picture if the other picture has been decoded, and, if the other picture has not been decoded, determining whether a reference picture referred to by the picture has been decoded, outputting the picture if the reference picture has been decoded, and not outputting the picture if the reference picture has not been decoded; and
a decoding step of decoding the picture output by the selection step,
wherein the encoded data has a structure in which a picture that is not an I picture refers to a picture that is not the immediately preceding picture and is further away in time, or to a decoded I picture that is not the immediately preceding picture.