JP2007235314A

JP2007235314A - Coding method

Info

Publication number: JP2007235314A
Application number: JP2006051786A
Authority: JP
Inventors: Masaru Matsuda; 優松田; Shigeyuki Okada; 茂之岡田
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 2006-02-28
Filing date: 2006-02-28
Publication date: 2007-09-13

Abstract

PROBLEM TO BE SOLVED: To solve the problem that coding efficiency is reduced when a region of interest (ROI) is set on an image and the image is hierarchically coded.

SOLUTION: A resolution conversion section 12 reduces the image data of a received frame to match the spatial resolution of each layer and gives the image data of each layer to a fundamental layer processing block 120 and an extended layer processing block 110. A ROI setting section 14 sets ROIs on a moving picture frame layer by layer. The fundamental layer processing block 120 divides the image of the fundamental layer, which has been converted to a low resolution by the resolution conversion section 12, according to the ROI information of the fundamental layer, applies compression coding to the image ROI by ROI, and outputs the result to a multiplexer section 18. The extended layer processing block 110 divides the high-resolution image of the extended layer according to the ROI information of the extended layer, applies compression coding to the image ROI by ROI, and outputs the result to the multiplexer section 18.

COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates to an encoding method for encoding an image, and more particularly to an encoding method for hierarchically encoding an image.

Broadband networks are developing rapidly, and expectations are high for services that use high-quality moving images. Large-capacity recording media such as DVDs are also in use, and the number of users who enjoy high-quality images is growing. Compression coding is an indispensable technique for transmitting moving images over communication lines and storing them on recording media. International standards for moving image compression coding include the MPEG-4 standard and the H.264/AVC standard. There are also next-generation image compression techniques such as SVC (Scalable Video Coding), which is being standardized as an extension of H.264/AVC. SVC allows a single stream to provide compression and decompression of images at different image qualities (for example, high and low image quality), different resolutions (for example, high and low resolution), and different frame rates (for example, high and low frame rates), depending on the code amount.

SVC, a next-generation image compression technology, encodes moving images with various kinds of scalability, such as spatial scalability, temporal scalability, and SNR scalability, so that the moving images can be reproduced at a plurality of different resolutions, frame rates, and image qualities. These kinds of scalability can be combined arbitrarily in encoding, which makes the scalability function of SVC highly flexible.

One of the requirements of SVC is interactive ROI (Interactive Region of Interest; IROI) coding. ROI coding is a technique for coding a region of interest (ROI) of an image with an image quality different from that of other regions. SVC interactive ROI coding, by contrast, allows the user to specify the position and size of the region of interest on the screen from moment to moment while viewing the moving image during playback, and allows the region of interest to be reproduced with a different quality. Because SVC encodes a moving image with various kinds of scalability, the region of interest designated by the user at playback time can be decoded with a quality different from that of other regions.

Patent Document 1 discloses a hierarchical image encoding technique that, by encoding images hierarchically, can maintain the quality of a reproduced image even in a communication environment in which packet loss and bandwidth fluctuation occur.
Patent Document 1: JP 2005-79953 A

In SVC, an image is divided into a base layer and enhancement layers and encoded hierarchically. When a region of interest is set on an image, it is set in common for all layers, and each layer is divided into regions according to the region of interest and encoded independently region by region. As a result, coding efficiency decreases and the processing load increases.

The present invention has been made in view of these circumstances, and its object is to provide an encoding technique capable of setting regions on an image and encoding the image hierarchically.

To solve the above problem, an encoding method according to one aspect of the present invention, when dividing an image into a plurality of layers and encoding it hierarchically, sets regions independently for each layer and performs independent encoding for each of the regions in each layer.

Here, an "image" (picture) may be a single independent still image or one of the images arranged in time series that make up a moving image. An "image" (picture) is a unit of encoding, and the concept includes a frame, a field, a VOP (Video Object Plane), and the like.

According to this aspect, when an image is hierarchically encoded, encoded data can be generated with regions set independently for each layer. Independent encoding can therefore be performed for each region set on the image without reducing coding efficiency or processing efficiency.

When every pair of regions among the plurality of regions set on the image overlaps (in which case the plurality of regions are said to "overlap one another"), the regions may each be assigned to a different layer. This avoids overlapping regions within the same layer, removes the need to handle and encode the overlapping portion separately, and prevents a decrease in coding efficiency.

For example, suppose the first region and the second region overlap, and the second region and the third region also overlap, but the first region and the third region do not. The first and second regions are then set in separate layers, and the second and third regions are also set in separate layers. Because the first and third regions do not overlap, they may be set in the same layer. In this case a minimum of two layers is sufficient. For example, setting the second region in the first layer and the first and third regions in the second layer leaves no overlap among the regions set in each layer.

As another example, suppose any two of the first to third regions overlap, that is, the first and second regions overlap, the second and third regions overlap, and the first and third regions also overlap. The first, second, and third regions are then each set in a different layer. For example, the first region is set in the first layer, the second region in the second layer, and the third region in the third layer. As a result, no layer contains overlapping regions.
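
The two examples above can be read as assigning regions to layers so that no layer holds two overlapping regions. The following is a minimal sketch of one way to do that, assuming rectangular regions and a simple greedy assignment; the names `Rect`, `overlaps`, and `assign_layers` are hypothetical and do not come from the patent, and the greedy order is not guaranteed to use the fewest layers for every arrangement of regions.

```python
from typing import List, NamedTuple


class Rect(NamedTuple):
    """Axis-aligned region: top-left corner (x, y), width w, height h."""
    x: int
    y: int
    w: int
    h: int


def overlaps(a: Rect, b: Rect) -> bool:
    """Return True if the two rectangles share a non-empty area."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


def assign_layers(regions: List[Rect]) -> List[int]:
    """Give each region the lowest layer index in which it overlaps nothing.

    Greedy assignment (an assumption of this sketch): regions are visited
    in input order, and a new layer is opened only when every existing
    layer already holds an overlapping region.
    """
    layers: List[List[Rect]] = []   # layers[i] = regions placed in layer i
    result: List[int] = []
    for region in regions:
        for index, members in enumerate(layers):
            if not any(overlaps(region, m) for m in members):
                members.append(region)
                result.append(index)
                break
        else:
            layers.append([region])
            result.append(len(layers) - 1)
    return result


# First example: regions 1-2 and 2-3 overlap, 1-3 do not -> two layers.
chain = [Rect(0, 0, 10, 10), Rect(8, 0, 10, 10), Rect(16, 0, 10, 10)]
# Second example: every pair overlaps -> three layers.
mutual = [Rect(0, 0, 10, 10), Rect(5, 5, 10, 10), Rect(8, 8, 10, 10)]
```

With these inputs the first and third regions of `chain` share a layer and the second region sits alone in another, which is the same grouping as in the text with the layer numbering swapped.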

The plurality of layers may be the base layer and the enhancement layers (the layers other than the base layer) of scalable hierarchical coding. In the base layer, the entire image may be encoded without setting any region, while in an enhancement layer regions may be set and independent encoding performed for each region.

Scalable hierarchical coding means encoding an image hierarchically with scalability. It includes, for example, encoding a moving image at different reproduction qualities, such as different spatial resolutions, frame rates, and image quality levels, to generate encoded data at a plurality of reproduction quality levels. A moving image encoded in this way is scalable in the sense that any reproduction quality level can be selected and decoded. A moving image encoded at different spatial resolutions has spatial scalability, a moving image encoded at different frame rates has temporal scalability, and a moving image encoded at different image quality levels has SNR scalability.

When encoded data at a plurality of reproduction quality levels is multiplexed in a hierarchical structure, decoding only the lower-layer encoded data reproduces the moving image at a low reproduction quality level, and decoding the upper-layer encoded data as well reproduces the moving image at a high reproduction quality level.

Any combination of the above constituent elements, and any conversion of the expression of the present invention between a method, an apparatus, a system, a recording medium, a computer program, and the like, is also effective as an aspect of the present invention.

According to the present invention, regions set on an image can be hierarchically encoded efficiently.

FIG. 1 is a configuration diagram of an encoding apparatus 100 according to an embodiment. In hardware terms, this configuration can be realized by the CPU, memory, and other LSIs of any computer; in software terms, it is realized by, for example, a program with an image encoding function loaded into memory. The figure depicts functional blocks realized by the cooperation of these elements. Those skilled in the art will therefore understand that these functional blocks can be realized in various forms by hardware only, software only, or a combination of the two.

The encoding apparatus 100 of the present embodiment performs "scalable coding" in compliance with SVC (Scalable Video Coding), a next-generation image compression technology. That is, it encodes a moving image with at least one of spatial scalability, temporal scalability, and SNR (signal to noise ratio) scalability.

Moving images are encoded using technology that complies with one of the following: the MPEG (Moving Picture Experts Group) series of standards (MPEG-1, MPEG-2, and MPEG-4) standardized by the international standardization bodies ISO (International Organization for Standardization) and IEC (International Electrotechnical Commission); the H.26x series of standards (H.261, H.262, and H.263) standardized by ITU-T (International Telecommunication Union-Telecommunication Standardization Sector), the international standards body for telecommunications; or H.264/AVC, the latest moving image compression coding standard, standardized jointly by both bodies (its official names in the two bodies are MPEG-4 Part 10: Advanced Video Coding and H.264, respectively).

The embodiment is described using a frame as the unit of moving image encoding, but the unit of encoding may instead be a field. The unit of encoding may also be a VOP in MPEG-4.

The encoding apparatus 100 receives a moving image frame by frame, scalably encodes it, and outputs an encoded stream of the moving image. Each input moving image frame is stored in a frame memory and is read and written by the processing units involved in encoding.

To encode a moving image with spatial scalability, the encoding apparatus 100 has an enhancement layer processing block 110 and a base layer processing block 120. The base layer processing block 120 compresses and encodes the moving image at low resolution, and the enhancement layer processing block 110 compresses and encodes it at high resolution. This generates encoded moving image data whose spatial resolution differs from layer to layer.

To encode a moving image with temporal scalability, the encoding apparatus 100 uses MCTF (Motion Compensated Temporal Filtering). MCTF combines subband decomposition along the time axis with motion compensation and performs hierarchical motion compensation. This generates encoded moving image data whose frame rate differs from layer to layer.

To encode a moving image with SNR scalability, the encoding apparatus 100 compresses and encodes the moving image while varying the quantization step or the number of lower bits discarded by quantization. This generates encoded moving image data whose image quality differs from layer to layer.

Spatial scalability, temporal scalability, and SNR scalability may be combined arbitrarily.

The ROI setting unit 14 sets ROI regions on a moving image frame layer by layer. In addition to normal ROI regions, which have no interactivity, the ROI setting unit 14 can set interactive ROI regions. Within an interactive ROI region, an ROI region can be set arbitrarily when the moving image is reproduced. Hereinafter, interactive ROI regions and normal ROI regions are referred to collectively as ROI regions.

A region of interest such as an interactive ROI region or a normal ROI region may be selected by the user designating a specific region on the image, or a predetermined region such as the central region of the image may be selected. An important region, such as one showing a person or text, may also be extracted automatically as the region of interest. In a moving image, the region of interest may also be selected automatically frame by frame by tracking the movement of a specific object or the like.

A region of interest is not necessarily intended only for high-quality reproduction. For example, to protect privacy, a region of interest showing a person's face needs to be reproduced at low image quality. Interactive ROI coding and normal ROI coding are used for such purposes as well. Using scalably encoded image data, a region within an interactive ROI region that requires privacy protection can be reproduced at low resolution, low frame rate, or low image quality. Alternatively, a region requiring privacy protection can be designated as a normal ROI region and encoded in advance with a lower resolution, frame rate, or image quality than other regions.

In the present embodiment, the ROI setting unit 14 can designate ROI regions independently for each layer. For example, an ROI region may be designated in the base layer without designating the corresponding region as an ROI region in the enhancement layer. Conversely, a region that is not designated as an ROI region in the base layer may be designated as an ROI region in the enhancement layer. When there are a plurality of enhancement layers, ROI regions can also be designated independently in each of them. ROI regions need not always be set independently for each layer: a common region may also be set as an ROI region across all layers, base and enhancement.

The ROI setting unit 14 supplies information for designating ROI regions layer by layer (hereinafter, "ROI region information") to the image dividing unit 10a and the variable length coding unit 30a of the enhancement layer processing block 110, and to the image dividing unit 10b and the variable length coding unit 30b of the base layer processing block 120.

The resolution conversion unit 12 reduces the image data of the input frame to match the spatial resolution of each layer and supplies the image data of each layer to the base layer processing block 120 and the enhancement layer processing block 110. It supplies a low-resolution image to the base layer processing block 120 and a high-resolution image to the enhancement layer processing block 110.
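
As an illustration of the resolution conversion described above, the following sketch builds one image per layer by repeated 2:1 reduction. The patent does not specify the reduction filter or the ratio between layers, so the 2x2 averaging, the factor of two, and the names `downsample_2x` and `build_layers` are assumptions made only for this example.

```python
from typing import List

Image = List[List[int]]  # rows of integer pixel values


def downsample_2x(img: Image) -> Image:
    """Halve width and height by averaging each 2x2 block (rounded).

    An odd trailing row or column is dropped in this sketch.
    """
    height, width = len(img), len(img[0])
    return [
        [(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1] + 2) // 4
         for x in range(0, width - 1, 2)]
        for y in range(0, height - 1, 2)
    ]


def build_layers(frame: Image, num_layers: int) -> List[Image]:
    """Return per-layer images; index 0 is the base (lowest resolution) layer."""
    layers = [frame]
    for _ in range(num_layers - 1):
        layers.append(downsample_2x(layers[-1]))
    layers.reverse()
    return layers


frame = [
    [0, 0, 4, 4],
    [0, 0, 4, 4],
    [8, 8, 12, 12],
    [8, 8, 12, 12],
]
base_image, enhancement_image = build_layers(frame, 2)
```

Here `base_image` would go to the base layer processing block and `enhancement_image`, the full-resolution frame, to the enhancement layer processing block.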

The base layer processing block 120 divides the base layer image, which has been converted to low resolution by the resolution conversion unit 12, according to the ROI region information of the base layer, compresses and encodes it ROI region by ROI region, and outputs the result to the multiplexing unit 18. The encoding process in the base layer processing block 120 differs depending on whether each region to be encoded is an interactive ROI region, a normal ROI region, or a non-ROI region.

In the base layer, the base layer processing block 120 encodes a normal ROI region with a spatial resolution, frame rate, or image quality level, or a combination of these, that differs from that of the non-ROI region. For example, to encode a normal ROI region with higher image quality than the non-ROI region, the block secures more significant bits for the normal ROI region at quantization time, for instance by using a different quantization table to apply a smaller quantization step, or by reducing the number of lower bits discarded by quantization.
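
The effect of a smaller quantization step on an ROI region can be sketched as follows. The step sizes, the coefficient values, and the function names are hypothetical and chosen only to make the arithmetic easy to follow; the patent does not give concrete quantization tables.

```python
from typing import List

ROI_STEP = 4        # hypothetical finer step for the normal ROI region
NON_ROI_STEP = 16   # hypothetical coarser step for the non-ROI region


def quantize(coeffs: List[int], step: int) -> List[int]:
    """Uniform quantization that truncates toward zero."""
    return [int(c / step) for c in coeffs]


def dequantize(levels: List[int], step: int) -> List[int]:
    """Reconstruct coefficient values from quantization levels."""
    return [q * step for q in levels]


def reconstruction_error(coeffs: List[int], step: int) -> int:
    """Sum of absolute differences between original and reconstructed values."""
    recon = dequantize(quantize(coeffs, step), step)
    return sum(abs(c - r) for c, r in zip(coeffs, recon))


coeffs = [100, -37, 12, 5]   # illustrative transform coefficients
roi_error = reconstruction_error(coeffs, ROI_STEP)
non_roi_error = reconstruction_error(coeffs, NON_ROI_STEP)
```

The same coefficients lose less information with the finer step, which is why the region quantized with `ROI_STEP` would be reproduced at higher image quality at the cost of more bits.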

A normal ROI region may be scalably encoded so that it has a plurality of different spatial resolutions, frame rates, or image quality levels, or a combination of these. Alternatively, it may be encoded without scalable coding so that it has only a single spatial resolution, frame rate, or image quality level, or a single combination of these.

An interactive ROI region is normally scalably encoded. This provides interactivity: within the interactive ROI region, only the area designated by the user is reproduced with a higher resolution, frame rate, or image quality level, while the remaining areas are reproduced at normal quality.

When temporal scalable coding is performed, the MCTF unit 20b operates in the base layer processing block 120, and encoding is performed with a different frame rate for each layer. When spatial scalable coding is performed, the enhancement layer processing block 110 operates in addition to the base layer processing block 120, and encoding is performed with a different spatial resolution for each layer. When SNR scalable coding is performed, encoding is performed with a different image quality for each layer by varying the quantization step or the number of lower bits discarded by quantization.

A non-ROI region is normally not scalably encoded. In that case the MCTF unit 20b, which handles temporal scalable coding in the base layer processing block 120, does not operate, and spatial scalable coding using the enhancement layer processing block 110 is not performed either.

Each component of the base layer processing block 120 is described below.

The image dividing unit 10b receives the frame image data of the base layer from the resolution conversion unit 12 and the ROI region information set for the base layer from the ROI setting unit 14. According to the base layer ROI region information supplied by the ROI setting unit 14, the image dividing unit 10b divides the area of the input frame into a plurality of small regions. Slices are used as one example of such small regions. A slice is the basic unit of encoding in H.264/AVC: one frame can be divided into a plurality of slices and encoded slice by slice. In the present embodiment, because ROI regions are designated independently for each layer, slices are also set independently for each layer.

The image dividing unit 10b supplies information on the slice division of the base layer image (referred to as "slice information") to the variable length coding unit 30b. The slice information includes information indicating the time of the slice group and the region information of each slice.

When the ROI setting unit 14 has set only an interactive ROI region as the ROI region in the base layer, the whole area of the base layer image is divided into the interactive ROI region and the remaining region (hereinafter, the "non-ROI region"). The non-ROI region forms one slice. The interior of the interactive ROI region is divided further to provide interactivity, so that a plurality of slices are set within the interactive ROI region.

When the ROI setting unit 14 has set both an interactive ROI region and a normal ROI region as ROI regions in the base layer, the whole area of the base layer image is divided into the interactive ROI region, the normal ROI region, and the non-ROI region. One slice is set for the normal ROI region, another slice for the non-ROI region, and a plurality of slices within the interactive ROI region.
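
The slice layout described above can be sketched as a per-macroblock slice map. This is an illustrative data structure only: the macroblock-unit coordinates, the slice numbering, the default 4x4 grid, and the name `build_slice_map` are assumptions, and the sketch does not model how such a layout would be signalled in an actual H.264/AVC or SVC bitstream.

```python
from typing import List, Optional, Tuple

Region = Tuple[int, int, int, int]  # (x, y, width, height) in macroblock units

NON_ROI_SLICE = 0      # hypothetical slice id for the non-ROI region
NORMAL_ROI_SLICE = 1   # hypothetical slice id for the normal ROI region
FIRST_IROI_SLICE = 2   # interactive ROI slices are numbered from here


def build_slice_map(mb_width: int, mb_height: int,
                    normal_roi: Optional[Region],
                    interactive_roi: Optional[Region],
                    grid: Tuple[int, int] = (4, 4)) -> List[List[int]]:
    """Return a slice id for every macroblock of one layer's frame."""
    slice_map = [[NON_ROI_SLICE] * mb_width for _ in range(mb_height)]
    if normal_roi is not None:
        x0, y0, w, h = normal_roi
        for y in range(y0, y0 + h):
            for x in range(x0, x0 + w):
                slice_map[y][x] = NORMAL_ROI_SLICE
    if interactive_roi is not None:
        x0, y0, w, h = interactive_roi
        cols, rows = grid
        for y in range(y0, y0 + h):
            for x in range(x0, x0 + w):
                col = (x - x0) * cols // w   # grid column inside the IROI
                row = (y - y0) * rows // h   # grid row inside the IROI
                slice_map[y][x] = FIRST_IROI_SLICE + row * cols + col
    return slice_map


slice_map = build_slice_map(16, 12, normal_roi=(0, 0, 4, 4),
                            interactive_roi=(8, 4, 8, 8))
slice_ids = {sid for row in slice_map for sid in row}
```

In this layout the frame holds 18 slices in total: one for the non-ROI region, one for the normal ROI region, and 16 inside the interactive ROI region. Because the regions are set per layer, a separate map would be built for each layer.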

The base layer processing block 120 encodes each slice independently, without depending on other slices. That is, each slice is encoded using only information contained within the slice being encoded, without using the pixel data or motion vector information of other slices.

The interactive ROI region is encoded independently slice by slice so that, at decoding time, a partial area within the interactive ROI region can be designated as an ROI region in units of slices and decoded. Suppose the interactive ROI region is divided into four parts both vertically and horizontally and therefore contains 16 slices. Because independent scalable coding is performed slice by slice within the interactive ROI region, any slice in the region can be selected when the moving image is decoded, and the selected slice can be reproduced with a different quality using its scalably encoded data.

For example, when a high-quality image is requested for a designated area within the interactive ROI region, only the lowest layer is first decoded for all slices to obtain an image of the lowest image quality. Next, only for the slices corresponding to the area designated by the user, decoding is repeated while ascending the SNR scalability hierarchy until the image quality requested by the user is reached.

Similarly, when an enlarged image is requested for a designated area within the interactive ROI region, only the lowest layer is first decoded for all slices to obtain an image of the lowest image quality. Next, only for the slices corresponding to the area designated by the user, decoding is repeated while ascending the spatial scalability hierarchy until the resolution requested by the user is reached.
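
The two decoding procedures above share one pattern: decode the lowest layer of every slice, then keep ascending the hierarchy only for the slices the user selected. A minimal sketch of that ordering follows; the function names and the representation of a decoding step as a `(layer, slice)` pair are hypothetical, and no actual decoding is performed.

```python
from typing import Dict, Iterable, List, Set, Tuple


def plan_decoding(slice_ids: Iterable[int], selected: Set[int],
                  target_layer: int) -> Dict[int, int]:
    """Map each slice to the highest layer that should be decoded for it.

    Unselected slices stop at layer 0 (the lowest layer); selected slices
    go up to target_layer.
    """
    return {s: (target_layer if s in selected else 0) for s in slice_ids}


def decode_order(plan: Dict[int, int]) -> List[Tuple[int, int]]:
    """List (layer, slice) steps: layer 0 for all slices, then upward."""
    top = max(plan.values(), default=0)
    steps: List[Tuple[int, int]] = []
    for layer in range(top + 1):
        for slice_id in sorted(plan):
            if plan[slice_id] >= layer:
                steps.append((layer, slice_id))
    return steps


# Four slices; the user selects slice 2 and asks for two layers above the lowest.
plan = plan_decoding(range(4), selected={2}, target_layer=2)
steps = decode_order(plan)
```

The same ordering applies whether the layers being ascended are SNR layers (higher image quality) or spatial layers (higher resolution).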

Normal ROI regions and non-ROI regions do not need the interactivity of an interactive ROI region, in which the position and size of the region of interest can be designated freely. The base layer processing block 120 therefore basically does not divide them into slices; it assigns the whole normal ROI region and the whole non-ROI region to one slice each and encodes them. Normal ROI regions and non-ROI regions may nevertheless be divided into slices and encoded when this is needed for purposes other than interactivity.

The image dividing unit 10b supplies the image data of the base layer frame to the MCTF unit 20b slice by slice. The MCTF unit 20b operates when a slice is to be temporally scalably encoded. The MCTF unit 20b performs motion compensated temporal filtering according to the MCTF technique: it obtains motion vectors from the moving image frames and performs temporal filtering using those motion vectors. The temporal filtering is performed using the Haar wavelet transform. As a result, the frames are decomposed into a plurality of levels with different frame rates, each level containing high-band frames and low-band frames. The decomposed high-band and low-band frames are held in memory for each level, as are the motion vectors.

When processing in the MCTF unit 20b is complete, the high-frequency frames of all hierarchy levels and the low-frequency frame of the final level are sent to the prediction unit 24b, and the motion vectors of all levels are sent to the motion encoding unit 22b.

予測部２４ｂは、画像フレームのフレーム内予測を行い、フレーム内予測誤差画像をＤＣＴ部２６ｂに与える。ＤＣＴ部２６ｂは、予測部２４ｂから供給されたフレーム内予測誤差画像を離散コサイン変換（ＤＣＴ）し、得られたＤＣＴ係数を量子化部２８ｂに与える。量子化部２８ｂは、ＤＣＴ係数を量子化し、可変長符号化部３０ｂに与える。 The prediction unit 24b performs intra-frame prediction of an image frame, and provides an intra-frame prediction error image to the DCT unit 26b. The DCT unit 26b performs discrete cosine transform (DCT) on the intra-frame prediction error image supplied from the prediction unit 24b, and gives the obtained DCT coefficient to the quantization unit 28b. The quantization unit 28b quantizes the DCT coefficient and provides it to the variable length coding unit 30b.

The variable length coding unit 30b receives the ROI region information of the base layer from the ROI setting unit 14, the slice information of the base layer from the image dividing unit 10b, and the quantized DCT coefficients of the difference image from the quantization unit 28b. It variable-length codes the ROI region information of the base layer, the slice information of the base layer, and the DCT coefficients, and supplies the result to the multiplexing unit 18. The slice information is needed when a frame image is decoded, to identify the slice groups and the area of each slice. The ROI region information is needed at decoding time to identify the normal ROI region, the interactive ROI region, and the non-ROI region.

ＳＮＲスケーラブル符号化を行う場合は、複数のビットプレーンの内、切り捨てる下位ビットプレーンの数を変えたり、量子化ステップを変えることで、階層毎に異なる画質の符号化データを生成する。 When performing SNR scalable encoding, encoded data with different image quality is generated for each layer by changing the number of lower-order bit planes to be discarded or changing the quantization step.
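Both ways of varying image quality named above can be sketched on integer coefficients. This is an illustrative sketch under assumptions: the sign-magnitude handling, the truncation toward zero, and the function names `drop_bitplanes` and `quantize` are choices made for the example and are not taken from the embodiment.

```python
def drop_bitplanes(coeff, n):
    """Zero the n least significant bit planes of a coefficient's magnitude."""
    sign = -1 if coeff < 0 else 1
    return sign * ((abs(coeff) >> n) << n)


def quantize(coeff, step):
    """Uniform quantization (truncating toward zero) followed by reconstruction."""
    sign = -1 if coeff < 0 else 1
    return sign * (abs(coeff) // step) * step


coeffs = [37, -21, 6, -3]
coarse = [drop_bitplanes(c, 3) for c in coeffs]   # lower-quality layer
fine = [drop_bitplanes(c, 1) for c in coeffs]     # higher-quality layer
```

Discarding n bit planes gives the same reconstruction as quantizing with a step of 2**n in this sketch, which is why the two options can be treated as interchangeable controls.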

The motion encoding unit 22b encodes the motion vector information supplied from the MCTF unit 20b and supplies it to the multiplexing unit 18.

For spatial scalable coding, the motion encoding unit 22b and the prediction unit 24b of the base layer processing block 120 supply the motion vector of each frame and the intra-frame prediction error image in the base layer to the motion encoding unit 22a and the interpolation processing unit 32 of the enhancement layer processing block 110, respectively.

次に、拡張レイヤ処理ブロック１１０の各構成を説明する。 Next, each configuration of the enhancement layer processing block 110 will be described.

画像分割部１０ａは、解像度変換部１２から拡張レイヤのフレーム画像のデータを受け取り、ＲＯＩ設定部１４から拡張レイヤに対して設定されたＲＯＩ領域情報を受け取る。基本レイヤの画像が低解像度であるのに対して、拡張レイヤの画像は高解像度である。画像分割部１０ａは、ＲＯＩ設定部１４から与えられた拡張レイヤのＲＯＩ領域情報にしたがって、入力されたフレームの領域を複数のスライスに分割する。基本レイヤと拡張レイヤでは異なるＲＯＩ領域が設定されるため、拡張レイヤと基本レイヤでは異なるスライス分割がなされることになる。 The image dividing unit 10a receives the frame image data of the enhancement layer from the resolution conversion unit 12, and receives the ROI region information set for the enhancement layer from the ROI setting unit 14. The base layer image has a low resolution, whereas the enhancement layer image has a high resolution. The image dividing unit 10 a divides the input frame region into a plurality of slices according to the ROI region information of the enhancement layer given from the ROI setting unit 14. Since different ROI regions are set in the base layer and the enhancement layer, different slice divisions are performed in the enhancement layer and the base layer.

拡張レイヤ処理ブロック１１０による拡張レイヤの各スライスの符号化処理は、基本的には基本レイヤ処理ブロック１２０における基本レイヤの各スライスの符号化処理と同じであり、スライス毎に独立した符号化を行うが、拡張レイヤ処理ブロック１１０は、基本レイヤ処理ブロック１２０の予測符号化結果を利用して、基本レイヤと拡張レイヤの差分情報だけを符号化する。 The encoding process of each slice of the enhancement layer by the enhancement layer processing block 110 is basically the same as the encoding process of each slice of the base layer in the base layer processing block 120, and independent coding is performed for each slice. However, the enhancement layer processing block 110 encodes only the difference information between the base layer and the enhancement layer using the prediction coding result of the base layer processing block 120.

Note that because different ROI regions are set in the base layer and the enhancement layer, the areas that correspond to each other across layers during inter-layer differential encoding are not necessarily the same kind of ROI region. For example, an area that is an ROI region in the enhancement layer may correspond to a non-ROI region in the base layer, and conversely an area that is a non-ROI region in the enhancement layer may correspond to an ROI region in the base layer. Accordingly, when differentially encoding each slice of the enhancement layer, the enhancement layer processing block 110 takes the difference against the base layer area that corresponds to the area of that slice.
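Finding the base layer area that corresponds to an enhancement layer slice can be sketched as a coordinate mapping followed by an overlap test. The `(x, y, w, h)` rectangle format, the integer scale factor, and the function names are assumptions made for this illustration.

```python
def to_base_coords(rect, scale):
    """Map an enhancement-layer rectangle to the base-layer area covering it."""
    x, y, w, h = rect
    x0, y0 = x // scale, y // scale
    x1 = -(-(x + w) // scale)   # ceiling division so the whole slice is covered
    y1 = -(-(y + h) // scale)
    return (x0, y0, x1 - x0, y1 - y0)


def intersection_area(a, b):
    """Area shared by two rectangles, 0 if they do not overlap."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    w = min(ax + aw, bx + bw) - max(ax, bx)
    h = min(ay + ah, by + bh) - max(ay, by)
    return max(w, 0) * max(h, 0)


# Enhancement-layer ROI slice at twice the base-layer resolution (assumed).
enh_slice = (32, 16, 64, 48)
base_area = to_base_coords(enh_slice, scale=2)
base_roi = (0, 0, 24, 16)                       # hypothetical base-layer ROI
inside_roi = intersection_area(base_area, base_roi)
outside_roi = base_area[2] * base_area[3] - inside_roi
```

In this example, most of the corresponding base layer area lies in the non-ROI region, so the inter-layer difference for a single enhancement layer slice draws on both kinds of base layer region.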

The MCTF unit 20a of the enhancement layer processing block 110 applies the same motion-compensated temporal filtering as the MCTF unit 20b of the base layer processing block 120 to each slice of the enhancement layer image, and supplies the motion vector information to the motion encoding unit 22a and the encoded data to the prediction unit 24a. The motion encoding unit 22a of the enhancement layer processing block 110 receives the motion vector information of the base layer image from the motion encoding unit 22b of the base layer processing block 120. It then performs differential encoding between the motion vector information of each slice of the enhancement layer and the motion vector information of the corresponding area of the base layer, and supplies the inter-layer differentially encoded motion vector information to the multiplexing unit 18.

When motion vector information is differentially encoded between the base layer and the enhancement layer, the motion vectors of the base layer are scaled up to match the resolution of the enhancement layer. For example, when the height and width of an enhancement layer area are each twice the height and width of the corresponding base layer area, the motion vector obtained for the corresponding base layer area is doubled in both the height direction and the width direction. The motion encoding unit 22a of the enhancement layer processing block 110 takes the difference between the base layer motion vector, scaled in this way to the enhancement layer resolution, and the enhancement layer motion vector, and encodes that difference. Differentially encoding motion vector information across layers in this way reduces the code amount of the motion vector information compared with encoding the motion vector information of each enhancement layer area as it is.
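The scaling and differencing of motion vectors can be sketched as follows. The tuple representation `(dx, dy)` and the function names are assumptions for the example; only the doubling for a 2x resolution ratio comes from the text.

```python
def mv_residual(enh_mv, base_mv, scale_x=2, scale_y=2):
    """Difference between an enhancement-layer MV and the scaled base-layer MV."""
    predicted = (base_mv[0] * scale_x, base_mv[1] * scale_y)
    return (enh_mv[0] - predicted[0], enh_mv[1] - predicted[1])


def mv_reconstruct(residual, base_mv, scale_x=2, scale_y=2):
    """Decoder side: add the residual back onto the scaled base-layer MV."""
    return (base_mv[0] * scale_x + residual[0], base_mv[1] * scale_y + residual[1])


base_mv = (3, -2)      # motion vector of the corresponding base-layer area
enh_mv = (7, -5)       # motion vector found in the enhancement layer
residual = mv_residual(enh_mv, base_mv)
restored = mv_reconstruct(residual, base_mv)
```

When motion is consistent across resolutions, the residual stays close to zero, which is where the saving in code amount comes from.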

内挿処理部３２は、基本レイヤ処理ブロック１２０の予測部２４ｂから基本レイヤの各領域の予測誤差画像を受け取り、拡張レイヤの解像度に合わせるために画素を内挿する処理を行う。内挿処理部３２は、内挿処理が施された基本レイヤの予測誤差画像を拡張レイヤ処理ブロック１１０の予測部２４ａに与える。 The interpolation processing unit 32 receives a prediction error image of each region of the base layer from the prediction unit 24b of the base layer processing block 120, and performs a process of interpolating pixels to match the resolution of the enhancement layer. The interpolation processing unit 32 gives the prediction error image of the base layer subjected to the interpolation processing to the prediction unit 24a of the enhancement layer processing block 110.

拡張レイヤ処理ブロック１１０の予測部２４ａは、ＭＣＴＦ部２０ａから与えられた画像フレームをフレーム内予測符号化する。さらに、拡張レイヤ処理ブロック１１０の予測部２４ａは、拡張レイヤの予測誤差画像と、拡張レイヤの解像度に合うように内挿された基本レイヤの予測誤差画像との間で差分符号化を行う。階層間で予測誤差画像の差分符号化を行うことにより、符号量を減らすことができる。 The prediction unit 24a of the enhancement layer processing block 110 performs intraframe prediction encoding on the image frame provided from the MCTF unit 20a. Further, the prediction unit 24 a of the enhancement layer processing block 110 performs differential encoding between the prediction error image of the enhancement layer and the prediction error image of the base layer that is interpolated to match the resolution of the enhancement layer. By performing differential encoding of prediction error images between layers, the amount of codes can be reduced.
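The interpolation and inter-layer differencing of prediction error images can be sketched as below. The source does not specify the interpolation filter, so nearest-neighbor repetition is assumed here purely for simplicity; the function names and the list-of-rows image format are also hypothetical.

```python
def upsample_nearest(img, factor):
    """Enlarge a 2-D image by repeating each pixel factor x factor times."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def inter_layer_residual(enh, base, factor=2):
    """Enhancement-layer prediction error minus the interpolated base-layer one."""
    up = upsample_nearest(base, factor)
    return [[e - b for e, b in zip(er, br)] for er, br in zip(enh, up)]


base_err = [[4, 8]]                          # base-layer prediction error (1x2)
enh_err = [[5, 4, 9, 8], [4, 3, 8, 7]]       # enhancement-layer error (2x4)
diff = inter_layer_residual(enh_err, base_err)
```

The values left in `diff` are much smaller than those in `enh_err`, which illustrates why the inter-layer difference is cheaper to encode.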

拡張レイヤ処理ブロック１１０のＤＣＴ部２６ａおよび量子化部２８ａによる処理は、基本レイヤ処理ブロック１２０のＤＣＴ部２６ｂおよび量子化部２８ｂによる処理と同じである。 The processing by the DCT unit 26a and the quantization unit 28a of the enhancement layer processing block 110 is the same as the processing by the DCT unit 26b and the quantization unit 28b of the base layer processing block 120.

拡張レイヤ処理ブロック１１０の可変長符号化部３０ａは、ＲＯＩ設定部１４から拡張レイヤのＲＯＩ領域情報を受け取り、画像分割部１０ａから拡張レイヤのスライス情報を受け取り、量子化部２８ａから予測誤差画像の量子化されたＤＣＴ係数を受け取る。可変長符号化部３０ａは、拡張レイヤのＲＯＩ領域情報、拡張レイヤのスライス情報、およびＤＣＴ係数を可変長符号化し、多重化部１８に与える。 The variable length coding unit 30a of the enhancement layer processing block 110 receives enhancement layer ROI region information from the ROI setting unit 14, receives enhancement layer slice information from the image segmentation unit 10a, and receives a prediction error image from the quantization unit 28a. Receive quantized DCT coefficients. The variable length coding unit 30 a performs variable length coding on the enhancement layer ROI region information, enhancement layer slice information, and DCT coefficients, and supplies the result to the multiplexing unit 18.

The multiplexing unit 18 generates and outputs an encoded stream that combines the base layer encoded data supplied from the base layer processing block 120 and the enhancement layer encoded data supplied from the enhancement layer processing block 110. The encoded data of each layer includes image data, motion vector information, ROI region information, and slice information.

なお、本実施の形態では、各レイヤのＲＯＩ領域情報とスライス情報を可変長符号化部３０ａ、３０ｂにおいて符号化したが、各レイヤのＲＯＩ領域情報とスライス情報は符号化せずに、多重化部１８に与えて、符号化ストリームのヘッダに付加するようにしてもよい。 In this embodiment, the ROI region information and slice information of each layer are encoded by the variable length encoding units 30a and 30b. However, the ROI region information and slice information of each layer are multiplexed without being encoded. It may be given to the unit 18 and added to the header of the encoded stream.

上記では、基本レイヤ処理ブロック１２０と拡張レイヤ処理ブロック１１０とを別々に設け、それぞれ基本レイヤの低解像度画像、拡張レイヤの高解像度画像を符号化する構成を説明したが、基本レイヤ処理ブロック１２０と拡張レイヤ処理ブロック１１０で共通する構成要素は基本レイヤと拡張レイヤの間で共有してもよい。たとえば、基本レイヤ処理ブロック１２０の構成だけを設け、基本レイヤ処理ブロック１２０において基本レイヤの符号化を行い、基本レイヤにおける予測誤差画像と動きベクトル情報をメモリに保持する。次に、メモリに保持された基本レイヤの符号化結果を利用して、拡張レイヤの符号化処理を基本レイヤ処理ブロック１２０において実行する。このように基本レイヤにおける符号化処理の構成を拡張レイヤに流用すれば、符号化装置１００の回路規模を小さくすることができる。 In the above description, the base layer processing block 120 and the enhancement layer processing block 110 are separately provided, and the configuration for encoding the base layer low resolution image and the enhancement layer high resolution image has been described. Components common to the enhancement layer processing block 110 may be shared between the base layer and the enhancement layer. For example, only the configuration of the base layer processing block 120 is provided, the base layer is encoded in the base layer processing block 120, and the prediction error image and motion vector information in the base layer are held in the memory. Next, using the base layer encoding result stored in the memory, the enhancement layer encoding process is executed in the base layer processing block 120. Thus, if the configuration of the encoding process in the base layer is diverted to the enhancement layer, the circuit scale of the encoding device 100 can be reduced.

The description above covers the case where the spatial scalability hierarchy consists of two layers, the base layer and the enhancement layer, but three or more spatial scalability layers may be provided. In that case, the base layer processing block 120 is provided for the lowest layer and the configuration of the enhancement layer processing block 110 is provided for each of the other layers, so that lower layers encode lower-resolution images. The prediction error image and the motion vector information are passed from each lower layer to the layer above it, and differential encoding is performed in each layer. Alternatively, only the base layer processing block 120 may be provided and reused for each layer in turn, so that the layers are encoded sequentially.
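The sequential variant, in which one processing routine is reused layer by layer, can be sketched with a toy one-dimensional signal. This is a strong simplification: it passes the lower layer's samples upward rather than prediction error images and motion vectors, it is lossless, and each layer is assumed to have exactly twice the resolution of the one below. All names are hypothetical.

```python
def upsample(signal):
    """Double the length of a 1-D signal by repeating each sample."""
    return [v for v in signal for _ in range(2)]


def encode_layer(image, reference):
    """Encode one layer: as-is for the lowest layer, otherwise as a difference."""
    if reference is None:
        return list(image)
    prediction = upsample(reference)
    return [a - b for a, b in zip(image, prediction)]


def encode_all(layers):
    """Run the same routine over every layer, lowest resolution first."""
    coded, reference = [], None
    for image in layers:
        coded.append(encode_layer(image, reference))
        reference = image   # the lower layer's result feeds the next layer
    return coded


layers = [[10, 20], [11, 9, 21, 19], [11, 12, 9, 9, 20, 22, 19, 18]]
coded = encode_all(layers)
```

Only the lowest layer carries full-magnitude values; each higher layer stores small differences against the layer directly below it.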

以下、符号化装置１００によりレイヤ単位でＲＯＩ領域を設定して画像を符号化する例を説明する。 Hereinafter, an example in which an image is encoded by setting an ROI region in units of layers by the encoding device 100 will be described.

First, for comparison, the case where an image is encoded without setting ROI regions on a per-layer basis is described. FIG. 2 shows an example in which ROI regions common to both the base layer and the enhancement layer are set for encoding. A normal ROI region 202a and an interactive ROI region 204a are set in the base layer 200a of the image, and a normal ROI region 202b and an interactive ROI region 204b are set at the same positions in the enhancement layer 200b.

Thus, when ROI regions are designated for an image in SVC, ROI regions common to both the base layer and the enhancement layer are normally set, and the image is divided into slices according to those ROI regions in each layer.

In the example of FIG. 2, the image in each layer is divided into the normal ROI regions 202a and 202b, the interactive ROI regions 204a and 204b, and the remaining non-ROI region, and slices are assigned to each region. The area where the normal ROI regions 202a and 202b overlap the interactive ROI regions 204a and 204b (the hatched area) is processed as an independent region separate from both. In each layer the image is thus divided into four regions, namely the normal ROI region, the interactive ROI region, the non-ROI region, and the overlap between the normal ROI region and the interactive ROI region, and slices are assigned accordingly. The interactive ROI regions 204a and 204b are further divided into small areas to provide interactivity, and a slice is assigned to each small area.

このように、通常のＳＶＣでは、ＲＯＩ領域を設定すると全レイヤについて同じ領域分割がなされ、レイヤ毎にその領域分割にしたがって領域単位の符号化がなされるため、符号化効率が落ち、また、処理負荷も大きくなる。そこで、本実施の形態では、レイヤ単位でＲＯＩ領域の設定を異ならせる。 As described above, in the normal SVC, when the ROI region is set, the same region division is performed for all layers, and the coding is performed in units of regions according to the region division for each layer. The load also increases. Therefore, in the present embodiment, the setting of the ROI area is made different for each layer.

図３（ａ）、（ｂ）は、基本レイヤと拡張レイヤでＲＯＩ領域の設定を異ならせて符号化する例を示す。図３（ａ）では、基本レイヤ２００ａにはＲＯＩ領域は設定されず、拡張レイヤ２００ｂにおいてのみ通常ＲＯＩ領域２０２ｂが設定されている。基本レイヤ２００ａは、ＲＯＩ領域が設定されていないため、領域分割することなく、画像の全体領域を符号化することができ、フレーム内の差分符号化の効率を上げることができる。 FIGS. 3A and 3B show an example in which the ROI region is set differently in the base layer and the enhancement layer. In FIG. 3A, the ROI area is not set in the base layer 200a, and the normal ROI area 202b is set only in the enhancement layer 200b. Since the ROI region is not set in the base layer 200a, the entire region of the image can be encoded without dividing the region, and the efficiency of differential encoding within the frame can be increased.

一方、拡張レイヤ２００ｂは、通常ＲＯＩ領域２０２ｂが設定されているため、通常ＲＯＩ領域２０２ｂと、それ以外の非ＲＯＩ領域の２つに分けて、領域毎に独立した符号化をすることになる。また、レイヤ間での差分符号化は、通常ＲＯＩ領域２０２ｂについては、基本レイヤ２００ａの同一位置の領域との差分を取って符号化することによりなされる。 On the other hand, since the normal ROI area 202b is set in the enhancement layer 200b, the normal ROI area 202b and the other non-ROI areas are divided into two and encoded independently for each area. Also, differential encoding between layers is performed by taking the difference between the normal ROI region 202b and the region at the same position of the base layer 200a.

図３（ｂ）は、基本レイヤ２００ａには通常ＲＯＩ領域２０２ａが設定され、拡張レイヤ２００ｂにはインタラクティブＲＯＩ領域２０４ｂが設定されている。基本レイヤ２００ａは、通常ＲＯＩ領域２０２ａとそれ以外の非ＲＯＩ領域の２つに分割されて符号化され、拡張レイヤ２００ｂは、インタラクティブＲＯＩ領域２０４ｂとそれ以外の非ＲＯＩ領域の２つに分割されて符号化される。 In FIG. 3B, the normal ROI area 202a is set in the base layer 200a, and the interactive ROI area 204b is set in the enhancement layer 200b. The base layer 200a is divided and encoded into a normal ROI area 202a and other non-ROI areas, and the enhancement layer 200b is divided into an interactive ROI area 204b and other non-ROI areas. Encoded.

Compare FIG. 2 with FIG. 3(b). In FIG. 2, both the base layer 200a and the enhancement layer 200b must be encoded as four separate regions, including the overlap between the normal ROI region and the interactive ROI region. In contrast, when the normal ROI region is set in the base layer 200a and the interactive ROI region is set in the enhancement layer 200b as in FIG. 3(b), the overlap between the two no longer needs to be distinguished and encoded separately, and the base layer 200a and the enhancement layer 200b each need only be encoded as two regions. This improves coding efficiency and reduces the processing load.
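The region counts in this comparison can be checked with a small sketch that labels every pixel by the set of ROI regions covering it and counts the distinct labels. The rectangle coordinates and the 16x16 frame are made-up values, the same coordinates are used for both layers for simplicity, and the further subdivision of an interactive ROI region into small slices is ignored.

```python
def count_regions(width, height, rois):
    """Count distinct region types: one per combination of covering ROIs."""
    labels = set()
    for y in range(height):
        for x in range(width):
            labels.add(frozenset(
                name for name, (rx, ry, rw, rh) in rois.items()
                if rx <= x < rx + rw and ry <= y < ry + rh
            ))
    return len(labels)


normal = (2, 2, 6, 6)         # hypothetical normal ROI rectangle
interactive = (5, 5, 6, 6)    # hypothetical interactive ROI, overlapping it

# Common setting in every layer, as in FIG. 2.
shared = count_regions(16, 16, {"normal": normal, "interactive": interactive})

# Per-layer setting, as in FIG. 3(b).
base_only = count_regions(16, 16, {"normal": normal})
enh_only = count_regions(16, 16, {"interactive": interactive})
```

With the common setting each layer has four region types (non-ROI, normal, interactive, and the overlap), whereas the per-layer setting leaves two in each layer.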

図３（ａ）の例では、通常ＲＯＩ領域２０２ｂは拡張レイヤ２００ｂに設けられているため、通常ＲＯＩ領域について空間スケーラビリティがあり、その領域を高解像度で表示することが可能である。 In the example of FIG. 3A, since the normal ROI region 202b is provided in the enhancement layer 200b, the normal ROI region has spatial scalability, and the region can be displayed with high resolution.

図３（ｂ）の例では、インタラクティブＲＯＩ領域２０４ｂは拡張レイヤ２００ｂに設けられているため、インタラクティブＲＯＩ領域については空間スケーラビリティがあり、高解像度で表示できるが、通常ＲＯＩ領域２０２ａは基本レイヤ２００ａに設定されているため、通常ＲＯＩ領域については空間スケーラビリティがなく、低解像度の表示しかできない。そこで、図３（ｂ）の例において、レイヤの数をさらに増やして、通常ＲＯＩ領域についても空間スケーラビリティをもたせるようにしてもよい。 In the example of FIG. 3B, since the interactive ROI area 204b is provided in the enhancement layer 200b, the interactive ROI area has spatial scalability and can be displayed with high resolution, but the normal ROI area 202a is in the base layer 200a. Since it is set, the normal ROI area has no spatial scalability and can only display at low resolution. Therefore, in the example of FIG. 3B, the number of layers may be further increased so that the normal ROI region also has spatial scalability.

FIG. 4 shows an example of encoding with different ROI region settings in three layers. Compared with the example of FIG. 3(b), one layer is added: no ROI region is set in the base layer 200a, the normal ROI region 202b is set in the first enhancement layer 200b, and the interactive ROI region 204c is set in the second enhancement layer 200c. The base layer 200a is a low-resolution image, the first enhancement layer 200b a medium-resolution image, and the second enhancement layer 200c a high-resolution image. Distributing the ROI regions across three layers in this way gives spatial scalability to both the normal ROI region and the interactive ROI region.

図５〜図８は、レイヤ単位のＲＯＩ領域の設定の他の例を示す。図５〜図８では、簡単のため、通常ＲＯＩ領域とインタラクティブＲＯＩ領域を区別せずに、単にＲＯＩ領域として説明するが、いずれの場合も通常ＲＯＩ領域のみの設定、インタラクティブＲＯＩ領域のみの設定、通常ＲＯＩ領域とインタラクティブＲＯＩ領域が混在する設定のいずれであってもよい。 5 to 8 show other examples of setting ROI areas in units of layers. 5 to 8, for the sake of simplicity, the normal ROI region and the interactive ROI region are not distinguished from each other and are simply described as ROI regions. In either case, only the normal ROI region is set, only the interactive ROI region is set, Any setting in which a normal ROI area and an interactive ROI area are mixed may be used.

図５は、基本レイヤではＲＯＩ領域を設定せず、拡張レイヤでＲＯＩ領域を設定する例を示す。基本レイヤ２００ａにはＲＯＩ領域が設定されず、第１の拡張レイヤ２００ｂにＲＯＩ領域２１０ｂが設定され、第２の拡張レイヤ２００ｃにも第１の拡張レイヤ２００ｂのＲＯＩ領域２１０ｂと同一位置にＲＯＩ領域２１０ｃが設定されている。 FIG. 5 shows an example in which the ROI area is not set in the base layer but the ROI area is set in the enhancement layer. The ROI area is not set in the base layer 200a, the ROI area 210b is set in the first enhancement layer 200b, and the ROI area is also set in the second enhancement layer 200c at the same position as the ROI area 210b of the first enhancement layer 200b. 210c is set.

When the ROI region 210b of the first enhancement layer 200b is encoded, the difference from the corresponding area of the base layer 200a is encoded. When the ROI region 210c of the second enhancement layer 200c is encoded, the difference from the ROI region 210b of the first enhancement layer 200b is encoded, because the ROI region 210b lies at the same position in the first enhancement layer 200b. The area designated as an ROI region is scalable across three levels of spatial resolution, yet because no ROI region is set in the base layer 200a, the base layer 200a can be encoded without being divided.

図６は、レイヤ単位で設定されたＲＯＩ領域がレイヤ間で入れ子構造をもつ例を示す。基本レイヤ２００ａにＲＯＩ領域２２０ａが設定され、第１の拡張レイヤ２００ｂには、基本レイヤ２００ａのＲＯＩ領域２２０ａよりも広いＲＯＩ領域２２２ｂが設定される。第２の拡張レイヤ２００ｃには、第１の拡張レイヤ２００ｂのＲＯＩ領域２２２ｂよりもさらに広いＲＯＩ領域２２４ｃが設定される。 FIG. 6 shows an example in which the ROI area set in units of layers has a nested structure between layers. The ROI area 220a is set in the base layer 200a, and the ROI area 222b wider than the ROI area 220a of the base layer 200a is set in the first enhancement layer 200b. In the second enhancement layer 200c, a ROI region 224c that is wider than the ROI region 222b of the first enhancement layer 200b is set.

第１の拡張レイヤ２００ｂのＲＯＩ領域２２２ｂを符号化する際、基本レイヤ２００ａのＲＯＩ領域２２０ａと重なる部分については、基本レイヤ２００ａのＲＯＩ領域２２０ａとの間で差分を計算するが、基本レイヤ２００ａのＲＯＩ領域２２０ａと重ならない外側の部分については、基本レイヤ２００ａの非ＲＯＩ領域との差分を計算する。第２の拡張レイヤ２００ｃのＲＯＩ領域２２４ｃを符号化する場合も同じである。 When the ROI region 222b of the first enhancement layer 200b is encoded, a difference between the ROI region 220a of the base layer 200a is calculated for the portion overlapping the ROI region 220a of the base layer 200a. For the outer portion that does not overlap the ROI region 220a, the difference from the non-ROI region of the base layer 200a is calculated. The same applies when the ROI region 224c of the second enhancement layer 200c is encoded.

図６では、基本レイヤで指定したＲＯＩ領域を含む広いＲＯＩ領域を拡張レイヤで指定することでＲＯＩ領域に入れ子構造をもたせたが、逆に、基本レイヤで指定したＲＯＩ領域に含まれる狭いＲＯＩ領域を拡張レイヤで指定することでＲＯＩ領域に入れ子構造をもたせてもよい。後者の場合、中心部ほど高解像度の画像によるスケーラビリティをもたせることができる。 In FIG. 6, a wide ROI area including the ROI area specified in the base layer is specified in the extension layer so that the ROI area is nested. On the contrary, the narrow ROI area included in the ROI area specified in the base layer May be nested in the ROI area by designating in the enhancement layer. In the latter case, the center portion can be provided with scalability by a high-resolution image.

図７は、レイヤ毎にＲＯＩ領域の位置が異なる設定例を示す。基本レイヤ２００ａにＲＯＩ領域２３０ａが設定され、第１の拡張レイヤ２００ｂには、基本レイヤ２００ａのＲＯＩ領域２３０ａと一部が重なる位置にＲＯＩ領域２３２ｂが設定される。さらに、第２の拡張レイヤ２００ｃには、第１の拡張レイヤ２００ｂのＲＯＩ領域２３２ｂとは異なる位置にＲＯＩ領域２３４ｃが設定される。 FIG. 7 shows a setting example in which the position of the ROI region is different for each layer. The ROI area 230a is set in the base layer 200a, and the ROI area 232b is set in the first enhancement layer 200b at a position partially overlapping with the ROI area 230a of the base layer 200a. Further, the ROI region 234c is set in the second enhancement layer 200c at a position different from the ROI region 232b of the first enhancement layer 200b.

異なる位置に複数のＲＯＩ領域を指定する場合でも、レイヤ単位でＲＯＩ領域を分けて設定するため、各レイヤにおける領域分割数を少なくすることができる。また、ＲＯＩ領域をレイヤ別に設けることで、ＲＯＩ領域にレイヤ間で重なりがあっても、同一レイヤでＲＯＩ領域が重なりをもつ場合を減らすことができるため、領域の分割数の増加を抑え、符号化効率を高めることができる。 Even when a plurality of ROI areas are designated at different positions, the ROI areas are set separately for each layer, so that the number of area divisions in each layer can be reduced. Also, by providing ROI regions for each layer, even if there are overlaps between layers in the ROI region, it is possible to reduce the number of ROI regions that overlap in the same layer. Efficiency can be increased.

FIG. 8 shows an example in which a different number of ROI regions is set in each layer. The first ROI regions 240a, 240b, and 240c are provided in all layers, namely the base layer 200a, the first enhancement layer 200b, and the second enhancement layer 200c. The second ROI regions 242b and 242c are provided in the first enhancement layer 200b and the second enhancement layer 200c, and the third ROI region 244c is provided only in the second enhancement layer 200c.

The image of FIG. 8 contains three ROI regions, but only one ROI region is set and encoded in the base layer 200a and only two in the first enhancement layer 200b, so coding efficiency is better than when three ROI regions are set in every layer. The example of FIG. 8 shows the number of ROI regions increasing as the layer, and therefore the resolution, goes up. Conversely, more ROI regions may be set at lower resolutions.

以上述べたように、本実施の形態の符号化装置１００によれば、レイヤ毎に独立したＲＯＩ領域を設定してスライスに割り当てるため、各レイヤで見た場合に領域の分割数を減らして符号化することができ、符号化効率を高め、処理負荷を軽減できる。 As described above, according to coding apparatus 100 of the present embodiment, since an independent ROI region is set for each layer and assigned to a slice, the number of divided regions is reduced when viewed in each layer. Encoding efficiency can be improved and processing load can be reduced.

また、画像上でＲＯＩ領域が重なる場合でも、重なり合うＲＯＩ領域を異なるレイヤに分けて設定すれば、重なり合う領域を別のスライスに割り当てる必要がなくなるため、符号化効率が低下することがない。 Even when the ROI regions overlap each other on the image, if the overlapping ROI regions are divided and set in different layers, it is not necessary to assign the overlapping regions to different slices, so that the encoding efficiency does not decrease.

ＲＯＩ領域として、通常ＲＯＩ領域だけではなく、インタラクティブＲＯＩ領域も同様にレイヤ毎に設定して符号化することができ、また、通常ＲＯＩ領域とインタラクティブＲＯＩ領域が混在する場合にも同様に符号化することができる。 As the ROI area, not only the normal ROI area but also the interactive ROI area can be set and encoded for each layer in the same manner, and the normal ROI area and the interactive ROI area are also encoded in the same way. be able to.

The present invention has been described above on the basis of an embodiment. The embodiment is illustrative, and those skilled in the art will understand that various modifications of the combinations of its constituent elements and processing steps are possible and that such modifications also fall within the scope of the present invention.

上記の実施の形態では、各レイヤに設定されたＲＯＩ領域にしたがって画像をスライスに分割する例を説明したが、本発明は、ＲＯＩに限らず、何らかの目的で画像に領域を設定する場合に広く適用することができる。また、画像に領域を設定しない場合であっても、レイヤ毎に異なるスライスを設定して符号化する必要がある場合にも本発明は有効である。ＳＶＣでは全レイヤを通じて同一のスライスを設定して符号化し、基本レイヤにのみスライスグループのタイプを指定するビットを与えるのが通常であるが、レイヤ毎にスライスの形状や個数を変えて符号化する必要がある場合は、基本レイヤだけでなく拡張レイヤにもスライスグループのタイプを指定するビットを与えればよい。 In the above embodiment, an example in which an image is divided into slices according to the ROI area set for each layer has been described. However, the present invention is not limited to ROI, and is widely used when an area is set for an image for some purpose. Can be applied. In addition, even when an area is not set in an image, the present invention is also effective when it is necessary to set and encode different slices for each layer. In SVC, it is normal to set and encode the same slice through all layers, and to give bits specifying the type of slice group only to the base layer, but encode by changing the shape and number of slices for each layer. If necessary, a bit specifying the slice group type may be given not only to the base layer but also to the enhancement layer.

上記の実施の形態では、動画を例に階層符号化を説明したが、本発明は、静止画の階層符号化にも適用することができる。 In the above embodiment, hierarchical encoding has been described by taking a moving image as an example, but the present invention can also be applied to hierarchical encoding of still images.

FIG. 1 is a block diagram of the encoding device according to the embodiment.
FIG. 2 is a diagram showing an example of encoding with ROI regions common to both the base layer and the enhancement layer.
FIG. 3 is a diagram showing an example of encoding with different ROI region settings in the base layer and the enhancement layer.
FIG. 4 is a diagram showing an example of encoding with different ROI region settings in three layers.
FIG. 5 is a diagram showing an example in which no ROI region is set in the base layer and ROI regions are set in the enhancement layers.
FIG. 6 is a diagram showing an example in which the ROI regions set per layer are nested across layers.
FIG. 7 is a diagram showing a setting example in which the position of the ROI region differs from layer to layer.
FIG. 8 is a diagram showing another example of setting ROI regions per layer.

Explanation of symbols

10a, 10b image dividing unit; 12 resolution conversion unit; 14 ROI setting unit; 18 multiplexing unit; 20a, 20b MCTF unit; 22a, 22b motion encoding unit; 24a, 24b prediction unit; 26a, 26b DCT unit; 28a, 28b quantization unit; 30a, 30b variable length coding unit; 32 interpolation processing unit; 100 encoding device; 110 enhancement layer processing block; 120 base layer processing block.

Claims

An encoding method, wherein when an image is divided into a plurality of layers and encoded hierarchically, regions are set independently for each layer, and independent encoding is performed for each region in each layer.

The encoding method according to claim 1, wherein information specifying the region of each layer is included in encoded data of the image.

The encoding method according to claim 1 or 2, wherein a slice, which is an independent encoding unit in an image, is independently set for each layer according to the region set in each layer.

The encoding method according to claim 1, wherein when any two of a plurality of regions set in the image have an overlapping portion, the plurality of regions are set separately in different layers.

The encoding method according to claim 1, wherein the area is encoded with an image quality different from that of other areas.

The encoding method according to any one of claims 1 to 5, wherein the plurality of layers are a base layer and an enhancement layer other than the base layer in scalable hierarchical encoding.