JP5870835B2

JP5870835B2 - Moving image processing apparatus, moving image processing method, and moving image processing program

Info

Publication number: JP5870835B2
Application number: JP2012102735A
Authority: JP
Inventors: 渡部　康弘
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2012-04-27
Filing date: 2012-04-27
Publication date: 2016-03-01
Anticipated expiration: 2032-04-27
Also published as: JP2013232724A

Description

The present invention relates to a moving image processing apparatus, a moving image processing method, and a moving image processing program.

In video captured by a surveillance camera or the like, when a crime or incident occurs, a region of interest (ROI) such as the perpetrator's face must be enlarged and displayed in fine detail. When no crime or incident has occurred, there is no need to enlarge a specific region and display it in detail. Recording the entire image at high resolution requires an enormous storage capacity and raises the recording cost. A known technique makes the compression rate of the region of interest lower than that of the other regions, so that the image quality of the region of interest is higher than that of the other regions while the increase in the amount of data is suppressed.

An image reproducing device that selects the image data to decode according to the processing capability of its CPU has also been proposed (see, for example, Patent Document 1). The image reproducing device receives, for example, compressed reduced image data, obtained by compression-encoding a reduced image in which some pixels of the original image have been thinned out, and additional information, obtained by compression-encoding the thinned-out portion. A device whose CPU has a processing capability above a predetermined reference value restores both the compressed reduced image data and the additional information and generates a reproduced image of higher quality than the reduced image. A device whose CPU has a processing capability below the reference value restores only the compressed reduced image data to generate the reproduced image, so as to maintain a constant playback speed.

Patent Document 1: JP-A-8-289290

In the method that makes the compression rate of the region of interest lower than that of the other regions, the resolution of the region of interest is the same as that of the other regions. Consequently, even if the region of interest is enlarged for display, it is not necessarily displayed in fine detail. The method that selects the image data to decode according to the processing capability of the CPU requires data in which the entire image is recorded at high resolution (for example, the compressed reduced image data and the additional information), so the amount of data becomes enormous.

In one aspect, an object of the present invention is to allow a region of interest to be enlarged and displayed in fine detail while suppressing an increase in the amount of data.

In one aspect of the present invention, a moving image processing apparatus includes: a first image generation unit that generates first image data corresponding to a first image obtained by reducing an input image at a preset reduction rate; a second image generation unit that thins out a region of interest in the input image at a thinning rate corresponding to the reduction rate to divide it into a plurality of region-of-interest images, and generates second image data corresponding to a second image that includes at least one of the plurality of region-of-interest images; and an image encoding unit that encodes the first image data and the second image data to generate stream data and that, when encoding the second image data, can perform prediction processing based on the first image corresponding to the time associated with the second image.

The region of interest can be enlarged and displayed in fine detail while an increase in the amount of data is suppressed.

FIG. 1 shows an example of a moving image processing system according to one embodiment.
FIG. 2 shows an example of the moving image encoding device shown in FIG. 1.
FIG. 3 shows an example of the division of a region-of-interest image.
FIG. 4 shows an example of the allocation of region-of-interest images.
FIG. 5 shows another example of the allocation of region-of-interest images.
FIG. 6 shows another example of the allocation of region-of-interest images.
FIG. 7 shows an example of the reference relationship between frames during encoding.
FIG. 8 shows an example of the operation of the moving image encoding device shown in FIG. 2.
FIG. 9 shows an example of the moving image playback device shown in FIG. 1.
FIG. 10 shows an example of the operation of the moving image playback device shown in FIG. 9.
FIG. 11 shows an example of a moving image encoding device according to another embodiment.
FIG. 12 shows an example of the operation of the moving image encoding device shown in FIG. 11.
FIG. 13 shows an example of a stream configuration.
FIG. 14 shows an example of a moving image playback device corresponding to the moving image encoding device shown in FIG. 11.
FIG. 15 shows an example of the operation of the moving image playback device shown in FIG. 14.

Embodiments are described below with reference to the drawings.

FIG. 1 shows an example of a moving image processing system SYS according to one embodiment. The moving image processing system SYS includes a moving image encoding device 10, which is one form of the moving image processing apparatus, a stream storage unit 100 that stores the compressed stream STRM generated by the moving image encoding device 10, and a moving image playback device 110 that decodes the compressed stream STRM. The moving image playback device 110 is also one form of the moving image processing apparatus, as is the moving image processing system SYS itself.

The moving image encoding device 10 generates a compressed stream STRM based on image data IIMG, using an encoding method compliant with, for example, H.264. The image data IIMG is the image data of an input image. In the following, an image corresponding to image data is denoted by the same reference sign as the image data; for example, the sign IIMG denotes both the input image and the input image data. The moving image encoding device 10 includes, for example, a reduced image generation unit 20, an interest image generation unit 30, and an image encoding unit 40.

The reduced image generation unit 20 receives the image data IIMG and generates image data RIMG of an image obtained by reducing the input image IIMG (hereinafter also referred to as reduced image data RIMG). That is, the reduced image generation unit 20 generates the reduced image data RIMG corresponding to a reduced image RIMG obtained by reducing the input image IIMG at a preset reduction rate.

The interest image generation unit 30 receives the image data IIMG and generates image data SIMG for displaying a region of interest (ROI) in the input image IIMG (hereinafter also referred to as interest image data SIMG). The region of interest is, for example, a person's face. For example, the interest image generation unit 30 thins out the region of interest in the input image IIMG at a thinning rate corresponding to the reduction rate of the reduced image RIMG and thereby divides it into a plurality of region-of-interest images. The interest image generation unit 30 then generates the interest image data SIMG corresponding to an interest image SIMG that includes at least one of the plurality of region-of-interest images.

The interest image data SIMG thus contains, for example, image data for enlarging the region of interest and displaying it in fine detail. For example, the image of the region of interest generated from the interest image data SIMG has a higher resolution than the region of interest within the reduced image RIMG generated from the reduced image data RIMG.

The image encoding unit 40 encodes the reduced image data RIMG and the interest image data SIMG to generate the compressed stream STRM. The compressed stream STRM is, for example, stream data that includes data obtained by encoding the reduced image data RIMG and data obtained by encoding the interest image data SIMG. For example, the image encoding unit 40 receives the reduced image data RIMG generated by the reduced image generation unit 20 and the interest image data SIMG generated by the interest image generation unit 30. The image encoding unit 40 then encodes the reduced image data RIMG and the interest image data SIMG with an encoding method compliant with H.264 or the like and generates the compressed stream STRM.

For example, the image encoding unit 40 encodes the reduced image data RIMG and the interest image data SIMG with an encoding method compliant with H.264 MVC (Multiview Video Coding). In the following, data obtained by encoding the reduced image data RIMG is also referred to as the encoded data of the reduced image data RIMG, and data obtained by encoding the interest image data SIMG is also referred to as the encoded data of the interest image data SIMG.

In the encoding method compliant with H.264 MVC, the image encoding unit 40 encodes the reduced image data RIMG as base-view frames and the interest image data SIMG as non-base-view frames. For example, the interest image data SIMG is encoded with reference to the reduced image data RIMG or to other interest image data SIMG. A compressed stream STRM compliant with H.264 MVC is thereby generated. That is, when encoding the interest image data SIMG, the image encoding unit 40 can perform prediction processing based on the reduced image RIMG corresponding to the time associated with the interest image SIMG.

The compressed stream STRM generated by the moving image encoding device 10 is stored in the stream storage unit 100. The compressed stream STRM stored in the stream storage unit 100 is decoded by, for example, the moving image playback device 110, which reads the compressed stream STRM from the stream storage unit 100 and decodes it. That is, the moving image playback device 110 decodes the compressed stream STRM generated by the moving image encoding device 10. An image OIMG to be displayed on a display or the like is thereby generated.

For example, the moving image playback device 110 includes an image decoding unit 120, which functions as a decoder compliant with H.264 or the like, and a display image generation unit 130. The image decoding unit 120 decodes the compressed stream STRM received from the stream storage unit 100 and generates reduced image data RDIMG and interest image data SDIMG. When there is no instruction for fine display of a region of interest (for example, a fine display request or a region-of-interest designation), the image decoding unit 120 need not generate the interest image data SDIMG. That is, when there is no such instruction (for example, in normal operation), the image decoding unit 120 need not decode the encoded data of the interest image data SIMG.

The display image generation unit 130 receives the reduced image data RDIMG and the interest image data SDIMG generated by the image decoding unit 120. The display image generation unit 130 then, for example, combines the reduced image data RDIMG and the interest image data SDIMG to generate image data OIMG of a display image (hereinafter also referred to as display image data OIMG). When there is no instruction for fine display of a region of interest (in normal operation), the display image generation unit 130 outputs, for example, the reduced image data RDIMG as the display image data OIMG.

When there is an instruction for fine display of a region of interest, the display image generation unit 130, for example, extracts the region of interest designated by the user from the interest image SDIMG and combines the extracted region of interest with the reduced image RDIMG. Display image data OIMG that includes image data for displaying the region of interest of concern (the region designated by the user) in fine detail is thereby generated. In this embodiment, an image in which the region of interest of concern has a higher definition than the other regions can thus be displayed.

FIG. 2 shows an example of the moving image encoding device 10 shown in FIG. 1. In the example of FIG. 2, the image data IIMG and other data are transferred via a memory 50. For example, the moving image encoding device 10 includes the reduced image generation unit 20, the interest image generation unit 30, the image encoding unit 40, and the memory 50. The memory 50 may be provided inside a module such as the image encoding unit 40, or outside the moving image encoding device 10.

The memory 50 sequentially stores the image data IIMG of images IIMG captured by a digital video camera or the like. The reduced image generation unit 20 reads the image data IIMG from the memory 50, reduces the image IIMG at a preset reduction rate (for example, 1/2 horizontally and 1/2 vertically), and generates the reduced image data RIMG. The reduced image generation unit 20 then writes the reduced image data RIMG to the memory 50. The reduced image data RIMG is encoded by a first encoding unit 42 of the image encoding unit 40.
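The reduction step can be sketched as follows. This is a minimal illustration only: the text specifies the reduction rate (1/2 horizontally and vertically in the example) but not the filter, so averaging each 2x2 block is an assumption, and the function name `reduce_image` and the list-of-lists image representation are hypothetical.

```python
def reduce_image(iimg, step_x=2, step_y=2):
    """Shrink a 2-D list of pixel values by averaging step_y x step_x blocks.

    Block averaging is an assumed filter; the source only fixes the rate.
    """
    # Ignore trailing rows/columns that do not fill a whole block.
    height = len(iimg) - len(iimg) % step_y
    width = len(iimg[0]) - len(iimg[0]) % step_x
    rimg = []
    for y in range(0, height, step_y):
        row = []
        for x in range(0, width, step_x):
            block = [iimg[y + dy][x + dx]
                     for dy in range(step_y) for dx in range(step_x)]
            row.append(sum(block) // len(block))  # integer mean of the block
        rimg.append(row)
    return rimg


# A 4x4 input image IIMG reduced to a 2x2 reduced image RIMG.
iimg = [
    [10, 20, 30, 40],
    [30, 40, 50, 60],
    [0, 0, 8, 8],
    [4, 4, 8, 8],
]
rimg = reduce_image(iimg)
```

With the default steps, each output pixel summarizes four input pixels, so the reduced image holds one quarter of the samples of the input image.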

The interest image generation unit 30 includes, for example, a region-of-interest detection unit 32, a cutout unit 34, and an image composition unit 36. The region-of-interest detection unit 32 reads local decoded image data LDEC1 from the memory 50 and detects regions of interest using a face detection algorithm or the like. The local decoded image data LDEC1 is image data obtained by decoding the encoded data of the reduced image data RIMG, and is generated during the encoding process of the first encoding unit 42.

Because a predetermined algorithm such as a face detection algorithm is used, the number, positions, and sizes of the regions of interest detected by the region-of-interest detection unit 32 are uniquely determined for a given local decoded image LDEC1 (the detection target image). The region-of-interest detection unit 32 notifies the cutout unit 34 and the image composition unit 36 of region-of-interest information CINF about the detected regions of interest (their number, positions, sizes, and so on).

The cutout unit 34 reads the image data IIMG from the memory 50 and cuts out the region of interest from the image IIMG based on the region-of-interest information CINF. A region-of-interest image ROI corresponding to the region of interest in the image IIMG is thereby generated. In the following, the region of interest in the image IIMG that corresponds to a region-of-interest image ROI is denoted by the same reference sign as the region-of-interest image (ROI and so on).

The position (coordinates) and other values of the region of interest indicated by the region-of-interest information CINF are expressed relative to the image obtained by reducing the image IIMG (the local decoded image LDEC1). Therefore, when cutting out the region of interest ROI from the image IIMG, the cutout unit 34 corrects the position (coordinates) and other values indicated by the region-of-interest information CINF to values that match the image size of the image IIMG. For example, when the reduction rate of the reduced image RIMG is 1/2 in both the horizontal and vertical directions, the cutout unit 34 doubles the coordinates of the region of interest indicated by the region-of-interest information CINF in both the horizontal and vertical directions.
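The coordinate correction and cutout can be sketched as below. This is an illustrative sketch, not the patent's implementation: the `Roi` record, the function names, and the use of integer scale factors (the reciprocal of the reduction rate) are assumptions. Scaling the width and height along with the position is one reading of "position (coordinates) and other values".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Roi:
    """Hypothetical record for one entry of region-of-interest information CINF."""
    x: int
    y: int
    width: int
    height: int


def to_input_coordinates(roi, scale_x=2, scale_y=2):
    """Map an ROI detected on the reduced image back onto the input image IIMG.

    scale_x and scale_y are the reciprocals of the reduction rate
    (2 for a reduction rate of 1/2).
    """
    return Roi(roi.x * scale_x, roi.y * scale_y,
               roi.width * scale_x, roi.height * scale_y)


def cut_out(iimg, roi):
    """Return the pixels of IIMG inside the ROI as a 2-D list."""
    return [row[roi.x:roi.x + roi.width]
            for row in iimg[roi.y:roi.y + roi.height]]


# ROI detected on the reduced image at (3, 1) with size 2x2.
detected = Roi(x=3, y=1, width=2, height=2)
full = to_input_coordinates(detected)

# A 10x6 input image whose pixel value encodes its position as y * 10 + x.
iimg = [[y * 10 + x for x in range(10)] for y in range(6)]
patch = cut_out(iimg, full)
```

Here the detected 2x2 region maps to a 4x4 region of the input image starting at (6, 2).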

The cutout unit 34 outputs the image data ROI of the region of interest ROI cut out from the image IIMG (hereinafter also referred to as region-of-interest image data ROI) to the image composition unit 36. The cutout unit 34 may instead write the region-of-interest image data ROI to the memory 50, in which case the image composition unit 36 reads the region-of-interest image data ROI from the memory 50.

The image composition unit 36 generates the interest image data SIMG based on the region-of-interest image data ROI received from the cutout unit 34, and writes the interest image data SIMG to the memory 50. The interest image SIMG is generated, for example, with the same image size as the reduced image RIMG. For example, the image composition unit 36 thins out the pixels of the region-of-interest image ROI cut out by the cutout unit 34 at a thinning rate corresponding to the reduction rate of the reduced image RIMG relative to the image IIMG. Each resulting region-of-interest image (for example, the region-of-interest images SROI shown in FIG. 3) thereby has the same image size as the region of interest within the reduced image RIMG. In doing so, the image composition unit 36 generates, for example, a plurality of region-of-interest images that differ in which pixels are thinned out.

For example, when the reduction rate of the reduced image RIMG is 1/2 in both the horizontal and vertical directions, the image composition unit 36 subsamples the pixels of the region-of-interest image ROI at four sampling phases. As shown in FIG. 3, the region-of-interest image ROI is thereby divided into images of four sampling phases (region-of-interest images SROIa, SROIb, SROIc, and SROId). When a plurality of regions of interest ROI are detected, the image composition unit 36 performs this division (subsampling with different sampling phases) on each region-of-interest image ROI.

The image composition unit 36 then combines the divided region-of-interest images to generate the interest image data SIMG. The divided region-of-interest images are allocated, for example, to a single interest image SIMG, as shown in FIG. 4. Alternatively, they may be allocated to a plurality of interest images SIMG, as shown in FIG. 5 and FIG. 6.
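One way to picture the allocation of divided region-of-interest images to a single interest image SIMG is the sketch below. The actual layouts are those of FIG. 4 to FIG. 6, which are not reproduced in the text, so the left-to-right, top-to-bottom placement used here is purely an assumption, as are the function name `allocate` and the zero fill value.

```python
def allocate(sub_images, frame_width, frame_height, fill=0):
    """Place sub-images into one frame in raster order (assumed layout).

    Returns the frame and a list of (index, x, y) placements.
    """
    frame = [[fill] * frame_width for _ in range(frame_height)]
    placements = []
    x = y = shelf_height = 0
    for index, sub in enumerate(sub_images):
        h, w = len(sub), len(sub[0])
        if x + w > frame_width:
            # Current row of tiles is full: start a new one below it.
            x, y, shelf_height = 0, y + shelf_height, 0
        if w > frame_width or y + h > frame_height:
            raise ValueError("sub-image does not fit; use another SIMG frame")
        for dy in range(h):
            frame[y + dy][x:x + w] = sub[dy]
        placements.append((index, x, y))
        x += w
        shelf_height = max(shelf_height, h)
    return frame, placements


# Four 2x2 sub-images (standing in for SROIa..SROId) packed into a 4x4 frame.
subs = [[[n, n], [n, n]] for n in (1, 2, 3, 4)]
simg, where = allocate(subs, frame_width=4, frame_height=4)
```

The `ValueError` branch corresponds to the case where the divided images do not fit in one frame and would be spread over several interest images SIMG.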

The image encoding unit 40 includes the first encoding unit 42, a second encoding unit 44, and a stream synthesis unit 46. The first encoding unit 42 reads the reduced image data RIMG from the memory 50 and encodes it to generate a compressed stream CST1. For example, the first encoding unit 42 encodes the reduced image data RIMG as base-view frames with an encoding method compliant with H.264 MVC. The first encoding unit 42 then outputs the compressed stream CST1 generated by encoding the reduced image data RIMG to the stream synthesis unit 46.

The first encoding unit 42 writes the local decoded image data LDEC1 generated during the encoding of the reduced image data RIMG to the memory 50. For example, the first encoding unit 42 uses the local decoded image data LDEC1 as a reference image as needed when encoding the reduced image data RIMG.

The second encoding unit 44 reads the interest image data SIMG from the memory 50 and encodes it to generate a compressed stream CST2. For example, the second encoding unit 44 encodes the interest image data SIMG as non-base-view frames with an encoding method compliant with H.264 MVC. The second encoding unit 44 then outputs the compressed stream CST2 generated by encoding the interest image data SIMG to the stream synthesis unit 46.

The second encoding unit 44 writes local decoded image data LDEC2 generated during the encoding of the interest image data SIMG to the memory 50. For example, the second encoding unit 44 uses the local decoded image data LDEC2 as a reference image as needed when encoding the interest image data SIMG. The local decoded image data LDEC2 is image data obtained by decoding the encoded data of the interest image data SIMG.

The second encoding unit 44 also uses the local decoded image data LDEC1 as a reference image for inter-view prediction as needed when encoding the interest image data SIMG. For example, the second encoding unit 44 reads the local decoded image data LDEC1 from the memory 50 when performing inter-view prediction.

The stream synthesis unit 46 receives the compressed stream CST1 generated by the first encoding unit 42 and the compressed stream CST2 generated by the second encoding unit 44, and combines the compressed streams CST1 and CST2 in a format compliant with H.264 MVC. The stream synthesis unit 46 then outputs the compressed stream STRM obtained by combining the compressed streams CST1 and CST2 to the stream storage unit 100 shown in FIG. 1 or the like.

The configuration of the moving image encoding device 10 is not limited to this example. For example, the stream synthesis unit 46 may receive the compressed streams CST1 and CST2 from the first encoding unit 42 and the second encoding unit 44 via the memory 50. In that case, the first encoding unit 42 writes the compressed stream CST1 to the memory 50, and the second encoding unit 44 writes the compressed stream CST2 to the memory 50.

FIG. 3 shows an example of the division of the region-of-interest image ROI for the case where the reduction rate of the reduced image RIMG is 1/2 in both the horizontal and vertical directions. The region-of-interest image ROI shown on the left side of FIG. 3 is the region of interest ROI cut out from the image IIMG by the cutout unit 34. Pixels Pa, Pb, Pc, and Pd denote the pixels P sampled at the four sampling phases A, B, C, and D, respectively. In FIG. 3, the position of each pixel P is described taking the top row of the region-of-interest image ROI shown on the left as an odd-numbered row and the leftmost column as an odd-numbered column.

例えば、関心領域画像ＲＯＩの奇数行でかつ奇数列の画素Ｐａは、サンプリング位相Ａでサンプリングされる。これにより、サンプリング位相Ａの関心領域画像ＳＲＯＩａが生成される。すなわち、画像合成部３６は、関心領域画像ＲＯＩの画素Ｐ（Ｐａ）をサンプリング位相Ａで間引きサンプリングすることにより、関心領域画像ＲＯＩを水平方向および垂直方向に１／２倍した関心領域画像ＳＲＯＩａを生成する。 For example, the pixels Pa in the odd rows and the odd columns of the region-of-interest image ROI are sampled at the sampling phase A. Thereby, the region-of-interest image SROIa of the sampling phase A is generated. In other words, the image composition unit 36 performs sampling by sampling pixels P (Pa) of the region of interest image ROI at the sampling phase A, thereby generating the region of interest image SROIa that is ½ times the region of interest image ROI in the horizontal direction and the vertical direction. Generate.

関心領域画像ＲＯＩの奇数行でかつ偶数列の画素Ｐｂは、例えば、サンプリング位相Ｂでサンプリングされる。これにより、サンプリング位相Ｂの関心領域画像ＳＲＯＩｂが生成される。すなわち、画像合成部３６は、関心領域画像ＲＯＩの画素Ｐ（Ｐｂ）をサンプリング位相Ｂで間引きサンプリングすることにより、関心領域画像ＲＯＩを水平方向および垂直方向に１／２倍した関心領域画像ＳＲＯＩｂを生成する。 The pixels Pb in the odd rows and even columns in the region of interest image ROI are sampled at the sampling phase B, for example. Thereby, the region-of-interest image SROIb having the sampling phase B is generated. That is, the image compositing unit 36 performs sampling by sampling pixels P (Pb) of the region of interest image ROI at the sampling phase B, thereby generating the region of interest image SROIb obtained by halving the region of interest image ROI in the horizontal direction and the vertical direction. Generate.

The pixels Pc in the even rows and odd columns of the region-of-interest image ROI are sampled at, for example, sampling phase C, which generates the region-of-interest image SROIc of sampling phase C. That is, the image composition unit 36 subsamples the pixels P (Pc) of the region-of-interest image ROI at sampling phase C to generate the region-of-interest image SROIc, which is the region-of-interest image ROI reduced to 1/2 in the horizontal and vertical directions.

The pixels Pd in the even rows and even columns of the region-of-interest image ROI are sampled at, for example, sampling phase D, which generates the region-of-interest image SROId of sampling phase D. That is, the image composition unit 36 subsamples the pixels P (Pd) of the region-of-interest image ROI at sampling phase D to generate the region-of-interest image SROId, which is the region-of-interest image ROI reduced to 1/2 in the horizontal and vertical directions.

In this way, when the reduction ratio of the reduced image RIMG is 1/2 in both the horizontal and vertical directions, the image composition unit 36 divides the region-of-interest image ROI into four region-of-interest images SROIa, SROIb, SROIc, and SROId. The thinning rate used for this division corresponds to the reduction ratio of the reduced image RIMG. As a result, in this embodiment the region-of-interest images SROIa, SROIb, SROIc, and SROId can have the same resolution as the reduced image RIMG.
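The four-phase division described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the apparatus itself: the function name `split_into_phases` and the use of nested lists for pixel data are hypothetical choices for this example, and only the 1/2 reduction ratio is handled.

```python
def split_into_phases(roi):
    """Split a 2-D pixel array into four half-resolution phase images.

    Index 0 corresponds to the "odd" (first) row or column of the text.
    """
    # (row offset, column offset) of each sampling phase
    offsets = {"A": (0, 0), "B": (0, 1), "C": (1, 0), "D": (1, 1)}
    return {
        name: [row[dx::2] for row in roi[dy::2]]
        for name, (dy, dx) in offsets.items()
    }


# 4x4 test image whose pixel value encodes its position (10 * row + column).
roi = [[10 * r + c for c in range(4)] for r in range(4)]
phases = split_into_phases(roi)
```

Each of the four outputs has half the width and half the height of the input, and together they contain every input pixel exactly once.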

FIG. 4 shows an example of allocating the region-of-interest image ROI when the reduction ratio of the reduced image RIMG is 1/2 in both the horizontal and vertical directions. The broken lines in FIG. 4 indicate macroblock (16 × 16 pixel) boundaries. In the example of FIG. 4, two regions of interest ROI1 and ROI2 are detected from the input image IIMG. The regions of interest RROI1 and RROI2 in the reduced image RIMG correspond to reduced versions of the regions of interest ROI1 and ROI2 in the input image IIMG, respectively.

The regions of interest SROI1a, SROI1b, SROI1c, and SROI1d in the image of interest SIMG are the images obtained by sampling the region of interest ROI1, cut out from the input image IIMG, at sampling phases A, B, C, and D. Likewise, the regions of interest SROI2a, SROI2b, SROI2c, and SROI2d in the image of interest SIMG are the images obtained by sampling the region of interest ROI2, cut out from the input image IIMG, at sampling phases A, B, C, and D.

Each region of interest ROI (ROI1, ROI2, and so on) is cut out from the input image IIMG so that, for example, the region-of-interest images SROI of each sampling phase (SROI1a to SROI1d, SROI2a to SROI2d, and so on) can be processed in units of macroblocks. For example, the region of interest ROI1 is cut out from the input image IIMG so that the size of each of the region-of-interest images SROI1a, SROI1b, SROI1c, and SROI1d is a multiple of the macroblock size. Similarly, the region of interest ROI2 is cut out from the input image IIMG so that the size of each of the region-of-interest images SROI2a, SROI2b, SROI2c, and SROI2d is a multiple of the macroblock size.
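One way to satisfy this size constraint is to grow a detected rectangle outward until each phase image is a whole number of macroblocks. The sketch below assumes a 1/2 reduction ratio and 16 × 16 macroblocks; the function name `align_roi` is hypothetical, aligning the rectangle's origin as well as its size is an extra assumption of this example, and clamping to the image border is omitted.

```python
def align_roi(x, y, w, h, factor=2, mb=16):
    """Expand rectangle (x, y, w, h) so both sides are multiples of factor * mb.

    After subsampling by `factor`, each phase image is then a multiple of
    the macroblock size `mb` in width and height.
    """
    unit = factor * mb
    x0 = (x // unit) * unit                 # round the left edge down
    y0 = (y // unit) * unit                 # round the top edge down
    x1 = -(-(x + w) // unit) * unit         # round the right edge up
    y1 = -(-(y + h) // unit) * unit         # round the bottom edge up
    return x0, y0, x1 - x0, y1 - y0


aligned = align_roi(70, 40, 50, 60)
```

With these inputs the rectangle grows to 64 × 96 pixels, so each phase image is 32 × 48 pixels, that is, 2 × 3 macroblocks.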

In the example of FIG. 4, the region of interest RROI1 in the reduced image RIMG and the regions of interest SROI1a, SROI1b, SROI1c, and SROI1d in the image of interest SIMG are each 32 × 32 pixels. The region of interest RROI2 in the reduced image RIMG and the regions of interest SROI2a, SROI2b, SROI2c, and SROI2d in the image of interest SIMG are each 48 × 48 pixels.

The region-of-interest images SROI obtained by dividing a region of interest ROI are placed together in, for example, one image of interest SIMG so that they form the image data of one fine-detail region of interest ROI. For example, in the block where the region-of-interest images SROI1 are placed together, the region-of-interest images SROI1a, SROI1b, SROI1c, and SROI1d of sampling phases A, B, C, and D are placed at the upper left, upper right, lower left, and lower right, respectively.
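The 2 × 2 block layout of FIG. 4 (phase A upper left, B upper right, C lower left, D lower right) can be illustrated as follows. The helper name `tile_phases` and the nested-list pixel representation are assumptions made for this sketch.

```python
def tile_phases(a, b, c, d):
    """Place four equally sized phase images into one 2x2 block."""
    top = [row_a + row_b for row_a, row_b in zip(a, b)]      # A | B
    bottom = [row_c + row_d for row_c, row_d in zip(c, d)]   # C | D
    return top + bottom


# Phase images of a 4x4 region whose pixel values are 10 * row + column.
sroi_a = [[0, 2], [20, 22]]
sroi_b = [[1, 3], [21, 23]]
sroi_c = [[10, 12], [30, 32]]
sroi_d = [[11, 13], [31, 33]]
block = tile_phases(sroi_a, sroi_b, sroi_c, sroi_d)
```

The resulting block has the same pixel count as the original region; only the pixel order differs, which is what allows each quadrant to resemble the corresponding area of the reduced image.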

In this way, the interest image generation unit 30 divides the region of interest ROI cut out from the input image IIMG and allocates the resulting images within one image of interest SIMG. That is, the interest image generation unit 30 allocates the plurality of region-of-interest images SROI1 to a common image of interest SIMG and generates the image-of-interest data SIMG corresponding to the image of interest SIMG to which they are allocated.

When a plurality of regions of interest ROI are detected, the blocks that collect the region-of-interest images SROI of each divided region of interest ROI (the block of SROI1a to SROI1d, the block of SROI2a to SROI2d, and so on) are placed, for example, in order from the upper left, one block per region-of-interest image ROI. That is, even when a plurality of regions of interest ROI are detected, the region-of-interest images SROI obtained by dividing each region of interest ROI are allocated within one image of interest SIMG so that they form image data that displays each region of interest ROI in fine detail.

Further, when, for example, the total size of the detected regions of interest ROI exceeds the size of the image of interest SIMG, the interest image generation unit 30 assigns a priority to each region of interest ROI. The interest image generation unit 30 then allocates the divided region-of-interest images SROI within the image of interest SIMG in descending order of priority. A region of interest ROI that could not be allocated within the image of interest SIMG (a region of interest ROI with relatively low priority) is excluded from the candidates for regions of interest ROI that can be decoded in fine detail.
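The priority rule can be sketched as a greedy selection. The text does not say how priorities are determined or how blocks are geometrically packed, so this example makes loud assumptions: priorities are given as integers with 1 as the highest, and the only fit test is a pixel-area budget. The function name `select_rois` and the dictionary fields are hypothetical.

```python
def select_rois(rois, simg_area):
    """Pick regions in priority order while their total area fits in SIMG.

    Assumption: a lower `priority` number means a higher priority, and a
    region fits if the summed area stays within `simg_area` (no real
    2-D packing is attempted).
    """
    chosen, used = [], 0
    for roi in sorted(rois, key=lambda r: r["priority"]):
        area = roi["w"] * roi["h"]  # equals the total area of its phase images
        if used + area <= simg_area:
            chosen.append(roi["name"])
            used += area
    return chosen


rois = [
    {"name": "ROI1", "w": 64, "h": 64, "priority": 2},
    {"name": "ROI2", "w": 96, "h": 96, "priority": 1},
    {"name": "ROI3", "w": 64, "h": 64, "priority": 3},
]
selected = select_rois(rois, simg_area=128 * 128)
```

Here ROI3 is left out because the two higher-priority regions already consume most of the budget, so it would only be available at the resolution of the reduced image.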

In this embodiment, the arrangement of the region-of-interest images SROI allocated by the interest image generation unit 30 is uniquely determined by the detected regions of interest ROI. The moving image reproduction apparatus 110 can therefore recognize the arrangement of the region-of-interest images SROI in the image of interest SDIMG by, for example, detecting the regions of interest RROI (RROI1, RROI2, and so on) in the reduced image RDIMG (the image obtained by decoding the encoded data of the reduced image data RIMG).

The area of the image of interest SIMG outside the region-of-interest images SROI is not needed on the decoding side, so any image may be inserted there. For example, that area may be a copy of the image at the corresponding position in the reduced image RIMG, or it may be a fixed-color (single-color) image. In this embodiment, filling the area outside the region-of-interest images SROI with, for example, a copy of the image at the corresponding position in the reduced image RIMG raises the compression rate of the image data in that area.

Also, in this embodiment, as described with reference to FIG. 3, the region-of-interest images SROI are generated by thinning out the pixels of the region-of-interest image ROI at a thinning rate corresponding to the reduction ratio of the reduced image RIMG. Consequently, the size and resolution of each region-of-interest image SROI become comparable to those of the reduced image RIMG. By contrast, if the region of interest ROI in the input image IIMG were copied as is into the image of interest SIMG, the size and resolution of the region of interest would differ from those of the reduced image RIMG, so referring to the reduced image RIMG could not be expected to improve the compression rate.

In this embodiment, by contrast, the size and resolution of each region-of-interest image SROI are, as described above, the same as those of the corresponding region of interest RROI in the reduced image RIMG. This raises the probability that inter-view prediction between the image of interest SIMG and the reduced image RIMG succeeds. As a result, the compression rate when encoding the region-of-interest images SROI can be increased, and the growth in data amount can be suppressed.

Furthermore, in this embodiment the positional relationship between each region-of-interest image SROI in the image of interest SIMG and each region of interest RROI in the reduced image RIMG is known, so the search range used when obtaining motion vectors can be narrowed. The motion vector search can therefore be executed efficiently.

The arrangement of the region-of-interest images SROI of sampling phases A, B, C, and D is not limited to this example. For example, the region-of-interest images SROI1a, SROI1b, SROI1c, and SROI1d of sampling phases A, B, C, and D may be arranged in order from left to right. Also, for example, when the reduced image RIMG is generated by pixel thinning, the region-of-interest image SROI whose sampling phase corresponds to the sampling phase used to generate the reduced image RIMG need not be allocated within the image of interest SIMG.

Alternatively, the image composition unit 36 may select at least one sampling phase from the plurality of sampling phases calculated according to the reduction ratio of the reduced image RIMG and allocate only the region-of-interest images SROI of the selected sampling phases within the image of interest SIMG. This reduces the data amount compared with allocating the region-of-interest images SROI of all the sampling phases calculated according to the reduction ratio of the reduced image RIMG.

In a moving image processing system SYS that selects which region-of-interest images SROI to allocate within the image of interest SIMG, the moving image reproduction apparatus 110, for example, interpolates the pixels P corresponding to the unselected sampling phases from the pixels P corresponding to the selected sampling phases. Even in this case, more image information about the region of interest ROI is available than when only the reduced image RIMG is used, so the region of interest ROI can be displayed in higher definition than when only the reduced image RIMG is used.
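As a rough illustration of this playback-side interpolation, the sketch below rebuilds a full-resolution region from a subset of phase images and fills each missing pixel with the mean of its known horizontal and vertical neighbors. The text does not specify an interpolation filter, so this neighbor average is an assumption, as are the function name `reconstruct` and the keying of phases by (row offset, column offset). The sketch assumes the transmitted subset leaves every missing pixel with at least one known neighbor, as is the case when phases A and D are sent.

```python
def reconstruct(phases, height, width):
    """Rebuild a full-resolution region from some of its four phase images.

    `phases` maps (row offset, column offset) to a half-resolution image.
    Missing pixels are set to the mean of their known 4-neighbors.
    """
    known = [[None] * width for _ in range(height)]
    for (dy, dx), img in phases.items():
        for r, row in enumerate(img):
            for c, value in enumerate(row):
                known[2 * r + dy][2 * c + dx] = value

    out = [row[:] for row in known]
    for y in range(height):
        for x in range(width):
            if known[y][x] is None:
                nbrs = [
                    known[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < height and 0 <= nx < width
                    and known[ny][nx] is not None
                ]
                out[y][x] = sum(nbrs) / len(nbrs)
    return out


# Phases A and D of a 4x4 ramp image whose pixel value is row + column.
phase_a = [[0, 2], [2, 4]]
phase_d = [[2, 4], [4, 6]]
full = reconstruct({(0, 0): phase_a, (1, 1): phase_d}, 4, 4)
```

For a smooth ramp like this one, interior pixels are recovered exactly; border pixels have fewer known neighbors and are only approximated.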

FIG. 5 shows another example of allocating the region-of-interest image ROI, again for the case where the reduction ratio of the reduced image RIMG is 1/2 in both the horizontal and vertical directions. The broken lines in FIG. 5 have the same meaning as in FIG. 4. The input image IIMG and the reduced image RIMG are the same as in the example of FIG. 4; for example, two regions of interest ROI1 and ROI2 are detected from the input image IIMG.

In the example of FIG. 5, the region-of-interest images SROI of the plurality of sampling phases calculated according to the reduction ratio of the reduced image RIMG are allocated to a plurality of images of interest SIMG that share a common time. The images of interest SIMGa, SIMGb, SIMGc, and SIMGd are images of interest SIMG at a common time. For example, the images of interest SIMGa, SIMGb, SIMGc, and SIMGd are each encoded as frames of mutually different non-base-view sequences.

For example, the region-of-interest images SROI1a, SROI1b, SROI1c, and SROI1d of the four sampling phases are allocated to the four images of interest SIMGa, SIMGb, SIMGc, and SIMGd, respectively. The region-of-interest image SROI1 in each image of interest SIMG is placed, for example, at the position corresponding to the region of interest RROI1 in the reduced image RIMG (for example, at the same coordinates). Similarly, the region-of-interest images SROI2a, SROI2b, SROI2c, and SROI2d are allocated to the images of interest SIMGa, SIMGb, SIMGc, and SIMGd, respectively, at the position corresponding to the region of interest RROI2 in the reduced image RIMG (for example, at the same coordinates).

In this way, the interest image generation unit 30 allocates the region-of-interest images SROI of the plurality of sampling phases to the images of interest SIMG, one phase per image. The area of each image of interest SIMG outside the region-of-interest images SROI is, for example, a copy of the image at the corresponding position in the reduced image RIMG. In that area, the difference values and motion vectors of inter-view prediction that uses the local decoded image LDEC1 (the image data obtained by decoding the encoded data of the reduced image data RIMG) as the reference image can then be made almost zero. The area outside the region-of-interest images SROI may instead be, for example, a fixed-color (single-color) image. In this case too, the compression rate of the image data in the area outside the region-of-interest images SROI can be increased.
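The FIG. 5 layout, with one phase per image of interest and a copy of the reduced image as background, can be sketched as below. The function name `build_phase_view` and the nested-list image representation are assumptions for illustration only.

```python
def build_phase_view(reduced, phase_img, top, left):
    """Return a copy of `reduced` with `phase_img` pasted at (top, left).

    Outside the pasted area the result equals the reduced image, so an
    inter-view residual against the reduced image is zero there.
    """
    view = [row[:] for row in reduced]              # background: copy of RIMG
    for r, row in enumerate(phase_img):
        view[top + r][left:left + len(row)] = row   # same coordinates as RROI
    return view


reduced = [[0] * 4 for _ in range(4)]
phase_b = [[9, 9], [9, 9]]
simg_b = build_phase_view(reduced, phase_b, top=1, left=1)

# Residual between the new view and the reduced image it was built from.
residual = [[s - r for s, r in zip(srow, rrow)]
            for srow, rrow in zip(simg_b, reduced)]
```

Calling the function once per phase with the same `top` and `left` would yield the four images SIMGa to SIMGd of the figure.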

Thus, the method shown in FIG. 5 provides the same effects as the method shown in FIG. 4. For example, it also raises the probability that inter-view prediction between the image of interest SIMG and the reduced image RIMG succeeds. It also allows the search range used when obtaining motion vectors to be narrowed, so the motion vector search can be executed efficiently.

Furthermore, in the method shown in FIG. 5, the region-of-interest images SROI are allocated to the image of interest SIMG so that the coordinates (position) of each region-of-interest image SROI in the image of interest SIMG equal the coordinates (position) of the corresponding region of interest RROI in the reduced image RIMG. The method shown in FIG. 5 can therefore reduce the code amount of the motion vectors compared with the method shown in FIG. 4.

The arrangement of the region-of-interest images SROI of sampling phases A, B, C, and D is not limited to this example. For example, when the reduced image RIMG is generated by pixel thinning, the region-of-interest image SROI whose sampling phase corresponds to the sampling phase used to generate the reduced image RIMG need not be allocated within an image of interest SIMG. Compared with generating an image of interest SIMG for every sampling phase calculated according to the reduction ratio of the reduced image RIMG, this reduces the number of images of interest SIMG at a common time and thus the data amount of the compressed stream STRM.

Also, for example, the image composition unit 36 may select at least one sampling phase from the plurality of sampling phases calculated according to the reduction ratio of the reduced image RIMG and allocate only the region-of-interest images SROI of the selected sampling phases within images of interest SIMG. In this case too, compared with generating an image of interest SIMG for every sampling phase calculated according to the reduction ratio of the reduced image RIMG, the number of images of interest SIMG at a common time, and thus the data amount of the compressed stream STRM, can be reduced.

FIG. 6 shows another example of allocating the region-of-interest image ROI, again for the case where the reduction ratio of the reduced image RIMG is 1/2 in both the horizontal and vertical directions. Each frame time t (t0, t1, t2, t3, ...) in FIG. 6 indicates, for example, the time associated with the reduced image RIMG and the like. In FIG. 6, the reduced image RIMG and the image of interest SIMG corresponding to each frame time t (t0, t1, t2, t3, ...) have the trailing digit of the frame time's symbol appended to their own symbols. For example, the reduced image RIMG0 and the image of interest SIMG0 are the reduced image RIMG and the image of interest SIMG at frame time t0.

The input image IIMG is omitted from FIG. 6. For example, the input image IIMG at the time corresponding to the reduced image RIMG0 and the image of interest SIMG0 (frame time t0) is the same as the input image IIMG shown in FIG. 4. Accordingly, the reduced image RIMG0 is the same as the reduced image RIMG shown in FIG. 4. The broken lines in FIG. 6 have the same meaning as in FIG. 4. In the images of interest SIMG1, SIMG2, and SIMG3 of FIG. 6, the broken lines are omitted to keep the figure readable.

In the example of FIG. 6, one image of interest SIMG is divided into a plurality of areas AR (AR0 to AR3), and the areas AR correspond to a plurality of frames (images) of the reduced image RIMG at different times. For example, the region-of-interest images SROI of the plurality of sampling phases calculated according to the reduction ratio of the reduced image RIMG are allocated to a predetermined area AR of a plurality of images of interest SIMG at different frame times t. That is, the interest image generation unit 30 allocates the region-of-interest images SROI of the plurality of sampling phases, one phase at a time, to a plurality of images of interest SIMG at different frame times t.

For example, the upper-left area AR0 of the image of interest SIMG is allocated the region-of-interest images SROI (SROI10a and so on) obtained by dividing the region of interest ROI in the input images IIMG at frame times t0, t4, t8, and so on. The upper-right area AR1 of the image of interest SIMG is allocated, for example, the region-of-interest images SROI (SROI11a and so on) obtained by dividing the region of interest ROI in the input images IIMG at frame times t1, t5, t9, and so on.

The lower-left area AR2 of the image of interest SIMG is allocated, for example, the region-of-interest images SROI (SROI12a and so on) obtained by dividing the region of interest ROI in the input images IIMG at frame times t2, t6, t10, and so on. The lower-right area AR3 of the image of interest SIMG is allocated, for example, the region-of-interest images SROI (SROI13a and so on) obtained by dividing the region of interest ROI in the input images IIMG at frame times t3, t7, t11, and so on.

That is, the region-of-interest images SROI10a and SROI20a of sampling phase A, obtained by dividing the regions of interest ROI of the input image IIMG at frame time t0, are allocated to the area AR0 of the image of interest SIMG0 at frame time t0. The region-of-interest images SROI10b and SROI20b of sampling phase B are allocated to the area AR0 of the image of interest SIMG1 at frame time t1. The region-of-interest images SROI10c and SROI20c of sampling phase C are allocated to the area AR0 of the image of interest SIMG2 at frame time t2. The region-of-interest images SROI10d and SROI20d of sampling phase D are allocated to the area AR0 of the image of interest SIMG3 at frame time t3.

Similarly, the region-of-interest images SROI of sampling phases A, B, C, and D obtained by dividing the regions of interest ROI of the input image IIMG at frame time t1 are allocated to the area AR1 of the images of interest SIMG at frame times t1 to t4, respectively. The region-of-interest images SROI of sampling phases A, B, C, and D obtained from the input image IIMG at frame time t2 are allocated to the area AR2 of the images of interest SIMG at frame times t2 to t5, respectively. The region-of-interest images SROI of sampling phases A, B, C, and D obtained from the input image IIMG at frame time t3 are allocated to the area AR3 of the images of interest SIMG at frame times t3 to t6, respectively.
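The time-staggered schedule of FIG. 6 follows a simple pattern: phase number k (A = 0, ..., D = 3) of the input frame at time t is carried by the image of interest at time t + k, in area AR(t mod 4). The sketch below encodes that pattern; the function names `placement` and `contents` are hypothetical, and frame times are represented as plain integers.

```python
PHASES = "ABCD"


def placement(t, phase):
    """Return (frame time of the carrying SIMG, area name) for one phase."""
    k = PHASES.index(phase)
    return t + k, f"AR{t % 4}"


def contents(simg_time):
    """Map each area of the SIMG at `simg_time` to (input frame time, phase)."""
    return {
        f"AR{(simg_time - k) % 4}": (simg_time - k, PHASES[k])
        for k in range(4)
        if simg_time - k >= 0   # no input frames exist before time 0
    }


simg3 = contents(3)
```

From frame time t3 onward, every image of interest carries one phase from each of the four most recent input frames, so the full-detail region for a given input frame becomes available three frames after that frame.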

The area of the image of interest SIMG at each frame time t outside the region-of-interest images SROI may be, for example, a copy of the image at the corresponding position in the reduced image RIMG at that frame time t, or a fixed-color (single-color) image. The method shown in FIG. 6 also provides the same effects as the method shown in FIG. 4. For example, it also raises the probability that inter-view prediction between the image of interest SIMG and the reduced image RIMG succeeds. It also allows the search range used when obtaining motion vectors to be narrowed, so the motion vector search can be executed efficiently.

Furthermore, in the method shown in FIG. 6, inter-frame prediction (inter prediction) of the image of interest SIMG can also refer to region-of-interest images SROI of different sampling phases generated from a common input image IIMG. For example, when the region-of-interest image SROI10b of the image of interest SIMG1 is encoded, the region-of-interest image SROI10a of the image of interest SIMG0 can be referenced. The method shown in FIG. 6 can therefore select whichever of inter-view prediction and inter-frame prediction gives the higher compression rate, which improves the compression rate.
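The choice between the two prediction modes can be illustrated with a simple cost comparison. The text does not state how the encoder measures which prediction compresses better; the sum of absolute differences (SAD) used below is a common stand-in and is an assumption of this sketch, as are the function names.

```python
def sad(block, ref):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block, ref)
               for a, b in zip(row_a, row_b))


def choose_prediction(block, inter_view_ref, inter_frame_ref):
    """Pick the reference with the smaller residual (inter-view wins ties)."""
    costs = {
        "inter-view": sad(block, inter_view_ref),
        "inter-frame": sad(block, inter_frame_ref),
    }
    return min(costs, key=costs.get), costs


block = [[10, 12], [14, 16]]            # e.g. part of SROI10b in SIMG1
from_reduced = [[9, 12], [15, 18]]      # co-located block of the reduced image
from_prev_phase = [[10, 12], [14, 15]]  # e.g. part of SROI10a in SIMG0
mode, costs = choose_prediction(block, from_reduced, from_prev_phase)
```

A real encoder would typically weigh the bits needed for motion vectors as well as the residual, which this sketch ignores.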

The arrangement of the region-of-interest images SROI of sampling phases A, B, C, and D is not limited to this example. For example, when the reduced image RIMG is generated by pixel thinning, the region-of-interest image SROI whose sampling phase corresponds to the sampling phase used to generate the reduced image RIMG need not be allocated within the image of interest SIMG. Also, for example, the image composition unit 36 may select at least one sampling phase from the plurality of sampling phases calculated according to the reduction ratio of the reduced image RIMG and allocate only the region-of-interest images SROI of the selected sampling phases within the image of interest SIMG.

FIG. 7 shows an example of the frame reference relationships during encoding, for the case of encoding with a scheme compliant with H.264 MVC. The letters I and P in parentheses in the figure denote an I frame and a P frame, respectively. The moving image encoding apparatus 10, for example, processes the reduced image RIMG as a base-view frame and processes the image of interest SIMG as a non-base-view frame.

An I frame of the reduced image RIMG, which is processed as the base view, is encoded using, for example, intra-frame prediction (intra prediction). A P frame of the reduced image RIMG is encoded using inter-frame prediction. A P frame of the image of interest SIMG, which is processed as a non-base view, is encoded using, for example, inter-frame prediction that refers to a non-base-view frame or inter-view prediction that refers to the corresponding base-view frame. By using inter-view prediction, the moving image encoding apparatus 10 can encode the non-base-view image of interest SIMG without using I frames, which have a large code amount. As a result, this embodiment can suppress the increase in the data amount of the compressed stream STRM.

図８は、図２に示した動画像符号化装置１０の動作の一例を示している。図８の動作は、ハードウエアのみで実現されてもよく、ハードウエアをソフトウエアにより制御することにより実現されてもよい。例えば、動画像処理プログラム等のソフトウエアは、コンピュータに図８の動作を実行させる。 FIG. 8 shows an example of the operation of the moving picture encoding apparatus 10 shown in FIG. The operation of FIG. 8 may be realized only by hardware, or may be realized by controlling the hardware by software. For example, software such as a moving image processing program causes a computer to execute the operation of FIG.

処理Ｓ１００では、動画像符号化装置１０は、入力画像ＩＩＭＧを取得する。例えば、動画像符号化装置１０は、デジタルビデオカメラ等により撮影された画像ＩＩＭＧの画像データＩＩＭＧをメモリ５０に順次記憶する。 In process S100, the moving image encoding device 10 acquires the input image IIMG. For example, the moving image encoding device 10 sequentially stores in the memory 50 the image data IIMG of the image IIMG taken by a digital video camera or the like.

処理Ｓ１１０では、縮小画像生成部２０は、入力画像ＩＩＭＧの縮小画像ＲＩＭＧを生成する。例えば、縮小画像生成部２０は、画像データＩＩＭＧをメモリ５０から読み出し、画像ＩＩＭＧを予め設定された縮小率で縮小して縮小画像データＲＩＭＧを生成する。縮小画像データＲＩＭＧは、例えば、メモリ５０に記憶される。 In process S110, the reduced image generation unit 20 generates a reduced image RIMG of the input image IIMG. For example, the reduced image generation unit 20 reads the image data IIMG from the memory 50, reduces the image IIMG at a preset reduction rate, and generates reduced image data RIMG. The reduced image data RIMG is stored in the memory 50, for example.

処理Ｓ１２０では、第１符号化部４２は、処理Ｓ１１０で生成された縮小画像ＲＩＭＧをベースビューのフレームとして符号化する。例えば、第１符号化部４２は、縮小画像データＲＩＭＧをメモリ５０から読み出し、縮小画像データＲＩＭＧを符号化して圧縮ストリームＣＳＴ１を生成する。なお、縮小画像データＲＩＭＧの符号化処理中に生成されたローカルデコード画像データＬＤＥＣ１は、例えば、メモリ５０に記憶される。メモリ５０に記憶されたローカルデコード画像データＬＤＥＣ１は、必要に応じて参照画像として使用される。例えば、第１符号化部４２は、フレーム間予測を実行するとき、ローカルデコード画像データＬＤＥＣ１をメモリ５０から読み出す。 In the process S120, the first encoding unit 42 encodes the reduced image RIMG generated in the process S110 as a frame of the base view. For example, the first encoding unit 42 reads the reduced image data RIMG from the memory 50, encodes the reduced image data RIMG, and generates a compressed stream CST1. Note that the local decoded image data LDEC1 generated during the encoding process of the reduced image data RIMG is stored in the memory 50, for example. The local decoded image data LDEC1 stored in the memory 50 is used as a reference image as necessary. For example, the first encoding unit 42 reads the local decoded image data LDEC1 from the memory 50 when performing inter-frame prediction.

処理Ｓ１３０では、関心領域検出部３２は、処理Ｓ１２０の符号化処理中に生成されたローカルデコード画像ＬＤＥＣ１を用いて、関心領域を検出する。例えば、関心領域検出部３２は、ローカルデコード画像データＬＤＥＣ１をメモリ５０から読み出し、顔検出のアルゴリズム等を用いて関心領域を検出する。 In process S130, the region-of-interest detection unit 32 detects a region of interest using the local decoded image LDEC1 generated during the encoding process in step S120. For example, the region of interest detection unit 32 reads the local decoded image data LDEC1 from the memory 50, and detects the region of interest using a face detection algorithm or the like.

処理Ｓ１４０では、切り出し部３４は、処理Ｓ１３０の検出結果に基づいて、入力画像ＩＩＭＧから関心領域ＲＯＩを切り出す。例えば、切り出し部３４は、入力画像データＩＩＭＧをメモリ５０から読み出し、処理Ｓ１３０で検出された関心領域に対応する関心領域画像データＲＯＩを入力画像データＩＩＭＧから抽出する。なお、切り出し部３４は、図２で説明したように、関心領域画像データＲＯＩを入力画像データＩＩＭＧから抽出する際、処理Ｓ１３０で検出された関心領域ＲＯＩに関する情報（座標等）を入力画像ＩＩＭＧの画像サイズに合わせた値に補正している。すなわち、切り出し部３４は、処理Ｓ１３０の検出結果に基づいて、入力画像ＩＩＭＧ内の関心領域ＲＯＩの位置を算出する。 In the process S140, the cutout unit 34 cuts out the region of interest ROI from the input image IIMG based on the detection result of the process S130. For example, the cutout unit 34 reads the input image data IIMG from the memory 50, and extracts the region-of-interest image data ROI corresponding to the region of interest detected in the process S130 from the input image data IIMG. As described with reference to FIG. 2, when extracting the region-of-interest image data ROI from the input image data IIMG, the cutout unit 34 corrects the information (coordinates and the like) on the region of interest ROI detected in the process S130 to values that match the image size of the input image IIMG. That is, the cutout unit 34 calculates the position of the region of interest ROI in the input image IIMG based on the detection result of the process S130.
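
The coordinate correction in the process S140 can be sketched as follows, assuming that the reduced image is obtained with an integer decimation factor (2 for a reduction ratio of 1/2) and that the region of interest is an axis-aligned rectangle. The names `Rect` and `scale_roi_to_input` are hypothetical and are used only for this illustration.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle: top-left corner (x, y), width w, height h."""
    x: int
    y: int
    w: int
    h: int


def scale_roi_to_input(roi: Rect, factor: int) -> Rect:
    """Map a rectangle detected in the reduced image to input-image coordinates.

    factor is the integer decimation factor assumed for this sketch
    (factor=2 corresponds to a 1/2 reduction in each direction).
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return Rect(roi.x * factor, roi.y * factor, roi.w * factor, roi.h * factor)
```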

処理Ｓ１５０では、画像合成部３６は、処理Ｓ１４０で切り出された関心領域ＲＯＩを複数のサンプリング位相の関心領域画像ＳＲＯＩに分割し、分割した関心領域画像ＳＲＯＩを合成する。これにより、関心領域画像ＳＲＯＩの合成画像である関心画像ＳＩＭＧが生成される。例えば、画像合成部３６は、図４−図６に示した関心領域画像ＲＯＩの割り付け方法の１つを用いて、関心画像ＳＩＭＧを生成する。なお、画像合成部３６は、図４−図６に示した関心領域画像ＲＯＩの割り付け方法以外の方法を用いて、関心画像ＳＩＭＧを生成してもよい。関心領域画像ＳＲＯＩは、例えば、メモリ５０に記憶される。 In the process S150, the image composition unit 36 divides the region of interest ROI cut out in the process S140 into the region of interest images SROI having a plurality of sampling phases, and synthesizes the divided region of interest image SROI. As a result, an interest image SIMG that is a composite image of the region of interest image SROI is generated. For example, the image composition unit 36 generates the interest image SIMG using one of the region of interest image ROI allocation methods illustrated in FIGS. 4 to 6. The image composition unit 36 may generate the interest image SIMG using a method other than the region of interest image ROI allocation method illustrated in FIGS. 4 to 6. The region-of-interest image SROI is stored in the memory 50, for example.
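
The division into sampling phases and the composition in the process S150 can be sketched as follows. This is only one possible layout, in which the phase images are tiled side by side in a single mosaic. The function names, the use of nested lists as images, and the assumption that the ROI width and height are multiples of the decimation factor are all introduced here for illustration. The correspondence between the offsets `(dy, dx)` and the phases A, B, C, and D is not asserted.

```python
def split_phases(roi, factor=2):
    """Split an ROI (list of pixel rows) into factor*factor sampling phases.

    Returns a dict keyed by the sampling offset (dy, dx). Each value is the
    sub-image made of the pixels at rows dy, dy+factor, ... and columns
    dx, dx+factor, ...
    """
    return {
        (dy, dx): [row[dx::factor] for row in roi[dy::factor]]
        for dy in range(factor)
        for dx in range(factor)
    }


def tile_phases(phases, factor=2):
    """Tile the phase images into one mosaic (a hypothetical SIMG layout)."""
    mosaic = []
    for dy in range(factor):
        blocks = [phases[(dy, dx)] for dx in range(factor)]
        for r in range(len(blocks[0])):
            row = []
            for block in blocks:
                row.extend(block[r])
            mosaic.append(row)
    return mosaic
```

Each phase image has the same sampling density as the reduced image, which is what allows the reduced image to serve as a reference for inter-view prediction.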

処理Ｓ１６０では、第２符号化部４４は、処理Ｓ１５０で生成された合成画像（関心画像ＳＩＭＧ）を非ベースビューのフレームとして符号化する。例えば、第２符号化部４４は、関心画像データＳＩＭＧをメモリ５０から読み出し、関心画像データＳＩＭＧを符号化して圧縮ストリームＣＳＴ２を生成する。なお、関心画像データＳＩＭＧの符号化処理中に生成されたローカルデコード画像データＬＤＥＣ２は、例えば、メモリ５０に記憶される。 In the process S160, the second encoding unit 44 encodes the composite image (interest image SIMG) generated in the process S150 as a non-base view frame. For example, the second encoding unit 44 reads out the interest image data SIMG from the memory 50, encodes the interest image data SIMG, and generates a compressed stream CST2. Note that the local decoded image data LDEC2 generated during the encoding process of the interest image data SIMG is stored in the memory 50, for example.

メモリ５０に記憶されたローカルデコード画像データＬＤＥＣ１、ＬＤＥＣ２は、必要に応じて参照画像として使用される。例えば、第２符号化部４４は、非ベースビューのフレームを参照するフレーム間予測を実行するとき、ローカルデコード画像データＬＤＥＣ２をメモリ５０から読み出す。また、例えば、第２符号化部４４は、ビュー間予測を実行するとき、ローカルデコード画像データＬＤＥＣ１をメモリ５０から読み出す。 The local decoded image data LDEC1 and LDEC2 stored in the memory 50 are used as reference images as necessary. For example, the second encoding unit 44 reads the local decoded image data LDEC2 from the memory 50 when executing inter-frame prediction with reference to a non-base view frame. For example, the second encoding unit 44 reads the local decoded image data LDEC1 from the memory 50 when performing inter-view prediction.

処理Ｓ１７０では、ストリーム合成部４６は、処理Ｓ１２０で生成された圧縮ストリームＣＳＴ１と処理Ｓ１６０で生成された圧縮ストリームＣＳＴ２とを合成し、圧縮ストリームＳＴＲＭを生成する。すなわち、圧縮ストリームＳＴＲＭは、縮小画像データＲＩＭＧの符号化データと関心画像データＳＩＭＧの符号化データとを有している。 In process S170, the stream synthesis unit 46 synthesizes the compressed stream CST1 generated in process S120 and the compressed stream CST2 generated in process S160 to generate a compressed stream STRM. That is, the compressed stream STRM includes encoded data of reduced image data RIMG and encoded data of interest image data SIMG.

動画像符号化装置１０は、処理Ｓ１００−Ｓ１７０を繰り返すことにより、入力画像ＩＩＭＧを順次符号化する。なお、圧縮ストリームＳＴＲＭは、例えば、図１に示したストリーム記憶部１００に記憶される。ストリーム記憶部１００に記憶された圧縮ストリームＳＴＲＭは、例えば、図１に示した動画像再生装置１１０により再生される。 The moving image encoding device 10 sequentially encodes the input image IIMG by repeating the processes S100 to S170. Note that the compressed stream STRM is stored in, for example, the stream storage unit 100 illustrated in FIG. The compressed stream STRM stored in the stream storage unit 100 is reproduced by, for example, the moving image reproduction apparatus 110 illustrated in FIG.

なお、動画像符号化装置１０の動作は、この例に限定されない。例えば、関心画像生成部３０は、関心領域ＲＯＩを複数の関心領域画像ＳＲＯＩに分割する処理を、入力画像ＩＩＭＧのフレーム間隔（例えば、図６に示したフレーム時刻ｔ０からフレーム時刻ｔ１までの間隔）より大きい間隔で実行してもよい。すなわち、関心領域検出部３２は、例えば、関心領域の検出をフレーム間隔より大きい間隔で実行してもよい。また、切り出し部３４は、関心領域ＲＯＩの切り出し処理（入力画像ＩＩＭＧから関心領域ＲＯＩを切り出す処理）を、フレーム間隔より大きい間隔で実行してもよい。 Note that the operation of the moving image encoding device 10 is not limited to this example. For example, the interest image generation unit 30 may execute the process of dividing the region of interest ROI into a plurality of region-of-interest images SROI at an interval larger than the frame interval of the input image IIMG (for example, the interval from the frame time t0 to the frame time t1 illustrated in FIG. 6). That is, the region-of-interest detection unit 32 may, for example, execute detection of a region of interest at an interval larger than the frame interval. In addition, the cutout unit 34 may execute the region of interest ROI cutout processing (processing of cutting out the region of interest ROI from the input image IIMG) at an interval larger than the frame interval.

関心領域ＲＯＩの切り出し処理が実行されないフレームの関心画像ＳＩＭＧは、例えば、前のフレームの関心画像ＳＩＭＧと同様である。なお、関心領域ＲＯＩの切り出し処理が実行されないフレームの関心画像ＳＩＭＧは、前のフレームの関心画像ＳＩＭＧと同一でもよい。このため、切り出し処理の実行間隔がフレーム間隔より大きい動画像符号化装置１０では、関心画像データＳＩＭＧの符号化データを含む圧縮ストリームＣＳＴ２のデータ量を、切り出し処理をフレーム毎に実行するときに比べて低減できる。 The interest image SIMG of a frame in which the region of interest ROI cutout processing is not executed is, for example, similar to the interest image SIMG of the previous frame. Note that the interest image SIMG of such a frame may be identical to the interest image SIMG of the previous frame. For this reason, in the moving image encoding device 10 in which the execution interval of the cutout processing is larger than the frame interval, the data amount of the compressed stream CST2 including the encoded data of the interest image data SIMG can be reduced compared with when the cutout processing is executed for each frame.
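
The case in which the interest image of the previous frame is reused unchanged can be sketched as follows. The function `interest_images`, its parameter `interval` (counted in frames), and the callable `make_simg` (standing in for detection, cutout, and composition) are assumptions of this sketch.

```python
def interest_images(frames, interval, make_simg):
    """Produce one interest image per frame, regenerating it every `interval` frames.

    make_simg is a caller-supplied function that builds an interest image
    from an input frame. On the frames in between, the interest image of
    the previous frame is reused as is.
    """
    if interval < 1:
        raise ValueError("interval must be >= 1")
    simgs = []
    current = None
    for index, frame in enumerate(frames):
        if index % interval == 0:
            current = make_simg(frame)
        simgs.append(current)
    return simgs
```

When consecutive interest images are identical, inter-frame prediction within the non-base view leaves almost no residual to encode, which is consistent with the reduction in data amount described above.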

図９は、図１に示した動画像再生装置１１０の一例を示している。図９の例では、画像データＲＤＩＭＧ等の転送は、メモリ１４０を介して実行される。例えば、動画像再生装置１１０は、画像復号部１２０、表示画像生成部１３０およびメモリ１４０を有している。なお、メモリ１４０は、画像復号部１２０等のモジュール内に設けられてもよいし、動画像再生装置１１０の外部に設けられてもよい。 FIG. 9 shows an example of the moving image playback device 110 shown in FIG. In the example of FIG. 9, transfer of image data RDIMG or the like is executed via the memory 140. For example, the moving image reproduction device 110 includes an image decoding unit 120, a display image generation unit 130, and a memory 140. Note that the memory 140 may be provided in a module such as the image decoding unit 120 or may be provided outside the moving image reproduction apparatus 110.

画像復号部１２０は、例えば、ストリーム分離部１２２、第１復号部１２４および第２復号部１２６を有している。ストリーム分離部１２２は、例えば、図１に示したストリーム記憶部１００から圧縮ストリームＳＴＲＭを受ける。そして、ストリーム分離部１２２は、圧縮ストリームＳＴＲＭを、ベースビュー用の圧縮ストリームＣＳＴ１と非ベースビュー用の圧縮ストリームＣＳＴ２とに分離する。圧縮ストリームＣＳＴ１は、縮小画像データＲＩＭＧの符号化データを有し、圧縮ストリームＣＳＴ２は、関心画像データＳＩＭＧの符号化データを有している。 The image decoding unit 120 includes, for example, a stream separation unit 122, a first decoding unit 124, and a second decoding unit 126. For example, the stream separation unit 122 receives the compressed stream STRM from the stream storage unit 100 illustrated in FIG. Then, the stream separation unit 122 separates the compressed stream STRM into a compressed stream CST1 for base view and a compressed stream CST2 for non-base view. The compressed stream CST1 has encoded data of reduced image data RIMG, and the compressed stream CST2 has encoded data of interest image data SIMG.

第１復号部１２４は、圧縮ストリームＣＳＴ１をストリーム分離部１２２から受け、圧縮ストリームＣＳＴ１を復号する。例えば、第１復号部１２４は、Ｈ．２６４に準拠したベースビューの復号処理を圧縮ストリームＣＳＴ１に対して実行する。これにより、圧縮ストリームＣＳＴ１が復号され、縮小画像データＲＤＩＭＧが生成される。例えば、第１復号部１２４は、縮小画像データＲＤＩＭＧをメモリ１４０に書き込む。 The first decoding unit 124 receives the compressed stream CST1 from the stream separation unit 122, and decodes the compressed stream CST1. For example, the first decoding unit 124 executes H.264-compliant base view decoding processing on the compressed stream CST1. As a result, the compressed stream CST1 is decoded, and reduced image data RDIMG is generated. For example, the first decoding unit 124 writes the reduced image data RDIMG in the memory 140.

縮小画像データＲＤＩＭＧは、縮小画像データＲＩＭＧの符号化データを復号したデコード画像データＬＤＥＣ１であり、表示画像として使用される。以下、縮小画像データＲＤＩＭＧをデコード画像データＬＤＥＣ１とも称する。デコード画像データＬＤＥＣ１は、復号処理の参照画像として必要に応じて使用される。例えば、第１復号部１２４は、復号処理の参照画像としてデコード画像データＬＤＥＣ１を使用するとき、デコード画像データＬＤＥＣ１をメモリ１４０から読み出す。 The reduced image data RDIMG is decoded image data LDEC1 obtained by decoding the encoded data of the reduced image data RIMG, and is used as a display image. Hereinafter, the reduced image data RDIMG is also referred to as decoded image data LDEC1. The decoded image data LDEC1 is used as necessary as a reference image for decoding processing. For example, the first decoding unit 124 reads the decoded image data LDEC1 from the memory 140 when using the decoded image data LDEC1 as a reference image for decoding processing.

第２復号部１２６は、圧縮ストリームＣＳＴ２をストリーム分離部１２２から受ける。そして、第２復号部１２６は、例えば、関心領域の精細表示の指示等があるとき、圧縮ストリームＣＳＴ２を復号する。例えば、第２復号部１２６は、Ｈ．２６４に準拠した非ベースビューの復号処理を圧縮ストリームＣＳＴ２に対して実行する。これにより、圧縮ストリームＣＳＴ２が復号され、関心画像データＳＤＩＭＧが生成される。例えば、第２復号部１２６は、関心画像データＳＤＩＭＧをメモリ１４０に書き込む。 The second decoding unit 126 receives the compressed stream CST2 from the stream separation unit 122. Then, the second decoding unit 126 decodes the compressed stream CST2 when there is an instruction for fine display of the region of interest, for example. For example, the second decoding unit 126 executes H.264-compliant non-base view decoding processing on the compressed stream CST2. Thereby, the compressed stream CST2 is decoded, and the interest image data SDIMG is generated. For example, the second decoding unit 126 writes the interest image data SDIMG in the memory 140.

関心画像データＳＤＩＭＧは、関心画像データＳＩＭＧの符号化データを復号したデコード画像データＬＤＥＣ２であり、関心領域を精細に表示するときに使用される。以下、関心画像データＳＤＩＭＧをデコード画像データＬＤＥＣ２とも称する。デコード画像データＬＤＥＣ２は、復号処理の参照画像として必要に応じて使用される。例えば、第２復号部１２６は、復号処理の参照画像としてデコード画像データＬＤＥＣ２を使用するとき、デコード画像データＬＤＥＣ２をメモリ１４０から読み出す。 The interest image data SDIMG is decoded image data LDEC2 obtained by decoding the encoded data of the interest image data SIMG, and is used when the region of interest is displayed finely. Hereinafter, the interest image data SDIMG is also referred to as decoded image data LDEC2. The decoded image data LDEC2 is used as necessary as a reference image for decoding processing. For example, the second decoding unit 126 reads the decoded image data LDEC2 from the memory 140 when using the decoded image data LDEC2 as a reference image for decoding processing.

表示画像生成部１３０は、例えば、関心領域検出部１３２、切り出し部１３４および表示画像合成部１３６を有している。関心領域検出部１３２は、例えば、関心領域の精細表示の指示等があるとき、縮小画像データＲＤＩＭＧをメモリ１４０から読み出す。そして、関心領域検出部１３２は、読み出した縮小画像データＲＤＩＭＧを用いて関心領域を検出する。すなわち、関心領域検出部１３２は、関心領域の精細表示の指示等があるとき、関心領域を検出する。 The display image generation unit 130 includes, for example, a region of interest detection unit 132, a cutout unit 134, and a display image synthesis unit 136. The region-of-interest detection unit 132 reads the reduced image data RDIMG from the memory 140 when there is an instruction for fine display of the region of interest, for example. Then, the region of interest detection unit 132 detects the region of interest using the read reduced image data RDIMG. That is, the region-of-interest detection unit 132 detects the region of interest when there is an instruction for fine display of the region of interest.

例えば、関心領域検出部１３２は、動画像符号化装置１０の関心領域検出部３２で実行される関心領域の検出アルゴリズムを用いて、関心領域を検出する。そして、関心領域検出部１３２は、検出した関心領域に関する関心領域情報ＣＩＮＦ（関心領域の個数、位置、サイズ等）を、切り出し部１３４および表示画像合成部１３６に通知する。 For example, the region of interest detection unit 132 detects the region of interest using the region of interest detection algorithm executed by the region of interest detection unit 32 of the moving image encoding device 10. Then, the region-of-interest detection unit 132 notifies the cutout unit 134 and the display image synthesis unit 136 of the region-of-interest information CINF (number of regions of interest, position, size, etc.) regarding the detected region of interest.

ここで、関心領域検出部１３２および関心領域検出部３２で使用される関心領域の検出アルゴリズムでは、検出される関心領域の個数、位置およびサイズは、検出対象の画像に対して一意的に決まる。例えば、縮小画像データＲＤＩＭＧは、縮小画像データＲＩＭＧの符号化データを復号した画像データであるため、関心領域検出部３２で使用されるローカルデコード画像データＬＤＥＣ１と同様の画像データである。 Here, in the region-of-interest detection algorithm used by the region-of-interest detection unit 132 and the region-of-interest detection unit 32, the number, position, and size of the region of interest to be detected are uniquely determined with respect to the detection target image. For example, since the reduced image data RDIMG is image data obtained by decoding the encoded data of the reduced image data RIMG, the reduced image data RDIMG is the same image data as the local decoded image data LDEC1 used in the region of interest detection unit 32.

したがって、関心領域検出部１３２は、動画像符号化装置１０の関心領域検出部３２で検出された関心領域と同様の関心領域を検出する。このため、この実施形態では、動画像符号化装置１０は、関心領域の位置情報等を圧縮ストリームＳＴＲＭ内に埋め込まなくてもよい。 Therefore, the region of interest detection unit 132 detects a region of interest similar to the region of interest detected by the region of interest detection unit 32 of the moving image encoding device 10. For this reason, in this embodiment, the moving image encoding apparatus 10 does not have to embed the position information of the region of interest or the like in the compressed stream STRM.
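
The reasoning above relies on the detection result being uniquely determined by the image on which detection runs. A minimal sketch follows, using a toy threshold-based detector as a stand-in for the face detection algorithm. The function `detect_bbox` and the thresholding rule are assumptions for illustration only and are not the algorithm of the specification.

```python
def detect_bbox(image, threshold):
    """Toy deterministic detector.

    Returns the bounding box (x, y, w, h) of all pixels whose value is at
    least `threshold`, or None if no pixel qualifies. The result depends
    only on the input image and the threshold.
    """
    coords = [
        (x, y)
        for y, row in enumerate(image)
        for x, value in enumerate(row)
        if value >= threshold
    ]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)
```

If the encoder runs such a detector on the local decoded image LDEC1 and the reproduction side runs it on a decoded image RDIMG with the same pixel values, both obtain the same region-of-interest information, so no position information needs to be carried in the stream.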

切り出し部１３４は、例えば、関心領域の精細表示の指示等があるとき、関心画像データＳＤＩＭＧをメモリ１４０から読み出す。そして、切り出し部１３４は、指示された関心領域に対応する関心領域画像ＳＲＯＩを、関心画像ＳＤＩＭＧから関心領域情報ＣＩＮＦに基づいて切り出す。この実施形態では、動画像符号化装置１０の関心画像生成部３０により割り付けられる関心領域画像ＳＲＯＩの配置は、検出された関心領域に対して一意的に決まる。このため、切り出し部１３４は、例えば、関心領域情報ＣＩＮＦを用いることにより、関心画像ＳＤＩＭＧ内の関心領域画像ＳＲＯＩの配置を認識できる。 The cutout unit 134 reads out the interest image data SDIMG from the memory 140 when, for example, there is an instruction for fine display of the region of interest. Then, the cutout unit 134 cuts out the region-of-interest image SROI corresponding to the designated region of interest from the interest image SDIMG based on the region-of-interest information CINF. In this embodiment, the arrangement of the region-of-interest images SROI allocated by the interest image generation unit 30 of the moving image encoding device 10 is uniquely determined for the detected region of interest. Therefore, the cutout unit 134 can recognize the arrangement of the region-of-interest images SROI in the interest image SDIMG, for example, by using the region-of-interest information CINF.

表示画像合成部１３６は、縮小画像データＲＤＩＭＧをメモリ１４０から読み出す。そして、表示画像合成部１３６は、関心領域の精細表示の指示等がないとき（通常時）、縮小画像データＲＤＩＭＧを表示画像データＯＩＭＧとして出力する。また、表示画像合成部１３６は、関心領域の精細表示の指示等があるとき、縮小画像データＲＤＩＭＧと切り出し部１３４から受けた関心領域画像データＳＲＯＩとを関心領域情報ＣＩＮＦ等に基づいて合成して、表示画像データＯＩＭＧを生成する。これにより、注目する関心領域（ユーザに指示された関心領域）を精細に表示するための画像データを含む表示画像データＯＩＭＧが生成される。 The display image composition unit 136 reads the reduced image data RDIMG from the memory 140. The display image composition unit 136 outputs the reduced image data RDIMG as the display image data OIMG when there is no instruction for fine display of the region of interest (normal time). Further, when there is an instruction for fine display of the region of interest, the display image composition unit 136 synthesizes the reduced image data RDIMG and the region of interest image data SROI received from the clipping unit 134 based on the region of interest information CINF and the like. Display image data OIMG is generated. Thereby, display image data OIMG including image data for finely displaying a region of interest of interest (region of interest designated by the user) is generated.

例えば、表示画像合成部１３６は、関心領域の精細表示の指示等があるとき、切り出し部１３４で切り出された関心領域の関心領域画像データＳＲＯＩを受ける。そして、表示画像合成部１３６は、注目する関心領域（ユーザに指示された関心領域）の関心領域画像データＲＯＩを、切り出し部１３４から受けた各サンプリング位相の関心領域画像データＳＲＯＩを用いて生成する。 For example, the display image composition unit 136 receives the region-of-interest image data SROI of the region of interest cut out by the cutout unit 134 when there is an instruction for fine display of the region of interest. Then, the display image composition unit 136 generates the region-of-interest image data ROI of the region of interest being focused on (the region of interest designated by the user) using the region-of-interest image data SROI of each sampling phase received from the cutout unit 134.

これにより、縮小画像ＲＤＩＭＧより精細な関心領域画像ＲＯＩが生成される。関心領域画像データＳＲＯＩに基づいて生成された関心領域画像データＲＯＩは、縮小画像データＲＤＩＭＧと合成され、表示画像データＯＩＭＧに含まれる。これにより、この実施形態では、指示された関心領域ＲＯＩを縮小画像ＲＩＭＧより精細に表示できる。 As a result, a region of interest image ROI that is finer than the reduced image RDIMG is generated. The region-of-interest image data ROI generated based on the region-of-interest image data SROI is combined with the reduced image data RDIMG and included in the display image data OIMG. Thereby, in this embodiment, the designated region of interest ROI can be displayed more finely than the reduced image RIMG.
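
Generating the region-of-interest image ROI from the region-of-interest images SROI of each sampling phase amounts to interleaving the phase images. A minimal sketch follows, assuming that all phases are available, that images are nested lists, and that each phase is keyed by its sampling offset `(dy, dx)`. The function name `merge_phases` and this keying are hypothetical.

```python
def merge_phases(phases, factor=2):
    """Interleave factor*factor phase images back into one full-resolution ROI.

    phases maps a sampling offset (dy, dx) to the sub-image that holds the
    pixels originally located at rows dy, dy+factor, ... and columns
    dx, dx+factor, ...
    """
    phase_h = len(phases[(0, 0)])
    phase_w = len(phases[(0, 0)][0])
    roi = [[None] * (phase_w * factor) for _ in range(phase_h * factor)]
    for (dy, dx), image in phases.items():
        for r, row in enumerate(image):
            for c, value in enumerate(row):
                roi[r * factor + dy][c * factor + dx] = value
    return roi
```

If only some phases were allocated in the interest image, the missing positions would have to be filled by another means, for example from the reduced image; that case is outside this sketch.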

例えば、表示画像合成部１３６は、縮小画像ＲＩＭＧの水平方向および垂直方向のそれぞれの縮小率が１／２倍のとき、縮小画像ＲＤＩＭＧを水平方向および垂直方向に２倍した拡大画像と関心領域画像データＳＲＯＩから生成した関心領域画像ＲＯＩとを合成して表示画像ＯＩＭＧを生成する。なお、表示画像合成部１３６は、関心領域画像データＳＲＯＩから生成した関心領域画像ＲＯＩ（縮小画像ＲＤＩＭＧに対して拡大された関心領域画像ＲＯＩ）を縮小画像ＲＤＩＭＧの任意の位置に配置して、表示画像ＯＩＭＧを生成してもよい。 For example, when the reduction ratio of the reduced image RIMG is 1/2 in each of the horizontal and vertical directions, the display image composition unit 136 generates the display image OIMG by combining an enlarged image, obtained by doubling the reduced image RDIMG in the horizontal and vertical directions, with the region-of-interest image ROI generated from the region-of-interest image data SROI. Note that the display image composition unit 136 may generate the display image OIMG by arranging the region-of-interest image ROI generated from the region-of-interest image data SROI (the region-of-interest image ROI enlarged relative to the reduced image RDIMG) at an arbitrary position in the reduced image RDIMG.
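
The composition for a reduction ratio of 1/2 can be sketched as follows. The enlargement method is not specified in the text; nearest-neighbour repetition is assumed here purely for simplicity. The function names `upscale_nearest` and `paste` are hypothetical.

```python
def upscale_nearest(image, factor=2):
    """Enlarge an image by repeating every pixel factor times in each direction."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def paste(canvas, patch, x, y):
    """Return a copy of canvas with patch written at top-left position (x, y)."""
    if y + len(patch) > len(canvas) or x + len(patch[0]) > len(canvas[0]):
        raise ValueError("patch does not fit inside canvas")
    out = [list(row) for row in canvas]
    for r, row in enumerate(patch):
        out[y + r][x:x + len(row)] = row
    return out
```

For example, the enlarged reduced image serves as the background, and the full-resolution region-of-interest image is pasted at the position of the region of interest scaled to the enlarged coordinates.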

図１０は、図９に示した動画像再生装置１１０の動作の一例を示している。図１０の動作は、ハードウエアのみで実現されてもよく、ハードウエアをソフトウエアにより制御することにより実現されてもよい。例えば、動画像処理プログラム等のソフトウエアは、コンピュータに図１０の動作を実行させる。 FIG. 10 shows an example of the operation of the moving image playback apparatus 110 shown in FIG. The operation of FIG. 10 may be realized only by hardware, or may be realized by controlling the hardware by software. For example, software such as a moving image processing program causes a computer to execute the operation of FIG.

処理Ｓ２００では、動画像再生装置１１０は、圧縮ストリームＳＴＲＭを取得する。例えば、ストリーム分離部１２２は、図１に示したストリーム記憶部１００から圧縮ストリームＳＴＲＭを受ける。 In process S200, the moving image playback device 110 acquires the compressed stream STRM. For example, the stream separation unit 122 receives the compressed stream STRM from the stream storage unit 100 illustrated in FIG.

処理Ｓ２１０では、ストリーム分離部１２２は、処理Ｓ２００で取得した圧縮ストリームＳＴＲＭを、ベースビュー用の圧縮ストリームＣＳＴ１と非ベースビュー用の圧縮ストリームＣＳＴ２とに分離する。 In process S210, the stream separation unit 122 separates the compressed stream STRM acquired in process S200 into a compressed stream CST1 for base view and a compressed stream CST2 for non-base view.

処理Ｓ２２０では、第１復号部１２４は、処理Ｓ２１０で分離された圧縮ストリームＣＳＴ１に対して、例えば、Ｈ．２６４に準拠したベースビューの復号処理を実行する。これにより、縮小画像データＲＩＭＧの符号化データが復号され、縮小画像データＲＤＩＭＧが生成される。縮小画像データＲＤＩＭＧは、例えば、メモリ１４０に記憶される。 In the process S220, the first decoding unit 124 executes, for example, H.264-compliant base view decoding processing on the compressed stream CST1 separated in the process S210. Thereby, the encoded data of the reduced image data RIMG is decoded, and reduced image data RDIMG is generated. The reduced image data RDIMG is stored in the memory 140, for example.

処理Ｓ２３０では、表示画像合成部１３６は、処理Ｓ２２０で復号された縮小画像データＲＤＩＭＧを表示画像データＯＩＭＧとして出力する。これにより、縮小画像データＲＤＩＭＧに基づく表示画像ＯＩＭＧが表示される。なお、表示画像ＯＩＭＧを表示する表示装置は、動画像再生装置１１０の一部として設けられてもよいし、動画像再生装置１１０の外部に設けられてもよい。 In process S230, the display image composition unit 136 outputs the reduced image data RDIMG decoded in process S220 as display image data OIMG. Thereby, the display image OIMG based on the reduced image data RDIMG is displayed. The display device that displays the display image OIMG may be provided as a part of the moving image reproduction device 110 or may be provided outside the moving image reproduction device 110.

処理Ｓ２４０では、動画像再生装置１１０は、関心領域の精細表示要求があるか否かを判定する。例えば、ユーザは、関心領域を精細に表示したいときに、動画像再生装置１１０のユーザインターフェース等を用いて、精細表示要求を動画像再生装置１１０に通知する。これにより、表示画像生成部１３０等は、関心領域の精細表示要求があるとき、ユーザインターフェース等から精細表示要求を受ける。 In process S240, the moving image playback device 110 determines whether or not there is a request for fine display of the region of interest. For example, when the user wants to display the region of interest in detail, the user notifies the moving image playback device 110 of a fine display request using the user interface of the video playback device 110 or the like. Thereby, the display image generation unit 130 or the like receives a fine display request from the user interface or the like when there is a fine display request for the region of interest.

関心領域の精細表示要求がないとき（処理Ｓ２４０のＮｏ）、動画像再生装置１１０は、一連の処理を終了する。すなわち、ベースビュー用の圧縮ストリームＣＳＴ１（縮小画像データＲＩＭＧの符号化データ）に対しては、復号処理が常に実行される。そして、ユーザからの精細表示要求がないときには、縮小画像ＲＤＩＭＧに基づく画像のみが表示される。一方、関心領域の精細表示要求があるとき（処理Ｓ２４０のＹｅｓ）、動画像再生装置１１０の動作は、処理Ｓ２５０に移る。 When there is no request for fine display of the region of interest (No in process S240), the moving image reproducing apparatus 110 ends the series of processes. That is, the decoding process is always performed on the compressed stream CST1 for base view (encoded data of the reduced image data RIMG). When there is no fine display request from the user, only an image based on the reduced image RDIMG is displayed. On the other hand, when there is a request for fine display of the region of interest (Yes in process S240), the operation of the moving image reproduction apparatus 110 proceeds to process S250.

処理Ｓ２５０では、関心領域検出部１３２は、処理Ｓ２２０で復号された縮小画像データＲＤＩＭＧを用いて関心領域を検出する。なお、動画像再生装置１１０は、検出した関心領域を示すマーク等を、表示画像に反映してもよい。 In process S250, the region of interest detection unit 132 detects the region of interest using the reduced image data RDIMG decoded in process S220. Note that the moving image playback device 110 may reflect a mark or the like indicating the detected region of interest on the display image.

処理Ｓ２６０では、動画像再生装置１１０は、精細表示する関心領域の指定があるか否かを判定する。例えば、ユーザは、関心領域を精細に表示したいときに、動画像再生装置１１０のユーザインターフェース等を用いて、精細表示する関心領域を指定する。これにより、画像復号部１２０、表示画像生成部１３０等は、例えば、指定された関心領域を示す情報をユーザインターフェース等から受ける。関心領域指定がないとき（処理Ｓ２６０のＮｏ）、動画像再生装置１１０は、一連の処理を終了する。一方、関心領域指定があるとき（処理Ｓ２６０のＹｅｓ）、動画像再生装置１１０の動作は、処理Ｓ２７０に移る。 In step S260, the moving image playback device 110 determines whether there is a designation of a region of interest for fine display. For example, when the user wants to display the region of interest in detail, the user designates the region of interest to be displayed in detail using the user interface of the moving image reproduction apparatus 110 or the like. Accordingly, the image decoding unit 120, the display image generation unit 130, and the like receive, for example, information indicating the designated region of interest from the user interface or the like. When there is no region of interest designation (No in process S260), the moving image reproducing apparatus 110 ends a series of processes. On the other hand, when there is a region of interest designation (Yes in process S260), the operation of the moving image reproduction device 110 proceeds to process S270.

処理Ｓ２７０では、第２復号部１２６は、処理Ｓ２１０で分離された圧縮ストリームＣＳＴ２に対して、例えば、Ｈ．２６４に準拠した非ベースビューの復号処理を実行する。これにより、関心画像データＳＩＭＧの符号化データが復号され、関心画像データＳＤＩＭＧが生成される。すなわち、非ベースビュー用の圧縮ストリームＣＳＴ２（関心画像データＳＩＭＧの符号化データ）に対しては、関心領域指定があるとき、復号処理が実行される。関心画像データＳＤＩＭＧは、例えば、メモリ１４０に記憶される。 In the process S270, the second decoding unit 126 executes, for example, H.264-compliant non-base view decoding processing on the compressed stream CST2 separated in the process S210. Thereby, the encoded data of the interest image data SIMG is decoded, and the interest image data SDIMG is generated. That is, for the compressed stream CST2 for non-base view (encoded data of the interest image data SIMG), the decoding process is executed when a region of interest is designated. The interest image data SDIMG is stored in the memory 140, for example.

処理Ｓ２８０では、切り出し部１３４は、指定された関心領域に対応する関心領域画像ＳＲＯＩを、処理Ｓ２５０で検出された関心領域に関する関心領域情報ＣＩＮＦに基づいて、関心画像ＳＤＩＭＧから切り出す。 In the process S280, the cutout unit 134 cuts out the region of interest image SROI corresponding to the designated region of interest from the image of interest SDIMG based on the region of interest information CINF regarding the region of interest detected in the process S250.

処理Ｓ２９０では、処理Ｓ２８０で切り出された関心領域画像ＳＲＯＩと処理Ｓ２２０で復号された縮小画像ＲＤＩＭＧとを合成し、表示画像ＯＩＭＧを生成する。これにより、処理Ｓ３００において、他の領域より高精細な関心領域を含む合成画像（表示画像ＯＩＭＧ）が表示される。このように、この実施形態では、指定された関心領域ＲＯＩを縮小画像ＲＩＭＧより高精細に表示できる。例えば、この実施形態では、関心領域ＲＯＩを拡大して精細に表示できる。 In process S290, the region-of-interest image SROI cut out in process S280 and the reduced image RDIMG decoded in process S220 are combined to generate a display image OIMG. Thereby, in process S300, the synthesized image (display image OIMG) including the region of interest with higher definition than other regions is displayed. Thus, in this embodiment, the designated region of interest ROI can be displayed with higher definition than the reduced image RIMG. For example, in this embodiment, the region of interest ROI can be enlarged and displayed finely.
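
The branching of the processes S200 to S300 can be summarized in a minimal sketch that only lists which processes run for a given combination of user inputs. The function `playback_steps` and its two boolean parameters are assumptions introduced for this illustration.

```python
def playback_steps(fine_display_requested, roi_designated):
    """Return the ordered list of processes executed for one pass of playback."""
    # Acquisition, separation, base-view decoding, and display of the
    # reduced image always run, followed by the check in S240.
    steps = ["S200", "S210", "S220", "S230", "S240"]
    if not fine_display_requested:
        return steps  # No in S240: only the reduced image is displayed.
    # Yes in S240: detect regions of interest, then check the designation.
    steps += ["S250", "S260"]
    if not roi_designated:
        return steps  # No in S260: the non-base view is never decoded.
    # Yes in S260: decode the non-base view, cut out, compose, and display.
    steps += ["S270", "S280", "S290", "S300"]
    return steps
```

The sketch makes explicit that the non-base view decoding in the process S270 is reached only when both the fine display request and the designation of a region of interest are present.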

以上、この実施形態では、動画像処理システムＳＹＳは、入力画像ＩＩＭＧの縮小画像ＲＩＭＧと入力画像ＩＩＭＧ内の関心領域ＲＯＩを精細表示するための関心画像ＳＩＭＧとを符号化する動画像符号化装置１０を有している。例えば、動画像符号化装置１０は、図３で説明したように、入力画像ＩＩＭＧ内の関心領域ＲＯＩを縮小画像ＲＩＭＧの縮小率に対応する間引き率で間引いて、関心画像ＳＩＭＧに割り付ける。 As described above, in this embodiment, the moving image processing system SYS includes the moving image encoding device 10 that encodes the reduced image RIMG of the input image IIMG and the interest image SIMG for finely displaying the region of interest ROI in the input image IIMG. For example, as described with reference to FIG. 3, the moving image encoding device 10 thins out the region of interest ROI in the input image IIMG at a thinning rate corresponding to the reduction ratio of the reduced image RIMG, and allocates the result to the interest image SIMG.

そして、動画像符号化装置１０は、例えば、関心画像ＳＩＭＧを符号化する際に、縮小画像ＲＩＭＧを参照画像として必要に応じて使用する。これにより、この実施形態では、関心領域画像ＳＲＯＩを含む関心画像ＳＩＭＧを符号化する際の圧縮率を高くすることができ、データ量の増加を抑制できる。なお、関心領域ＲＯＩを分割した複数の関心領域画像ＳＲＯＩは、例えば、図４−図６で説明したように、１つの関心画像ＳＩＭＧに割り付けられてもよいし、複数の関心画像ＳＩＭＧにそれぞれ割り付けられてもよい。 For example, when encoding the image of interest SIMG, the moving image encoding device 10 uses the reduced image RIMG as a reference image as necessary. Thereby, in this embodiment, the compression rate at the time of encoding the interested image SIMG including the region-of-interest image SROI can be increased, and an increase in the data amount can be suppressed. Note that the plurality of region-of-interest images SROI obtained by dividing the region of interest ROI may be allocated to one region-of-interest image SIMG, for example, as described with reference to FIGS. May be.

例えば、この実施形態では、関心画像生成部３０は、図５に示すように、複数の関心領域画像ＳＲＯＩを共通の時刻の複数の関心画像ＳＩＭＧにそれぞれ割り付け、複数の関心画像ＳＩＭＧにそれぞれ対応する関心画像データＳＩＭＧを生成する。これにより、この実施形態では、動きベクトルの符号量を相対的に低減できる。 For example, in this embodiment, as shown in FIG. 5, the interest image generation unit 30 allocates a plurality of region-of-interest images SROI to a plurality of interest images SIMG at a common time, and generates interest image data SIMG corresponding to each of the plurality of interest images SIMG. Thereby, in this embodiment, the code amount of motion vectors can be relatively reduced.

また、例えば、この実施形態では、関心画像生成部３０は、図６に示すように、複数の関心領域画像ＳＲＯＩを互いに異なる時刻の複数の関心画像ＳＩＭＧにそれぞれ割り付け、複数の関心画像ＳＩＭＧにそれぞれ対応する関心画像データＳＩＭＧを生成してもよい。これにより、この実施形態では、関心画像ＳＩＭＧを符号化する際に、ビュー間予測とフレーム間予測とで圧縮率の高い方の予測を選択でき、圧縮率を向上できる。 Further, for example, in this embodiment, as shown in FIG. 6, the interest image generation unit 30 may allocate a plurality of region-of-interest images SROI to a plurality of interest images SIMG at mutually different times, and generate interest image data SIMG corresponding to each of the plurality of interest images SIMG. Thereby, in this embodiment, when encoding the interest image SIMG, whichever of inter-view prediction and inter-frame prediction gives the higher compression rate can be selected, and the compression rate can be improved.

さらに、この実施形態では、動画像処理システムＳＹＳは、動画像符号化装置１０で生成された圧縮ストリームＳＴＲＭを復号する動画像再生装置１１０を有している。動画像再生装置１１０は、例えば、関心領域の位置を算出する関心領域検出部１３２を有している。 Furthermore, in this embodiment, the moving image processing system SYS includes a moving image reproduction device 110 that decodes the compressed stream STRM generated by the moving image encoding device 10. The moving image reproduction apparatus 110 includes, for example, a region of interest detection unit 132 that calculates the position of the region of interest.

関心領域検出部１３２は、動画像符号化装置１０の関心領域検出部３２による関心領域ＲＯＩの位置の算出方法を用いて、入力画像ＩＩＭＧ内の関心領域ＲＯＩの位置を算出する。例えば、関心領域検出部１３２は、縮小画像データＲＩＭＧの符号化データを復号した縮小画像データＲＤＩＭＧから関心領域を検出し、検出結果に基づいて、入力画像ＩＩＭＧ内の関心領域ＲＯＩの位置を算出する。これにより、この実施形態では、動画像符号化装置１０は、関心領域の位置情報等を圧縮ストリームＳＴＲＭ内に埋め込まなくてもよい。この結果、この実施形態では、圧縮ストリームＳＴＲＭのデータ量を低減できる。 The region of interest detection unit 132 calculates the position of the region of interest ROI in the input image IIMG by using the method of calculating the position of the region of interest ROI by the region of interest detection unit 32 of the moving image encoding device 10. For example, the region of interest detection unit 132 detects a region of interest from the reduced image data RDIMG obtained by decoding the encoded data of the reduced image data RIMG, and calculates the position of the region of interest ROI in the input image IIMG based on the detection result. . Thereby, in this embodiment, the moving image encoding device 10 does not have to embed the position information of the region of interest or the like in the compressed stream STRM. As a result, in this embodiment, the data amount of the compressed stream STRM can be reduced.

FIG. 11 shows an example of a moving image encoding device 10A according to another embodiment. Elements similar to those described with reference to FIGS. 1 to 10 are denoted by the same reference numerals, and detailed description thereof is omitted. In this embodiment, the moving image processing system SYS includes, for example, the moving image encoding device 10A, the stream storage unit 100 shown in FIG. 1, and the moving image reproduction device 110A shown in FIG. 14. The moving image encoding device 10A includes an interest image generation unit 30A and an image encoding unit 40A instead of the interest image generation unit 30 and the image encoding unit 40 shown in FIG. 2. The other configurations of the moving image encoding device 10A are the same as those in the embodiment described with reference to FIGS. 1 to 10.

For example, the moving image encoding device 10A includes a reduced image generation unit 20, an interest image generation unit 30A, an image encoding unit 40A, and a memory 50. The reduced image generation unit 20 is the same as in the embodiment described with reference to FIGS. 1 to 10; for example, it is identical to the reduced image generation unit 20 described with reference to FIG. 2. The interest image generation unit 30A includes, for example, a region-of-interest detection unit 32A, a cutout unit 34, and an image composition unit 36. The region-of-interest detection unit 32A reads the image data IIMG from the memory 50 and detects the region of interest ROI using a face detection algorithm or the like.

That is, the region-of-interest detection unit 32A detects the region of interest ROI from the input image IIMG. The region-of-interest detection unit 32A then notifies the cutout unit 34, the image composition unit 36, and the image encoding unit 40A of region-of-interest information CINF about the detected region of interest ROI (the number, position, size, and the like of the regions of interest ROI). Note that the position (coordinates) and the like of the region of interest indicated by the region-of-interest information CINF are values referenced to the image IIMG.

The cutout unit 34 and the image composition unit 36 are the same as in the embodiment described with reference to FIGS. 1 to 10. Note that, because the values (coordinates and the like) of the region-of-interest information CINF are referenced to the image IIMG, the cutout unit 34 does not need to correct them.
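As a rough illustration of why no correction is needed, the sketch below models the region-of-interest information CINF as a small record and cuts the region straight out of the input image using the stored coordinates. The `Region` and `RoiInfo` types and the list-of-rows image representation are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Region:
    x: int
    y: int
    width: int
    height: int


@dataclass(frozen=True)
class RoiInfo:
    """Stand-in for CINF: number, position and size of the detected regions."""

    regions: Tuple[Region, ...]

    @property
    def count(self) -> int:
        return len(self.regions)


def cut_out(image: List[List[int]], region: Region) -> List[List[int]]:
    """Cut a region out of IIMG; coordinates are already in IIMG units."""
    rows = image[region.y : region.y + region.height]
    return [row[region.x : region.x + region.width] for row in rows]
```

For a 4x6 test image whose pixel at row r, column c holds 10*r + c, cutting the 2x2 region at (1, 1) returns the pixels 11, 12, 21 and 22, with no scaling step in between.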

The image encoding unit 40A includes a first encoding unit 42, a second encoding unit 44, and a stream synthesis unit 46A. The first encoding unit 42 and the second encoding unit 44 are the same as in the embodiment described with reference to FIGS. 1 to 10; for example, they are identical to the first encoding unit 42 and the second encoding unit 44 described with reference to FIG. 2.

The stream synthesis unit 46A receives the compressed stream CST1 generated by the first encoding unit 42 and the compressed stream CST2 generated by the second encoding unit 44, and also receives the region-of-interest information CINF from the region-of-interest detection unit 32A. The stream synthesis unit 46A then combines the compressed streams CST1 and CST2 in a format compliant with H.264 MVC. Further, the stream synthesis unit 46A embeds position information or the like of the region of interest ROI (for example, the region-of-interest information CINF) in the compressed stream STRM obtained by combining the compressed streams CST1 and CST2.

That is, the compressed stream STRM contains the encoded data of the reduced image data RIMG, the encoded data of the interest image data SIMG, and the position information of the region of interest ROI. In this way, the stream synthesis unit 46A outputs the compressed streams CST1 and CST2 and the position information of the region of interest ROI as a single compressed stream STRM to the stream storage unit 100 shown in FIG. 1 or the like.

Note that the configuration of the moving image encoding device 10A is not limited to this example. For example, the moving image encoding device 10A may include the functions of the moving image encoding device 10 shown in FIG. 2 and execute the operation of the moving image encoding device 10 as necessary.

FIG. 12 shows an example of the operation of the moving image encoding device 10A shown in FIG. 11. The operation of FIG. 12 may be realized by hardware alone, or by controlling hardware with software. For example, software such as a moving image processing program causes a computer to execute the operation of FIG. 12. In the operation of FIG. 12, processes S132 and S172 are executed instead of the processes S130 and S170 shown in FIG. 8. Detailed description of the processes already described with reference to FIG. 8 is omitted.

In process S132, the region-of-interest detection unit 32A detects the region of interest ROI using the input image IIMG acquired in process S100. For example, the region-of-interest detection unit 32A reads the input image data IIMG from the memory 50 and detects the region of interest ROI using a face detection algorithm or the like.

In process S172, the stream synthesis unit 46A combines the compressed stream CST1 generated in process S120 with the compressed stream CST2 generated in process S160 to generate the compressed stream STRM. Further, the stream synthesis unit 46A embeds position information or the like of the region of interest ROI detected in process S132 (for example, the region-of-interest information CINF) in the compressed stream STRM obtained by combining the compressed streams CST1 and CST2. As a result, a compressed stream STRM is generated that contains the position information of the region of interest ROI, the encoded data of the reduced image data RIMG, and the encoded data of the interest image data SIMG.

Note that the operation of the moving image encoding device 10A is not limited to this example. For example, the interest image generation unit 30A may execute the process of dividing the region of interest ROI into a plurality of region-of-interest images SROI at an interval larger than the frame interval of the input image IIMG (for example, the interval from frame time t0 to frame time t1 shown in FIG. 6). That is, the region-of-interest detection unit 32A may, for example, detect the region of interest ROI at an interval larger than the frame interval. Likewise, the cutout unit 34 may execute the process of cutting out the region of interest ROI at an interval larger than the frame interval. In this case, the data amount of the compressed stream CST2, which contains the encoded data of the interest image data SIMG, can be reduced compared with executing the cutout process for every frame.
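Running detection and cutout less often than once per frame can be sketched as a simple schedule. The sketch assumes the interval is expressed as a whole number of frames and that frames without a cutout contribute no region-of-interest data; `process_sparsely` and its `detect_and_cut` callback are hypothetical names.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

Frame = TypeVar("Frame")
Result = TypeVar("Result")


def process_sparsely(
    frames: Sequence[Frame],
    detect_and_cut: Callable[[Frame], Result],
    interval: int,
) -> List[Optional[Result]]:
    """Run detection and cutout only on every `interval`-th frame.

    Frames in between yield None, meaning no region-of-interest image is
    produced for them (and therefore nothing is added to the stream).
    """
    if interval < 1:
        raise ValueError("interval must be >= 1")
    results: List[Optional[Result]] = []
    for index, frame in enumerate(frames):
        if index % interval == 0:
            results.append(detect_and_cut(frame))
        else:
            results.append(None)
    return results
```

With `interval=1` the function degenerates to per-frame processing; larger values trade update frequency of the high-definition region for a smaller stream.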

FIG. 13 shows an example of the stream configuration, namely an example of the configuration of a compressed stream STRM compliant with H.264 MVC. The compressed stream STRM has at least one sequence SEQ. The sequence SEQ has at least one access unit AU. The access unit AU has various kinds of header information, namely an AUD (Access Unit Delimiter), an SPS (Sequence Parameter Set), a PPS (Picture Parameter Set), and SEI (Supplemental Enhancement Information), together with view components VC.

Note that the AUD, SPS, PPS, and SEI may be omitted. In this embodiment, the position information of the region of interest ROI is embedded in the SEI, for example, as one kind of user_data SEI. The position information of the region of interest ROI embedded in the SEI is, for example, the coordinates and size of the region of interest ROI in the input image IIMG and the coordinates of the region-of-interest image SROI in the interest image SIMG.
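To make the embedded position information concrete, the sketch below serializes the fields listed above into a fixed-size byte payload and reads them back. The byte layout (six unsigned 16-bit big-endian fields) and all names are hypothetical; neither this description nor H.264 defines this particular layout, and the wrapping of the payload into an actual SEI message is not shown.

```python
import struct
from dataclasses import dataclass

# Hypothetical layout: six unsigned 16-bit big-endian integers.
_ROI_FORMAT = ">6H"


@dataclass(frozen=True)
class RoiPosition:
    roi_x: int        # ROI coordinates in the input image IIMG
    roi_y: int
    roi_width: int    # ROI size in the input image IIMG
    roi_height: int
    sroi_x: int       # coordinates of SROI inside the interest image SIMG
    sroi_y: int


def pack_roi_position(pos: RoiPosition) -> bytes:
    """Serialize the position information into a user-data style payload."""
    return struct.pack(
        _ROI_FORMAT,
        pos.roi_x, pos.roi_y, pos.roi_width, pos.roi_height,
        pos.sroi_x, pos.sroi_y,
    )


def unpack_roi_position(payload: bytes) -> RoiPosition:
    """Recover the position information on the reproduction side."""
    return RoiPosition(*struct.unpack(_ROI_FORMAT, payload))
```

The payload is small and fixed in size per region, which is consistent with carrying it as side information next to the coded pictures.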

The base-view image data (the encoded data of the reduced image data RIMG) and the non-base-view image data (the encoded data of the interest image data SIMG) are embedded in respective view components VC. Each view component VC has, for example, a slice SL. The slice SL has a plurality of macroblocks MB.
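The containment hierarchy described for FIG. 13 (stream, sequence, access unit, view component, slice, macroblock) can be modeled as nested records, as in the sketch below. The class names and the use of opaque `bytes` for headers and macroblocks are assumptions for illustration; this is not a parser for real H.264 bitstreams.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Slice:
    macroblocks: List[bytes] = field(default_factory=list)


@dataclass
class ViewComponent:
    is_base_view: bool  # True: reduced image RIMG, False: interest image SIMG
    slices: List[Slice] = field(default_factory=list)


@dataclass
class AccessUnit:
    view_components: List[ViewComponent]
    # Header information is optional, mirroring "may be omitted".
    aud: Optional[bytes] = None
    sps: Optional[bytes] = None
    pps: Optional[bytes] = None
    sei: Optional[bytes] = None  # may carry the ROI position information


@dataclass
class VideoSequence:
    access_units: List[AccessUnit]


@dataclass
class CompressedStream:
    sequences: List[VideoSequence]


def count_macroblocks(stream: CompressedStream, base_view: bool) -> int:
    """Count macroblocks belonging to the base or the non-base view."""
    return sum(
        len(sl.macroblocks)
        for seq in stream.sequences
        for au in seq.access_units
        for vc in au.view_components
        if vc.is_base_view == base_view
        for sl in vc.slices
    )
```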

FIG. 14 shows an example of a moving image reproduction device 110A corresponding to the moving image encoding device 10A shown in FIG. 11. Elements similar to those described with reference to FIGS. 1 to 10 are denoted by the same reference numerals, and detailed description thereof is omitted. The moving image reproduction device 110A of this embodiment includes an image decoding unit 120A and a display image generation unit 130A instead of the image decoding unit 120 and the display image generation unit 130 shown in FIG. 9. The other configurations of the moving image reproduction device 110A are the same as those in the embodiment described with reference to FIGS. 1 to 10.

For example, the moving image reproduction device 110A includes an image decoding unit 120A, a display image generation unit 130A, and a memory 140. The image decoding unit 120A includes, for example, a stream separation unit 122A, a first decoding unit 124, and a second decoding unit 126.

The stream separation unit 122A receives the compressed stream STRM from, for example, the stream storage unit 100 shown in FIG. 1. The stream separation unit 122A then separates the compressed stream STRM into a compressed stream CST1 for the base view and a compressed stream CST2 for the non-base view. The compressed stream CST1 contains the encoded data of the reduced image data RIMG, and the compressed stream CST2 contains the encoded data of the interest image data SIMG.

Further, the stream separation unit 122A acquires the region-of-interest information CINF (position information and the like of the region of interest ROI) from the compressed stream STRM. The stream separation unit 122A then outputs the compressed streams CST1 and CST2 to the first decoding unit 124 and the second decoding unit 126, respectively, and outputs the region-of-interest information CINF to the display image generation unit 130A. The first decoding unit 124 and the second decoding unit 126 are the same as in the embodiment described with reference to FIGS. 1 to 10; for example, they are identical to the first decoding unit 124 and the second decoding unit 126 described with reference to FIG. 9.
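The separation performed by the stream separation unit 122A is the inverse of the combination performed on the encoding side, which the following sketch shows as a round trip. It assumes the combined stream is a flat list of tagged units, one ROI record plus one base-view and one non-base-view unit per time instant; the tags and function names are hypothetical, and real MVC multiplexing is considerably more involved.

```python
from typing import List, Tuple

Unit = Tuple[str, bytes]  # (kind, payload)


def merge_streams(
    cst1: List[bytes], cst2: List[bytes], roi_info: List[bytes]
) -> List[Unit]:
    """Interleave base-view data, non-base-view data and ROI information."""
    if not (len(cst1) == len(cst2) == len(roi_info)):
        raise ValueError("all inputs must cover the same number of pictures")
    strm: List[Unit] = []
    for base, non_base, info in zip(cst1, cst2, roi_info):
        strm.append(("roi_info", info))
        strm.append(("base", base))
        strm.append(("non_base", non_base))
    return strm


def split_stream(
    strm: List[Unit],
) -> Tuple[List[bytes], List[bytes], List[bytes]]:
    """Separate STRM into CST1, CST2 and the ROI information."""
    cst1: List[bytes] = []
    cst2: List[bytes] = []
    roi_info: List[bytes] = []
    for kind, payload in strm:
        if kind == "base":
            cst1.append(payload)
        elif kind == "non_base":
            cst2.append(payload)
        elif kind == "roi_info":
            roi_info.append(payload)
        else:
            raise ValueError(f"unknown unit kind: {kind}")
    return cst1, cst2, roi_info
```

After the split, the two picture streams go to their respective decoders and the ROI information goes to display image generation, so no detection is needed on the reproduction side.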

The display image generation unit 130A is, for example, the display image generation unit 130 shown in FIG. 9 with the region-of-interest detection unit 132 omitted. For example, the display image generation unit 130A includes a cutout unit 134 and a display image composition unit 136. The cutout unit 134 and the image composition unit 136 are the same as in the embodiment described with reference to FIGS. 1 to 10, except that they receive the region-of-interest information CINF from the stream separation unit 122A. Thus, for example, when there is an instruction for fine display of a region of interest, the display image generation unit 130A can display the designated region of interest ROI with higher definition than the reduced image RIMG.

Note that the configuration of the moving image reproduction device 110A is not limited to this example. For example, the moving image reproduction device 110A may include the functions of the moving image reproduction device 110 shown in FIG. 9 and execute the operation of the moving image reproduction device 110 as necessary.

FIG. 15 shows an example of the operation of the moving image reproduction device 110A shown in FIG. 14. The operation of FIG. 15 may be realized by hardware alone, or by controlling hardware with software. For example, software such as a moving image processing program causes a computer to execute the operation of FIG. 15. In the operation of FIG. 15, processes S212 and S252 are executed instead of the processes S210 and S250 shown in FIG. 10. Detailed description of the processes already described with reference to FIG. 10 is omitted.

In process S212, the stream separation unit 122A separates the compressed stream STRM acquired in process S200 into a compressed stream CST1 for the base view and a compressed stream CST2 for the non-base view. Further, the stream separation unit 122A acquires the position information of the region of interest ROI (region-of-interest information CINF) from the compressed stream STRM.

Process S252 is executed when there is a request for fine display of a region of interest (Yes in process S240). When there is no such request (No in process S240), the operation of the moving image reproduction device 110A is the same as that of the moving image reproduction device 110.

In process S252, the moving image reproduction device 110A treats the position information of the region of interest ROI (region-of-interest information CINF) acquired in process S212 as valid information. For example, the moving image reproduction device 110A may reflect a mark or the like indicating the detected region of interest in the display image. Then, by executing processes S260 to S300, the moving image reproduction device 110A can display the designated region of interest ROI with higher definition than the reduced image RIMG. Note that process S252 may be omitted. For example, in an operation in which process S252 is omitted, when a region of interest is designated (Yes in process S260), the position information of the region of interest ROI (region-of-interest information CINF) acquired in process S212 is treated as valid information, and processes S270 to S300 are executed.

As described above, this embodiment also provides the same effects as the embodiment described with reference to FIGS. 1 to 10. Furthermore, in this embodiment, the moving image reproduction device 110A does not need to include the region-of-interest detection unit 132. Therefore, the configuration of the moving image reproduction device 110A can be simplified.

The inventions described in the above embodiments are summarized and disclosed below as appendices.
(Appendix 1)
A moving image processing apparatus comprising:
a first image generation unit that generates first image data corresponding to a first image obtained by reducing an input image at a preset reduction rate;
a second image generation unit that thins out a region of interest in the input image at a thinning rate corresponding to the reduction rate to divide it into a plurality of region-of-interest images, and generates second image data corresponding to a second image including at least one of the plurality of region-of-interest images; and
an image encoding unit that encodes the first image data and the second image data to generate stream data and that, when encoding the second image data, can perform prediction processing based on the first image corresponding to the time associated with the second image.
(Appendix 2)
The moving image processing apparatus according to appendix 1, wherein the second image generation unit allocates the plurality of region-of-interest images respectively to a plurality of the second images at a common time, and generates the second image data corresponding to each of the plurality of second images.
(Appendix 3)
The moving image processing apparatus according to appendix 1, wherein the second image generation unit allocates the plurality of region-of-interest images respectively to a plurality of the second images at mutually different times, and generates the second image data corresponding to each of the plurality of second images.
(Appendix 4)
The moving image processing apparatus according to appendix 1, wherein the second image generation unit allocates the plurality of region-of-interest images to a common second image, and generates the second image data corresponding to the second image to which the plurality of region-of-interest images are allocated.
(Appendix 5)
The moving image processing apparatus according to any one of appendices 1 to 4, wherein the second image generation unit performs the process of dividing the region of interest into the plurality of region-of-interest images at an interval larger than the frame interval of the input image.
(Appendix 6)
The moving image processing apparatus according to any one of appendices 1 to 5, wherein the second image generation unit detects the region of interest using local decoded image data generated when the first image data is encoded, and calculates the position of the region of interest in the input image based on the detection result.
(Appendix 7)
A moving image processing apparatus that decodes the stream data generated by the moving image processing apparatus according to appendix 6,
the apparatus comprising a detection unit that detects the region of interest, using image data obtained by decoding the encoded first image data, by the region-of-interest detection method executed on the encoding side that generated the stream data.
(Appendix 8)
The moving image processing apparatus according to any one of appendices 1 to 5, wherein the image encoding unit includes, in the stream data, position information indicating the position of the region of interest in the input image.
(Appendix 9)
A moving image processing apparatus that decodes the stream data generated by the moving image processing apparatus according to appendix 8,
the apparatus comprising a third image generation unit that calculates the position of the region of interest based on the position information included in the stream data.
(Appendix 10)
A moving image processing method for encoding a moving image, comprising:
generating first image data corresponding to a first image obtained by reducing an input image at a preset reduction rate;
thinning out a region of interest in the input image at a thinning rate corresponding to the reduction rate to divide it into a plurality of region-of-interest images, and generating second image data corresponding to a second image including at least one of the plurality of region-of-interest images; and
encoding the first image data and the second image data to generate stream data,
wherein the encoding process that encodes the second image data can perform prediction processing based on the first image corresponding to the time associated with the second image.
(Appendix 11)
The moving image processing method according to appendix 10, wherein the plurality of region-of-interest images are allocated respectively to a plurality of the second images at a common time, and the second image data corresponding to each of the plurality of second images is generated.
(Appendix 12)
The moving image processing method according to appendix 10, wherein the plurality of region-of-interest images are allocated respectively to a plurality of the second images at mutually different times, and the second image data corresponding to each of the plurality of second images is generated.
(Appendix 13)
The moving image processing method according to appendix 10, wherein the plurality of region-of-interest images are allocated to a common second image, and the second image data corresponding to the second image to which the plurality of region-of-interest images are allocated is generated.
(Appendix 14)
The moving image processing method according to any one of appendices 10 to 13, wherein the process of dividing the region of interest into the plurality of region-of-interest images is performed at an interval larger than the frame interval of the input image.
(Appendix 15)
The moving image processing method according to any one of appendices 10 to 14, wherein the region of interest is detected using local decoded image data generated when the first image data is encoded, and the position of the region of interest in the input image is calculated based on the detection result.
(Appendix 16)
A moving image processing method for decoding the stream data generated by the moving image processing method according to appendix 15,
wherein the region of interest is detected, using image data obtained by decoding the encoded first image data, by the region-of-interest detection method executed on the encoding side that generated the stream data.
(Appendix 17)
The moving image processing method according to any one of appendices 10 to 14, wherein position information indicating the position of the region of interest in the input image is included in the stream data.
(Appendix 18)
A moving image processing method for decoding the stream data generated by the moving image processing method according to appendix 17,
wherein the position of the region of interest is calculated based on the position information included in the stream data.
(Appendix 19)
A moving image processing program to be executed by a computer,
the program causing the computer to operate as the moving image processing apparatus according to any one of appendices 1 to 9.
(Appendix 20)
A moving image processing program that causes a computer to execute:
a first image generation process of generating first image data corresponding to a first image obtained by reducing an input image at a preset reduction rate;
a second image generation process of thinning out a region of interest in the input image at a thinning rate corresponding to the reduction rate to divide it into a plurality of region-of-interest images, and generating second image data corresponding to a second image including at least one of the plurality of region-of-interest images; and
an image encoding process of encoding the first image data and the second image data to generate stream data, the image encoding process being capable, when encoding the second image data, of performing prediction processing based on the first image corresponding to the time associated with the second image.
(Appendix 21)
In the second image generation process, the plurality of region-of-interest images are respectively assigned to the plurality of second images at a common time, and the second image data corresponding to the plurality of second images is generated. The moving image processing program according to appendix 20.
(Appendix 22)
In the second image generation process, the plurality of region-of-interest images are respectively assigned to the plurality of second images at different times, and the second image data corresponding to the plurality of second images is generated. The moving image processing program according to appendix 20.
(Appendix 23)
In the second image generation processing, the plurality of region-of-interest images are allocated to the common second image, and the second image data corresponding to the second image to which the plurality of region-of-interest images are allocated is generated. The moving image processing program according to appendix 20, characterized by:
(Appendix 24)
The moving image processing program according to any one of appendices 20 to 23, wherein, in the second image generation process, the process of dividing the region of interest into the plurality of region-of-interest images is performed at an interval larger than the frame interval of the input image.
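As a rough illustration of performing the division at an interval larger than the frame interval, the sketch below selects the input frames at which the region of interest would be divided. The function name and the `roi_period` parameter (number of input frames between divisions) are hypothetical and not taken from the source.

```python
def roi_split_frames(num_frames, roi_period):
    """Return the frame indices at which the ROI is divided into ROI images.

    roi_period is the number of input frames between ROI divisions;
    roi_period > 1 means the division interval exceeds the frame interval.
    """
    if roi_period < 1:
        raise ValueError("roi_period must be at least 1")
    return [n for n in range(num_frames) if n % roi_period == 0]


# With 10 input frames and a division every 4 frames:
split_at = roi_split_frames(10, 4)
```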
(Appendix 25)
The moving image processing program according to any one of appendices 20 to 24, wherein, in the second image generation process, the region of interest is detected using local decoded image data generated when the first image data is encoded, and the position of the region of interest in the input image is calculated based on the detection result.
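Calculating the position in the input image from a detection made on the reduced image amounts to undoing the reduction. A minimal sketch, assuming an integer scale factor equal to the inverse of the reduction rate and a `(top, left, height, width)` tuple for the region (both are illustrative conventions, not taken from the source):

```python
def roi_in_input(roi_reduced, scale=2):
    """Map an ROI detected in the reduced image to input-image coordinates.

    roi_reduced is (top, left, height, width) in the reduced image;
    scale is the inverse of the reduction rate (2 for a 1/2 reduction).
    """
    return tuple(v * scale for v in roi_reduced)


# An 8x8 region detected at row 3, column 5 of the reduced image.
roi_input = roi_in_input((3, 5, 8, 8))
```

Since the detection runs on locally decoded data, a decoder that applies the same detection method to its own decoded first image can arrive at the same position without receiving it in the stream.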
(Appendix 26)
A moving image processing program for decoding the stream data generated using the moving image processing program according to appendix 25, the program causing a computer to execute a detection process for detecting the region of interest, by the region-of-interest detection method executed on the encoding side that generated the stream data, using image data obtained by decoding the encoded first image data.
(Appendix 27)
The moving image processing program according to any one of appendices 20 to 24, wherein, in the image encoding process, position information indicating the position of the region of interest in the input image is included in the stream data.
(Appendix 28)
A moving image processing program for decoding the stream data generated using the moving image processing program according to appendix 27, the program causing a computer to execute a third image generation process for calculating the position of the region of interest based on the position information included in the stream data.
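Once the position is known on the decoding side, the thinned region-of-interest images can be put back together. The sketch below assumes that the decoder re-interleaves equally sized sub-images keyed by their `(dy, dx)` phase offsets and pastes the result into a display image in which the region fits entirely. The function names and data layout are illustrative and are not taken from the source.

```python
def merge_roi_images(roi_images, step=2):
    """Interleave step*step thinned ROI images back into the full ROI.

    roi_images maps (dy, dx) phase offsets to equally sized sub-images.
    """
    sub_h = len(roi_images[(0, 0)])
    sub_w = len(roi_images[(0, 0)][0])
    roi = [[None] * (sub_w * step) for _ in range(sub_h * step)]
    for (dy, dx), sub in roi_images.items():
        for y, row in enumerate(sub):
            for x, pixel in enumerate(row):
                roi[y * step + dy][x * step + dx] = pixel
    return roi


def paste_roi(canvas, roi, position):
    """Overwrite canvas pixels with the ROI at position (top, left)."""
    top, left = position
    for y, row in enumerate(roi):
        canvas[top + y][left:left + len(row)] = row
    return canvas


# Four 1x1 sub-images rebuild a 2x2 ROI, pasted at (1, 1) of a 3x3 canvas.
roi_images = {(0, 0): [[0]], (0, 1): [[1]], (1, 0): [[2]], (1, 1): [[3]]}
roi = merge_roi_images(roi_images)
canvas = paste_roi([[0] * 3 for _ in range(3)], roi, (1, 1))
```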

The features and advantages of the embodiments will be apparent from the detailed description above. The claims are intended to cover those features and advantages to the extent that they do not depart from the spirit and scope of the claims. Furthermore, a person having ordinary skill in the art should be able to readily conceive of any number of improvements and modifications. Accordingly, there is no intention to limit the scope of the inventive embodiments to those described above, and suitable improvements and equivalents falling within the scope disclosed in the embodiments may be relied upon.

DESCRIPTION OF SYMBOLS 10, 10A: moving image encoding apparatus; 20: reduced image generation unit; 30, 30A: interest image generation unit; 32, 32A, 132: region-of-interest detection unit; 34, 134: extraction unit; 36: image synthesis unit; 40, 40A: image encoding unit; 42: first encoding unit; 44: second encoding unit; 46, 46A: stream composition unit; 50, 140: memory; 100: stream storage unit; 110: moving image reproduction apparatus; 120, 120A: image decoding unit; 122, 122A: stream separation unit; 124: first decoding unit; 126: second decoding unit; 130, 130A: display image generation unit; 136: display image synthesis unit; SYS: moving image processing system

Claims

A moving image processing apparatus comprising:
a first image generation unit that generates first image data corresponding to a first image obtained by reducing an input image at a preset reduction rate;
a second image generation unit that thins out a region of interest in the input image at a thinning rate corresponding to the reduction rate to divide it into a plurality of region-of-interest images, and generates second image data corresponding to a second image that includes at least one of the plurality of region-of-interest images; and
an image encoding unit that encodes the first image data and the second image data to generate stream data and that, when encoding the second image data, is capable of performing prediction processing based on the first image corresponding to the time associated with the second image.

The moving image processing apparatus according to claim 1, wherein the second image generation unit allocates the plurality of region-of-interest images respectively to a plurality of the second images at a common time, and generates the second image data corresponding to each of the plurality of second images.

The moving image processing apparatus according to claim 1, wherein the second image generation unit allocates the plurality of region-of-interest images respectively to a plurality of the second images at different times, and generates the second image data corresponding to each of the plurality of second images.

The moving image processing apparatus according to claim 1, wherein the second image generation unit performs the process of dividing the region of interest into the plurality of region-of-interest images at an interval larger than the frame interval of the input image.

The moving image processing apparatus according to claim 1, wherein the second image generation unit detects the region of interest using local decoded image data generated when the first image data is encoded, and calculates the position of the region of interest in the input image based on the detection result.

A moving image processing apparatus for decoding the stream data generated by the moving image processing apparatus according to claim 5, comprising a detection unit that detects the region of interest, by the region-of-interest detection method executed on the encoding side that generated the stream data, using image data obtained by decoding the encoded first image data.

A moving image processing method for encoding a moving image, comprising:
generating first image data corresponding to a first image obtained by reducing an input image at a preset reduction rate;
thinning out a region of interest in the input image at a thinning rate corresponding to the reduction rate to divide it into a plurality of region-of-interest images, and generating second image data corresponding to a second image that includes at least one of the plurality of region-of-interest images; and
encoding the first image data and the second image data to generate stream data,
wherein the encoding process that encodes the second image data is capable of performing a prediction process based on the first image corresponding to the time associated with the second image.
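The benefit of predicting a second image from the first image at the corresponding time can be seen in a toy example. The sketch assumes reduction by decimation at a rate of 1/2 and uses a plain pixel-wise difference as the prediction residual. An actual encoder would use the prediction tools of its codec, and the helper names `crop` and `residual` are illustrative only.

```python
def crop(image, top, left, height, width):
    """Return the height x width block of image starting at (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def residual(block, reference):
    """Pixel-wise difference between a block and its prediction reference."""
    return [[b - r for b, r in zip(block_row, ref_row)]
            for block_row, ref_row in zip(block, reference)]


# 4x4 input whose pixel value is 10*row + column; the whole frame is the ROI.
image = [[10 * r + c for c in range(4)] for r in range(4)]
first_image = [row[::2] for row in image[::2]]   # reduced first image
roi_image = [row[1::2] for row in image[::2]]    # one thinned ROI image

# Predict the ROI image from the co-located part of the first image.
reference = crop(first_image, 0, 0, 2, 2)
diff = residual(roi_image, reference)
```

In this example the residual is small and uniform because the thinned region-of-interest image samples pixels adjacent to those kept in the first image, which is the kind of redundancy the prediction process can exploit.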

A moving image processing program to be executed by a computer, the program causing the computer to operate as the moving image processing apparatus according to claim 1.