JP2014039219A - Video decoder, video transmission reception system, video decoding method and video transmission reception method

Info

Publication number: JP2014039219A
Application number: JP2012181714A
Authority: JP
Inventors: Hideaki Kimata (英明木全); Daisuke Ochi (大介越智); Yoshinori Kusachi (良規草地); Akira Kojima (明小島)
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2012-08-20
Filing date: 2012-08-20
Publication date: 2014-02-27
Anticipated expiration: 2032-08-20
Also published as: JP5700703B2

Abstract

PROBLEM TO BE SOLVED: To avoid a situation in which an image cannot be displayed because the image to be displayed is not present. SOLUTION: Region video position information indicating N (N is a natural number) mutually different coordinate positions relative to a reference position on a two-dimensionally expressed plane is input, together with N video streams each obtained by encoding a region video corresponding to the region video position information and a common video; from the coordinate position of the output image relative to the reference position and the region video position information, the i-th (1≤i≤N) video stream is read so that the largest number of pixels of the output image is obtained; the video stream that is read is decoded; a cut-out position of the image is determined according to the i-th region video position information so that the coordinate position of the output image relative to the reference position is the same; a part of the region video in the decoded image is cut out according to the determined cut-out position; and the common video in the decoded image and the cut-out image are combined and output as the output image.

Description

The present invention relates to a video decoding device, a video transmission/reception system, a video decoding method, and a video transmission/reception method that extract and decode the video stream for an output image from a plurality of region videos.

Video systems are being studied that show, on a high-definition display, only a partial region of a video whose resolution far exceeds high definition. With such a system, the user can designate the region to be viewed and watch the video of that region at sufficient resolution. The compression and transmission scheme for the video is important in realizing such a system. One scheme under study divides the high-resolution video into a plurality of tiles, compresses and encodes each tile video separately, and, at playback time, decodes and displays the streams of only some of the tile videos. For example, Non-Patent Document 1 proposes a technique in which a panoramic video is divided into tiles of equal size, each tile is treated as one camera video and compressed, and the resulting streams are multiplexed; at playback time, several tiles are selected and decoded, and a part of the video composed of the decoded tiles is cut out and displayed.

FIG. 20 is a diagram showing an example of the positional relationship between a conventional tile configuration and an output image. As shown in FIG. 20, Non-Patent Document 1 gives an example in which nine tiles are decoded, one video is composed from them, and a part of it is cut out for output and display. In this case, the video playback device must decode nine multiplexed video streams to obtain the nine tile videos. The number of tiles varies with the size of the region to be displayed. If the maximum number of tiles to be decoded is fixed in advance, the number can change dynamically between one and that maximum. To cope with this dynamic change in the number of tiles to be decoded, Non-Patent Document 1 uses a mechanism for partially decoding a video stream multiplexed using the international standard H.264 Annex H (MVC). More generally, because a plurality of tiles must be decoded, a decoding scheme capable of decoding a plurality of streams is essential.
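
The dependence of the tile count on the viewing region can be illustrated with a short sketch. It assumes equally sized tiles laid out on a regular grid and a rectangular viewing region given in pixels; the function name `tiles_to_decode` and its parameters are hypothetical and do not come from Non-Patent Document 1.

```python
def tiles_to_decode(x: int, y: int, w: int, h: int,
                    tile_w: int, tile_h: int) -> int:
    """Count the grid tiles intersected by a w x h viewing region.

    (x, y) is the top-left pixel of the viewing region, measured from the
    top-left corner of the tiled video. All names are illustrative.
    """
    # Index of the last tile column/row touched, minus the first, plus one.
    cols = (x + w - 1) // tile_w - x // tile_w + 1
    rows = (y + h - 1) // tile_h - y // tile_h + 1
    return cols * rows


# A 150 x 150 region over 100 x 100 tiles:
print(tiles_to_decode(0, 0, 150, 150, 100, 100))    # aligned to the grid: 4 tiles
print(tiles_to_decode(80, 80, 150, 150, 100, 100))  # straddling tile borders: 9 tiles
```

Under these assumptions, the same viewing-region size needs four streams in one position and nine in another, which is why a decoder limited to a fixed number of streams cannot always cover the viewing region.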

When such a stream is decoded and displayed, a part of the plurality of tiles is cut out and displayed, as described in Non-Patent Document 1. Therefore, even if the viewing region moves up, down, left, or right, the edge of the displayed video is not cut off as long as the region stays within the decoded tiles. When the viewing region moves beyond a tile, decoding of another tile begins, so the displayed video is likewise never lost. Through this processing, a video system can be configured in which the user freely designates a partial region of a high-resolution video for display.

Using a device that decodes and displays the video of such a partial region enables interactive viewing in which the user moves the viewing region to the places he or she wants to see. However, when the user's operation moves the viewing region by a large amount, the image to be displayed extends beyond the decoded tiles and there is no image to show. This problem can be solved by preparing tiles at a plurality of resolutions in advance, decoding a low-resolution tile and a high-resolution tile at the same time, and displaying the video of the low-resolution tile when the viewing region moves by a large amount.

Meanwhile, because the transparency information and depth information of an image can be defined at each pixel position, formats that treat them as grayscale information have been proposed. In such a format, the image information includes not only color signals but also transparency information and depth information. This makes it possible, for example, to use the transparency information in processing the color signals, or to generate stereoscopic video from the color signals and the depth information (see, for example, Non-Patent Document 2). One method for generating stereoscopic video from depth information and color signals is called 3D warping.

FIG. 21 is a diagram showing an example of a technique for generating video from another viewpoint using color signals and depth information. As shown in FIG. 21, the 3D warping technique uses depth information to generate the color signals seen from another viewpoint. Image information defined in such a format, with transparency information and depth information accompanying the color signals, could also be used as the image information of a panoramic video such as that of Non-Patent Document 1. With transparency information, a part of the panoramic video can be cut out as a foreground and combined with a separately prepared background image. With depth information, a part of the panoramic video can be displayed as stereoscopic video, or video from a slightly shifted three-dimensional viewpoint can be displayed.

Non-Patent Document 1: Hideaki Kimata, Shinya Shimizu, Yutaka Kunita, Megumi Isogai, and Yoshimitsu Ohtani, "Panorama video coding for user-driven interactive video application", IEEE International Symposium on Consumer Electronics (ISCE) 2009, 2009

Non-Patent Document 2: Hideaki Kimata, "Trends in standardization for 3D video: MPEG standardization for stereoscopic video and multi-view video" (in Japanese), High Reality Display Forum 2007, 2007

As described in Non-Patent Document 1, a scheme that decodes a video stream divided into a plurality of tiles and multiplexed, with the number of tiles to be decoded changing dynamically, can be implemented in software on a general-purpose computer.

However, a computer with low processing capability may be unable to decode the streams for the required number of tiles, and dedicated hardware may be unable to decode all the target tiles when the number of tiles changes dynamically. For example, for a stream in which nine tiles make up one video, a device equipped with dedicated hardware that decodes a single video stream decodes only one of the nine video streams and therefore cannot obtain the video of a viewing region belonging to the other tiles. Furthermore, simultaneously decoding a low-resolution video and a high-resolution video is one way to solve the problem of having no image when the user's operation moves the viewing region by a large amount; but when the hardware implementation can decode only one video stream, the low-resolution video cannot be decoded, and the problem of the missing image remains unsolved.

The present invention has been made in view of these circumstances. Its object is to provide a video decoding device, a video transmission/reception system, a video decoding method, and a video transmission/reception method that, when only a partial region of a video is viewed, can avoid the situation in which no image can be displayed because the image to be displayed does not exist when the user's operation moves the viewing region by a large amount.

The present invention is a video decoding device that receives as input region video position information indicating N (N is a natural number) mutually different coordinate positions relative to a reference position on a plane expressed in two dimensions, and N video streams each obtained by encoding a region video corresponding to the region video position information and a common video, and that decodes a part of the video streams to obtain an output image, the device comprising: a read determination unit that determines, from the coordinate position of the output image relative to the reference position and the region video position information, to read the i-th (1 ≤ i ≤ N) video stream so that the largest number of pixels of the output image is obtained; a reading unit that reads the i-th video stream; a decoding unit that decodes the video stream read by the reading unit; a cut-out position determination unit that determines a cut-out position of the image, based on the i-th region video position information, so that the coordinate position of the output image relative to the reference position is the same; an image cut-out unit that cuts out a part of the region video in the decoded image obtained by the decoding unit, based on the cut-out position determined by the cut-out position determination unit; and an output image synthesis unit that combines the common video in the decoded image obtained by the decoding unit with the image cut out by the image cut-out unit and outputs the result as the output image.

According to this invention, N region videos are set for a video, and video streams each composed of one of those region videos and a common video that does not depend on the position of the region video are prepared. At decoding time, the video stream that yields the largest number of displayed pixels can be selected and then decoded. An image can then be cut out in accordance with the position of the region video, and the common video can be combined with the cut-out image to obtain the output image. By using this combined result as the output image, when the viewer changes the viewing position during viewing and moves outside the region video, the video prepared as the common video can be displayed there.
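
The stream selection and cut-out position determination can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the region videos and the output image are modeled as axis-aligned rectangles whose coordinates are measured from the reference position, "largest number of pixels" is interpreted as the largest overlap area, and the names `Rect`, `overlap_pixels`, `select_stream`, and `cutout_in_region` are hypothetical.

```python
from typing import NamedTuple, Sequence


class Rect(NamedTuple):
    """Axis-aligned rectangle; (x, y) is its top-left corner relative to the reference position."""
    x: int
    y: int
    w: int
    h: int


def overlap_pixels(a: Rect, b: Rect) -> int:
    """Number of pixels shared by a and b (0 if they do not intersect)."""
    ow = min(a.x + a.w, b.x + b.w) - max(a.x, b.x)
    oh = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    return max(ow, 0) * max(oh, 0)


def select_stream(output: Rect, regions: Sequence[Rect]) -> int:
    """0-based index of the region video that covers the most output pixels."""
    return max(range(len(regions)), key=lambda i: overlap_pixels(output, regions[i]))


def cutout_in_region(output: Rect, region: Rect) -> Rect:
    """Part of the output covered by the region, in the decoded region video's own coordinates."""
    x0, y0 = max(output.x, region.x), max(output.y, region.y)
    x1 = min(output.x + output.w, region.x + region.w)
    y1 = min(output.y + output.h, region.y + region.h)
    return Rect(x0 - region.x, y0 - region.y, max(x1 - x0, 0), max(y1 - y0, 0))


# Three overlapping 100 x 100 region videos and an 80 x 50 output image.
regions = [Rect(0, 0, 100, 100), Rect(50, 0, 100, 100), Rect(100, 0, 100, 100)]
output = Rect(60, 10, 80, 50)
i = select_stream(output, regions)           # region 1 covers the whole output
crop = cutout_in_region(output, regions[i])  # Rect(x=10, y=10, w=80, h=50)
```

Only the one selected stream is decoded; any output pixels outside the selected region video would be filled from the common video carried in that same stream.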

Here, the positions of the region videos within the video may be set so that the region videos overlap one another, as in FIG. 4. With such overlap, a region video stream that contains more of the pixels at the designated output image position can be selected. The common video may be any video that does not depend on the position of the region video. As in FIG. 2, a reduced version of the video from which the region videos are taken may be used as the common video. In this case, when the region video and the common video are combined into the output image, the common video is enlarged and the region video is superimposed at the region video position. This produces a video that has high resolution where the region video lies. Alternatively, the common video may be a video of a color matched to the subject matter of the original video: for example, black if the original video is a dark scene such as a concert venue, or green if it is a soccer stadium.

As for how the region video and the common video are combined, when the common video is a reduced version of the video from which the region videos are taken, super-resolution processing that uses the information in the region video may be applied when the common video is enlarged.
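
Enlarging the common video and superimposing the region video can be sketched as below. The sketch assumes that the common video is the original video reduced by an integer factor `scale`, that frames are plain 2D lists of pixel values, and that enlargement is done by nearest-neighbor sampling; the source fixes none of these details, the function name `compose_output` is hypothetical, and super-resolution is not attempted.

```python
from typing import List, Tuple

Frame = List[List[int]]  # hypothetical frame type: rows of pixel values


def compose_output(common: Frame, scale: int,
                   region: Frame, region_pos: Tuple[int, int],
                   out_pos: Tuple[int, int], out_size: Tuple[int, int]) -> Frame:
    """Build the output image from an enlarged common video and a region video.

    region_pos and out_pos are (x, y) positions in original-video coordinates;
    out_size is (width, height). The output is assumed to lie inside the
    original video, so every sampled common pixel exists.
    """
    rx, ry = region_pos
    rh, rw = len(region), len(region[0])
    ox, oy = out_pos
    ow, oh = out_size
    out: Frame = []
    for y in range(oy, oy + oh):
        row = []
        for x in range(ox, ox + ow):
            if rx <= x < rx + rw and ry <= y < ry + rh:
                row.append(region[y - ry][x - rx])          # high-resolution region pixel
            else:
                row.append(common[y // scale][x // scale])  # enlarged common pixel
        out.append(row)
    return out


# 4 x 4 original reduced by 2 to a 2 x 2 common video; 2 x 2 region video at (0, 0).
common = [[0, 1], [2, 3]]
region = [[9, 9], [9, 9]]
result = compose_output(common, 2, region, (0, 0), (1, 1), (2, 2))
# result == [[9, 1], [2, 3]]: one pixel from the region video, three from the common video
```

Because every output pixel falls back to the common video when the region video does not cover it, the output image is never empty, even when the viewing position has moved outside the decoded region video.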

The video composed of the region video and the common video may arrange the two either spatially or temporally. Examples of spatial arrangement are placing the region video and the common video one above the other, as in FIG. 3, or side by side. An example of temporal arrangement is a sequence of consecutive frames, as in FIG. 19. In the temporal arrangement, the frames need not all have the same resolution; the resolution may differ between the region video and the common video. In inter-frame predictive encoding, the configuration may be such that a region video frame refers to the immediately preceding region video frame and a common video frame refers to the immediately preceding common video frame.
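
Separating a decoded picture into its region video and common video parts might look like the following sketch. The concrete layouts are assumptions made for illustration: in the spatial case the region video is taken to occupy the top (or left) part of the frame, and in the temporal case region and common frames are taken to alternate, starting with a region frame. The figures referred to above may use a different order, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Frame = List[List[int]]  # hypothetical frame type: rows of pixel values


def split_spatial(frame: Frame, region_size: Tuple[int, int],
                  layout: str = "vertical") -> Tuple[Frame, Frame]:
    """Split one decoded frame into (region, common).

    region_size is (width, height) of the region video. "vertical" assumes the
    region video on top of the common video; "horizontal" assumes it on the left.
    """
    rw, rh = region_size
    region = [row[:rw] for row in frame[:rh]]
    if layout == "vertical":
        common = [list(row) for row in frame[rh:]]
    elif layout == "horizontal":
        common = [row[rw:] for row in frame]
    else:
        raise ValueError(f"unknown layout: {layout}")
    return region, common


def split_temporal(frames: Sequence[Frame]) -> Tuple[List[Frame], List[Frame]]:
    """Split an alternating sequence (region, common, region, ...) into two sequences."""
    return list(frames[0::2]), list(frames[1::2])


# Region video (2 x 2) stacked on top of a one-row common video.
region, common = split_spatial([[1, 2], [3, 4], [5, 6]], (2, 2), "vertical")
# region == [[1, 2], [3, 4]], common == [[5, 6]]
```

In the temporal case the two returned sequences correspond to the two prediction chains described above: each region frame references the previous region frame, and each common frame references the previous common frame.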

The present invention is a video decoding device that receives as input region video position information indicating N (N is a natural number) mutually different coordinate positions relative to a reference position on a plane expressed in two dimensions, and encoded data obtained by multiplexing and encoding N video streams, each composed of a region video corresponding to the region video position information and a common video, together with ID information that distinguishes the video streams, and that decodes a part of the encoded data to obtain an output image, the device comprising: a read determination unit that determines, from the coordinate position of the output image relative to the reference position and the region video position information, to read the i-th (1 ≤ i ≤ N) video stream so that the largest number of pixels of the output image is obtained; a search-and-read unit that searches the encoded data for the i-th video stream using the ID information and reads it; a decoding unit that decodes the video stream read by the search-and-read unit; a cut-out position determination unit that determines a cut-out position of the image, based on the i-th region video position information, so that the coordinate position of the output image relative to the reference position is the same; an image cut-out unit that cuts out a part of the region video in the decoded image obtained by the decoding unit, based on the cut-out position determined by the cut-out position determination unit; and an output image synthesis unit that combines the common video in the decoded image obtained by the decoding unit with the image cut out by the image cut-out unit and outputs the result as the output image.

According to this invention, when the region video streams are in multiplexed form, the ID information that distinguishes the region video streams can be used to designate and extract the desired region video stream from the multiplexed encoded data. Thus, even for a multiplexed stream, the region video that yields the largest number of pixels of the output image can be selected and decoded.
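
Extraction by ID can be illustrated with a deliberately simplified container model. The sketch assumes that the multiplexed data is a sequence of `(stream_id, payload)` packets; this packet layout and the name `extract_stream` are hypothetical and do not represent the actual bitstream syntax of any standard.

```python
from typing import Iterable, List, Tuple

Packet = Tuple[int, bytes]  # hypothetical packet: (stream ID, encoded payload)


def extract_stream(muxed: Iterable[Packet], target_id: int) -> List[bytes]:
    """Return, in order, the payloads of the packets whose ID equals target_id."""
    return [payload for stream_id, payload in muxed if stream_id == target_id]


# Three interleaved streams; only stream 2 is handed to the single decoder.
muxed = [(1, b"a0"), (2, b"b0"), (3, b"c0"), (1, b"a1"), (2, b"b1"), (3, b"c1")]
selected = extract_stream(muxed, 2)  # [b"b0", b"b1"]
```

The packets of the other streams are skipped without being decoded, so the decoder still processes a single video stream.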

The present invention comprises a video transmission device and a video reception device. The video transmission device includes: a region video position information transmission unit that transmits region video position information, which gives the coordinate positions of N (N is a natural number) partial region videos relative to a reference position of an original video defined on a plane expressed in two dimensions; and a stream transmission unit that transmits the requested i-th (1 ≤ i ≤ N) video stream among N video streams each obtained by encoding a region video and a common video. The video reception device includes: a region video position information reception unit that receives the region video position information; a stream reception unit that receives the transmitted video stream; a reading unit that reads the received video stream; a decoding unit that decodes the video stream read by the reading unit; a cut-out position determination unit that determines a cut-out position of the image, based on the i-th region video position information, so that the coordinate position of the output image relative to the reference position is the same; an image cut-out unit that cuts out a part of the region video in the decoded image obtained by the decoding unit, based on the cut-out position determined by the cut-out position determination unit; an output image synthesis unit that combines the common video in the decoded image obtained by the decoding unit with the image cut out by the image cut-out unit and outputs the result as the output image; an output image coordinate designation unit that designates the coordinate position of the next output image; and a transmission request unit that, based on the coordinate position of the next output image and the region video position information, requests transmission of the j-th (1 ≤ j ≤ N) video stream, which yields the largest number of pixels of the next output image.

According to this invention, there are a transmission device that transmits video streams and a reception device that receives them, and the reception device can request from the transmission device the stream that yields the largest number of pixels of the image to be output next. By requesting only the needed region video stream instead of receiving all the region videos, the amount of data transmitted between the transmission device and the reception device can be reduced.

The present invention comprises a video transmission device and a video reception device. The video transmission device includes: a region video position information transmission unit that transmits region video position information, which gives the coordinate positions of N (N is a natural number) partial region videos relative to a reference position of an original video defined on a plane expressed in two dimensions; and a stream transmission unit that extracts and transmits M (M ≤ N) pieces of encoded data from encoded data obtained by multiplexing and encoding N video streams, each composed of a region video and a common video, together with ID information that distinguishes the video streams. The video reception device includes: a region video position information reception unit that receives the region video position information; a stream reception unit that receives the encoded data; a read determination unit that determines, from the coordinate position of the output image relative to the reference position and the region video position information, to read the i-th (1 ≤ i ≤ M) video stream so that the largest number of pixels of the output image is obtained; a search-and-read unit that searches the encoded data for the i-th video stream using the ID information and reads it; a decoding unit that decodes the video stream read by the search-and-read unit; a cut-out position determination unit that determines a cut-out position of the image, based on the i-th region video position information, so that the coordinate position of the output image relative to the reference position is the same; an image cut-out unit that cuts out a part of the region video in the decoded image obtained by the decoding unit, based on the cut-out position determined by the cut-out position determination unit; an output image synthesis unit that combines the common video in the decoded image obtained by the decoding unit with the image cut out by the image cut-out unit and outputs the result as the output image; an output image coordinate designation unit that designates the coordinate position of the next output image; and a transmission request unit that, based on the coordinate position of the next output image and the region video position information, requests transmission of M video streams so that the largest number of pixels of the next output image is obtained.

According to this invention, a plurality of region video streams are received from the transmission device, and the reception device can select and decode the region video that yields the largest number of pixels of the output image. In addition, the reception device can request a plurality of streams from the transmission device so that the largest number of pixels of the image to be output next is obtained. Because of communication delay and similar effects, there is a time lag between when the reception device sets the position of the next output image and when it receives the corresponding region video streams. If the position of the output image changes during that interval, the reception device can still select, from the plurality of received region videos, the one that yields the largest number of pixels of the designated output image and decode it.
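
The two-stage choice on the reception side can be sketched as follows. The sketch assumes that the reception device requests the M region videos with the largest overlap with the predicted next output position and, once the streams arrive, picks among them using the actual output position. Ranking by overlap area is one plausible reading of the text rather than a rule stated in it, rectangles are `(x, y, w, h)` tuples, and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, w, h) relative to the reference position


def overlap(a: Rect, b: Rect) -> int:
    """Number of pixels shared by rectangles a and b."""
    ow = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    oh = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    return max(ow, 0) * max(oh, 0)


def streams_to_request(next_output: Rect, regions: Sequence[Rect], m: int) -> List[int]:
    """Indices of the m region videos that overlap the predicted output most, best first."""
    order = sorted(range(len(regions)),
                   key=lambda i: overlap(next_output, regions[i]), reverse=True)
    return order[:m]


def choose_received(actual_output: Rect, regions: Sequence[Rect],
                    received: Sequence[int]) -> int:
    """Among the received streams, pick the one covering the most pixels of the actual output."""
    return max(received, key=lambda i: overlap(actual_output, regions[i]))


regions = [(0, 0, 100, 100), (50, 0, 100, 100), (100, 0, 100, 100)]
requested = streams_to_request((70, 0, 80, 100), regions, 2)    # [1, 2]
# While the streams were in transit, the viewer moved further to the right.
decoded = choose_received((115, 0, 80, 100), regions, requested)  # 2
```

In this example the stream that was second best at request time turns out to be the best one at display time, which is the situation the plural request is meant to cover.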

The present invention comprises a video transmission device and a video reception device. The video transmission device includes: a region video position information transmission unit that transmits region video position information, which gives the coordinate positions of N (N is a natural number) partial region videos relative to a reference position of an original video defined on a plane expressed in two dimensions; and a stream transmission unit that transmits a video stream obtained by encoding the i-th (1 ≤ i ≤ N) region video, which is a part of the original video, and a common video. The video reception device includes: a region video position information reception unit that receives the region video position information; a stream reception unit that receives the video stream; a reading unit that reads the received video stream; a decoding unit that decodes the video stream read by the reading unit; a cut-out position determination unit that determines a cut-out position of the image, based on the i-th region video position information, so that the coordinate position of the output image relative to the reference position is the same; an image cut-out unit that cuts out a part of the region video in the decoded image obtained by the decoding unit, based on the cut-out position determined by the cut-out position determination unit; an output image synthesis unit that combines the common video in the decoded image obtained by the decoding unit with the image cut out by the image cut-out unit and outputs the result as the output image; an output image coordinate designation unit that designates the coordinate position of the next output image; and a transmission request unit that, based on the coordinate position of the next output image and the region video position information, requests transmission of the j-th (1 ≤ j ≤ N) video stream, which yields the largest number of pixels of the next output image.

According to this invention, the transmission device can encode a region video stream and then transmit it. Encoded data for the region video streams therefore need not be prepared in advance on the transmission device side; only the region video needed by the reception side is encoded and transmitted.

The present invention comprises a video transmission device and a video reception device. The video transmission device includes: a region video position information transmission unit that transmits region video position information, which gives the coordinate positions of N (N is a natural number) partial region videos relative to a reference position of an original video defined on a plane expressed in two dimensions; and a stream transmission unit that transmits encoded data obtained by multiplexing and encoding N video streams, composed of M (M ≤ N) region videos that are parts of the original video and a common video, together with ID information that distinguishes the video streams. The video reception device includes: a region video position information reception unit that receives the region video position information; a stream reception unit that receives the encoded data; a read determination unit that determines, from the coordinate position of the output image relative to the reference position and the region video position information, to read the i-th (1 ≤ i ≤ M) video stream so that the largest number of pixels of the output image is obtained; a search-and-read unit that searches the encoded data for the i-th video stream using the ID information and reads it; a decoding unit that decodes the video stream read by the search-and-read unit; a cut-out position determination unit that determines a cut-out position of the image, based on the i-th region video position information, so that the coordinate position of the output image relative to the reference position is the same; an image cut-out unit that cuts out a part of the region video in the decoded image obtained by the decoding unit, based on the cut-out position determined by the cut-out position determination unit; an output image synthesis unit that combines the common video in the decoded image obtained by the decoding unit with the image cut out by the image cut-out unit and outputs the result as the output image; an output image coordinate designation unit that designates the coordinate position of the next output image; and a transmission request unit that, based on the coordinate position of the next output image and the region video position information, requests transmission of M video streams so that the largest number of pixels of the next output image is obtained.

The present invention is a video decoding device that receives, for L (L ≧ 2) planes each expressed in two dimensions, region video position information indicating K (1 ≦ K ≦ L) mutually different coordinate positions with respect to a reference position on each plane, together with L video streams obtained by encoding a common video and the region videos corresponding to the region video position information, and that decodes a part of the video streams to obtain an output image. The device comprises: a read determination unit that determines, from the region video position information and the coordinate position of the output image with respect to the reference position on the plane C (1 ≦ C ≦ L) from which the output image is cut out, to read the i-th (1 ≦ i ≦ C) video stream so that the largest number of pixels of the output image is obtained; a read unit that reads the i-th video stream; a decoding unit that decodes the video stream read by the read unit; a cutout position determination unit that determines an image cutout position, based on the i-th region video position information, so that the coordinate position of the output image with respect to the reference position remains the same; an image cutout unit that cuts out a part of the region video in the decoded image obtained by the decoding unit, based on the cutout position determined by the cutout position determination unit; and an output image synthesis unit that synthesizes the common video in the decoded image obtained by the decoding unit with the image cut out by the image cutout unit and outputs the result as an output image.
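The synthesis performed by the output image synthesis unit can be illustrated with a minimal sketch. This passage does not say how the common video and the cut-out image are laid out, so the sketch assumes (hypothetically) that the common video has already been scaled to the output size and acts as a background that the cut-out region pixels overwrite. The function name `synthesize` and the list-of-rows image representation are illustrative only.

```python
def synthesize(common, cutout, offset_x, offset_y):
    """Paste `cutout` onto a copy of `common` at (offset_x, offset_y).

    Both images are lists of rows of pixel values. Assumption: `common`
    already has the size of the output image and serves as the background
    wherever the cut-out region video provides no pixels.
    """
    out = [row[:] for row in common]  # copy so the input is not modified
    for dy, row in enumerate(cutout):
        for dx, pixel in enumerate(row):
            y, x = offset_y + dy, offset_x + dx
            # Ignore cut-out pixels that fall outside the output image.
            if 0 <= y < len(out) and 0 <= x < len(out[y]):
                out[y][x] = pixel
    return out


common = [[0] * 4 for _ in range(3)]   # 4x3 background from the common video
cutout = [[9, 9], [9, 9]]              # 2x2 patch cut from the region video
result = synthesize(common, cutout, 2, 1)
```

With this layout the output image always has content, because any area the selected region video does not cover is filled from the common video.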

According to this invention, for example, a plurality of original videos can be set for widely separated spatial positions and region videos set for each of them, so that the original video corresponding to the spatial position of the output image is selected first and a region video is then selected and decoded. Alternatively, videos of a plurality of resolutions can be prepared as the original videos and region videos set for each of them, so that a region video corresponding to the resolution of the output image is selected and decoded. This makes it possible to select the resolution of the output image as well.

The present invention is a video decoding device that receives, for L (L ≧ 2) planes each expressed in two dimensions, region video position information indicating K (1 ≦ K ≦ L) mutually different coordinate positions with respect to a reference position on each plane, together with encoded data obtained by multiplexing and encoding N video streams, composed of a common video and the region videos corresponding to the region video position information, with ID information for distinguishing the video streams, and that decodes a part of the encoded data to obtain an output image. The device comprises: a read determination unit that determines, from the region video position information and the coordinate position of the output image with respect to the reference position on the plane C (1 ≦ C ≦ L) from which the output image is cut out, to read the i-th (1 ≦ i ≦ C) video stream so that the largest number of pixels of the output image is obtained; a search-and-read unit that searches the encoded data for the i-th video stream using the ID information and reads it; a decoding unit that decodes the video stream read by the search-and-read unit; a cutout position determination unit that determines an image cutout position, based on the i-th region video position information, so that the coordinate position of the output image with respect to the reference position remains the same; an image cutout unit that cuts out a part of the region video in the decoded image obtained by the decoding unit, based on the cutout position determined by the cutout position determination unit; and an output image synthesis unit that synthesizes the common video in the decoded image obtained by the decoding unit with the image cut out by the image cutout unit and outputs the result as an output image.

The present invention comprises a video transmission device and a video reception device. The video transmission device includes: a region video position information transmission unit that transmits all region video position information, which is the coordinate positions of K (1 ≦ K ≦ L) partial region videos with respect to a reference position defined on each of L (L ≧ 2) planes expressed in two dimensions; and a stream transmission unit that transmits the requested i-th (1 ≦ i ≦ K) video stream among K video streams obtained by encoding a common video and the region videos on the plane C (1 ≦ C ≦ L) from which the output image is cut out. The video reception device includes: a region video position information reception unit that receives the region video position information; a stream reception unit that receives the transmitted video stream; a read unit that reads the received video stream; a decoding unit that decodes the video stream read by the read unit; a cutout position determination unit that determines an image cutout position, based on the i-th region video position information, so that the coordinate position of the output image with respect to the reference position remains the same; an image cutout unit that cuts out a part of the region video in the decoded image obtained by the decoding unit, based on the cutout position determined by the cutout position determination unit; an output image synthesis unit that synthesizes the common video in the decoded image obtained by the decoding unit with the image cut out by the image cutout unit and outputs the result as an output image; an output image plane designation unit that designates the plane D (1 ≦ D ≦ L) from which the next output image is cut out; an output image coordinate designation unit that designates the coordinate position of the next output image; and a transmission request unit that requests transmission of the j-th (1 ≦ j ≦ D) video stream from which the largest number of pixels of the next output image is obtained, based on the region video position information and the coordinate position of the next output image with respect to the reference position on the plane D.

According to this invention, the transmission device can, for example, set a plurality of original videos for widely separated spatial positions and set region videos for each of them, so that the reception device selects the original video corresponding to the spatial position of the output image and then selects and decodes a region video. Alternatively, the transmission device can prepare videos of a plurality of resolutions as the original videos and set region videos for each of them, so that the reception device selects and decodes a region video corresponding to the resolution of the output image. This makes it possible to select the resolution of the output image as well.

The present invention comprises a video transmission device and a video reception device. The video transmission device includes: a region video position information transmission unit that transmits all region video position information, which is the coordinate positions of K (1 ≦ K ≦ L) partial region videos with respect to a reference position defined on each of L (L ≧ 2) planes expressed in two dimensions; and a stream transmission unit that extracts M (M ≦ K) pieces of encoded data from encoded data obtained by multiplexing and encoding N video streams, composed of the region videos and a common video, together with ID information for distinguishing the video streams, and transmits the extracted data. The video reception device includes: a region video position information reception unit that receives the region video position information; a stream reception unit that receives the encoded data; a read determination unit that determines, from the region video position information and the coordinate position of the output image with respect to the reference position on the plane C (1 ≦ C ≦ L) from which the output image is cut out, to read the i-th (1 ≦ i ≦ M) video stream so that the largest number of pixels of the output image is obtained; a search-and-read unit that searches the encoded data for the i-th video stream using the ID information and reads it; a decoding unit that decodes the video stream read by the search-and-read unit; a cutout position determination unit that determines an image cutout position, based on the i-th region video position information, so that the coordinate position of the output image with respect to the reference position remains the same; an image cutout unit that cuts out a part of the region video in the decoded image obtained by the decoding unit, based on the cutout position determined by the cutout position determination unit; an output image synthesis unit that synthesizes the common video in the decoded image obtained by the decoding unit with the image cut out by the image cutout unit and outputs the result as an output image; an output image plane designation unit that designates the plane D (1 ≦ D ≦ L) from which the next output image is cut out; an output image coordinate designation unit that designates the coordinate position of the next output image; and a transmission request unit that requests transmission of M video streams, based on the region video position information and the coordinate position of the next output image with respect to the reference position on the plane D designated by the output image plane designation unit, so that the largest number of pixels of the next output image is obtained.

According to this invention, the reception device can receive a plurality of region video streams from the transmission device in advance and then select and decode the region video that yields the largest number of pixels in the output image. The received streams may include region videos of different planes; in that case, a region video on a plane close to the position where the output image is cut out can be selected and decoded. For example, the received streams may include region videos of different resolutions; in that case, a region video can be selected and decoded while the resolution of the output image is also selected.
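The selection across planes described above can be sketched as follows. The sketch assumes that each received region video carries the number of the plane it belongs to, and that every region and the output image are axis-aligned rectangles given relative to that plane's reference position. The `Region` type, its field names, and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Region:
    plane: int      # plane (for example, a resolution level) the region belongs to
    stream_id: int  # identifies the video stream that carries this region
    x: int          # left edge relative to the plane's reference position
    y: int          # top edge relative to the plane's reference position
    w: int
    h: int


def overlap_pixels(output: Tuple[int, int, int, int], region: Region) -> int:
    """Number of output-image pixels that the region video can supply."""
    ox, oy, ow, oh = output
    width = min(ox + ow, region.x + region.w) - max(ox, region.x)
    height = min(oy + oh, region.y + region.h) - max(oy, region.y)
    return max(width, 0) * max(height, 0)


def select_region(plane: int, output: Tuple[int, int, int, int],
                  regions: List[Region]) -> Optional[Region]:
    """Among the regions of the designated plane, pick the one with most overlap."""
    candidates = [r for r in regions if r.plane == plane]
    if not candidates:
        return None
    return max(candidates, key=lambda r: overlap_pixels(output, r))


regions = [
    Region(1, 1, 0, 0, 100, 100),
    Region(1, 2, 80, 0, 100, 100),
    Region(2, 3, 0, 0, 200, 200),
    Region(2, 4, 160, 0, 200, 200),
]
chosen = select_region(2, (170, 10, 50, 50), regions)
```

Designating the plane first and then maximizing overlap within it mirrors the two-stage choice in the text: the plane fixes the spatial position or resolution, and the region choice fixes which stream is decoded.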

The present invention comprises a video transmission device and a video reception device. The video transmission device includes: a region video position information transmission unit that transmits all region video position information, which is the coordinate positions of K (1 ≦ K ≦ L) partial region videos with respect to a reference position defined on each of L (L ≧ 2) planes expressed in two dimensions; and a stream transmission unit that transmits a video stream obtained by encoding a common video and the i-th (1 ≦ i ≦ K) region video, which is a part of the original video on one or more planes C (1 ≦ C ≦ L). The video reception device includes: a region video position information reception unit that receives the region video position information; a stream reception unit that receives the video stream; a read unit that reads the received video stream; a decoding unit that decodes the video stream read by the read unit; a cutout position determination unit that determines an image cutout position, based on the i-th region video position information, so that the coordinate position of the output image with respect to the reference position remains the same; an image cutout unit that cuts out a part of the region video in the decoded image obtained by the decoding unit, based on the cutout position determined by the cutout position determination unit; an output image synthesis unit that synthesizes the common video in the decoded image obtained by the decoding unit with the image cut out by the image cutout unit and outputs the result as an output image; an output image plane designation unit that designates the plane D (1 ≦ D ≦ L) from which the next output image is cut out; an output image coordinate designation unit that designates the coordinate position of the next output image; and a transmission request unit that requests transmission of the j-th (1 ≦ j ≦ M) video stream from which the largest number of pixels of the next output image is obtained, based on the region video position information and the coordinate position of the next output image with respect to the reference position on the plane D designated by the output image plane designation unit.

The present invention comprises a video transmission device and a video reception device. The video transmission device includes: a region video position information transmission unit that transmits all region video position information, which is the coordinate positions of K (1 ≦ K ≦ L) partial region videos with respect to a reference position defined on each of L (L ≧ 2) planes expressed in two dimensions; and a stream transmission unit that transmits encoded data obtained by multiplexing and encoding N video streams, composed of a common video and M (M ≦ K) region videos that are parts of the original video on one or more planes C (1 ≦ C ≦ L), together with ID information for distinguishing the video streams. The video reception device includes: a region video position information reception unit that receives the region video position information; a stream reception unit that receives the video stream; a read determination unit that determines, from the region video position information and the coordinate position of the output image with respect to the reference position on the plane C (1 ≦ C ≦ L) from which the output image is cut out, to read the i-th (1 ≦ i ≦ M) video stream so that the largest number of pixels of the output image is obtained; a search-and-read unit that searches the encoded data for the i-th video stream using the ID information and reads it; a decoding unit that decodes the video stream read by the search-and-read unit; a cutout position determination unit that determines an image cutout position, based on the i-th region video position information, so that the coordinate position of the output image with respect to the reference position remains the same; an image cutout unit that cuts out a part of the region video in the decoded image obtained by the decoding unit, based on the cutout position determined by the cutout position determination unit; an output image synthesis unit that synthesizes the common video in the decoded image obtained by the decoding unit with the image cut out by the image cutout unit and outputs the result as an output image; an output image plane designation unit that designates the plane D (1 ≦ D ≦ L) from which the next output image is cut out; an output image coordinate designation unit that designates the coordinate position of the next output image; and a transmission request unit that requests transmission of M video streams, based on the region video position information and the coordinate position of the next output image with respect to the reference position on the plane D designated by the output image plane designation unit, so that a large number of pixels of the next output image is obtained.

The present invention is a video decoding method that receives region video position information indicating N (N is a natural number) mutually different coordinate positions with respect to a reference position on a plane expressed in two dimensions, together with N video streams obtained by encoding a common video and the region videos corresponding to the region video position information, and that decodes a part of the video streams to obtain an output image. The method comprises: a read determination step of determining, from the coordinate position of the output image with respect to the reference position and the region video position information, to read the i-th (1 ≦ i ≦ N) video stream so that the largest number of pixels of the output image is obtained; a read step of reading the i-th video stream; a decoding step of decoding the video stream read in the read step; a cutout position determination step of determining an image cutout position, based on the i-th region video position information, so that the coordinate position of the output image with respect to the reference position remains the same; an image cutout step of cutting out a part of the region video in the decoded image obtained in the decoding step, based on the cutout position determined in the cutout position determination step; and an output image synthesis step of synthesizing the common video in the decoded image obtained in the decoding step with the image cut out in the image cutout step and outputting the result as an output image.
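The read determination step and the cutout position determination step can be sketched as below, assuming that each region video and the output image are axis-aligned rectangles whose positions are given relative to the common reference position. The `Rect` type and the function names are illustrative and do not come from the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rect:
    x: int  # left edge relative to the reference position
    y: int  # top edge relative to the reference position
    w: int
    h: int


def overlap_pixels(a: Rect, b: Rect) -> int:
    """Pixel count of the intersection of two rectangles (0 if disjoint)."""
    width = min(a.x + a.w, b.x + b.w) - max(a.x, b.x)
    height = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    return max(width, 0) * max(height, 0)


def select_stream(output: Rect, regions: List[Rect]) -> int:
    """Read determination: 1-based index i of the region with most overlap."""
    best = max(range(len(regions)),
               key=lambda k: overlap_pixels(output, regions[k]))
    return best + 1


def cutout_position(output: Rect, region: Rect) -> Rect:
    """Cutout rectangle expressed in the decoded region image's own coordinates.

    Subtracting the region's offset keeps the cut-out pixels at the same
    position relative to the reference position as in the output image.
    """
    left = max(output.x, region.x)
    top = max(output.y, region.y)
    right = min(output.x + output.w, region.x + region.w)
    bottom = min(output.y + output.h, region.y + region.h)
    return Rect(left - region.x, top - region.y,
                max(right - left, 0), max(bottom - top, 0))


regions = [Rect(0, 0, 100, 100), Rect(80, 0, 100, 100)]
output = Rect(90, 10, 50, 50)
i = select_stream(output, regions)            # region 2 covers the whole output
cut = cutout_position(output, regions[i - 1])
```

When two regions tie on overlap, this sketch keeps the lower index; the source does not state a tie-breaking rule.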

The present invention is a video decoding method that receives region video position information indicating N (N is a natural number) mutually different coordinate positions with respect to a reference position on a plane expressed in two dimensions, together with encoded data obtained by multiplexing and encoding N video streams, composed of a common video and the region videos corresponding to the region video position information, with ID information for distinguishing the video streams, and that decodes a part of the encoded data to obtain an output image. The method comprises: a read determination step of determining, from the coordinate position of the output image with respect to the reference position and the region video position information, to read the i-th (1 ≦ i ≦ N) stream of video information so that the largest number of pixels of the output image is obtained; a search-and-read step of searching the encoded data for the i-th video stream using the ID information and reading it; a decoding step of decoding the video stream read in the search-and-read step; a cutout position determination step of determining an image cutout position, based on the i-th region video position information, so that the coordinate position of the output image with respect to the reference position remains the same; an image cutout step of cutting out a part of the region video in the decoded image obtained in the decoding step, based on the cutout position determined in the cutout position determination step; and an output image synthesis step of synthesizing the common video in the decoded image obtained in the decoding step with the image cut out in the image cutout step and outputting the result as an output image.
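The search-and-read step can be illustrated with a toy container. The source does not define the multiplexing format, so the sketch below assumes a hypothetical layout in which each stream is stored as a 1-byte ID, a 4-byte big-endian payload length, and the payload itself. The names `multiplex` and `search_and_read` are illustrative.

```python
import struct
from typing import Dict, Optional

HEADER = ">BI"                       # 1-byte stream ID, 4-byte payload length
HEADER_SIZE = struct.calcsize(HEADER)


def multiplex(streams: Dict[int, bytes]) -> bytes:
    """Concatenate streams, prefixing each with its ID and length (toy format)."""
    out = bytearray()
    for stream_id, payload in streams.items():
        out += struct.pack(HEADER, stream_id, len(payload)) + payload
    return bytes(out)


def search_and_read(data: bytes, wanted_id: int) -> Optional[bytes]:
    """Scan the headers and return only the payload whose ID matches."""
    pos = 0
    while pos + HEADER_SIZE <= len(data):
        stream_id, length = struct.unpack_from(HEADER, data, pos)
        pos += HEADER_SIZE
        if stream_id == wanted_id:
            return data[pos:pos + length]
        pos += length                # skip payloads that are not needed
    return None


data = multiplex({1: b"aaa", 2: b"bbbb", 3: b"c"})
payload = search_and_read(data, 2)
```

Because the scan skips over unwanted payloads by their declared length, only the selected stream has to be handed to the decoder.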

The present invention is a video transmission and reception method performed by a video transmission and reception system consisting of a video transmission device and a video reception device. The method comprises: a region video position information transmission step of transmitting region video position information, which is the coordinate positions of N (N is a natural number) partial region videos with respect to a reference position of an original video defined on a plane expressed in two dimensions; a stream transmission step of transmitting the requested i-th (1 ≦ i ≦ N) video stream among N video streams obtained by encoding the region videos and a common video; a region video position information reception step of receiving the region video position information; a stream reception step of receiving the transmitted video stream; a read step of reading the received video stream; a decoding step of decoding the video stream read in the read step; a cutout position determination step of determining an image cutout position, based on the i-th region video position information, so that the coordinate position of the output image with respect to the reference position remains the same; an image cutout step of cutting out a part of the region video in the decoded image obtained in the decoding step, based on the cutout position determined in the cutout position determination step; an output image synthesis step of synthesizing the common video in the decoded image obtained in the decoding step with the image cut out in the image cutout step and outputting the result as an output image; an output image coordinate designation step of designating the coordinate position of the next output image; and a transmission request step of requesting transmission of the j-th (1 ≦ j ≦ N) video stream from which the largest number of pixels of the next output image is obtained, based on the coordinate position of the next output image and the region video position information.

The present invention is a video transmission and reception method performed by a video transmission and reception system consisting of a video transmission device and a video reception device. The method comprises: a region video position information transmission step of transmitting region video position information, which is the coordinate positions of N (N is a natural number) partial region videos with respect to a reference position of an original video defined on a plane expressed in two dimensions; a stream transmission step of extracting M (M ≦ N) pieces of encoded data from encoded data obtained by multiplexing and encoding N video streams, composed of the region videos and a common video, together with ID information for distinguishing the video streams, and transmitting the extracted data; a region video position information reception step of receiving the region video position information; a stream reception step of receiving the encoded data; a read determination step of determining, from the coordinate position of the output image with respect to the reference position and the region video position information, to read the i-th (1 ≦ i ≦ M) video stream so that the largest number of pixels of the output image is obtained; a search-and-read step of searching the encoded data for the i-th video stream using the ID information and reading it; a decoding step of decoding the video stream read in the search-and-read step; a cutout position determination step of determining an image cutout position, based on the i-th region video position information, so that the coordinate position of the output image with respect to the reference position remains the same; an image cutout step of cutting out a part of the region video in the decoded image obtained in the decoding step, based on the cutout position determined in the cutout position determination step; an output image synthesis step of synthesizing the common video in the decoded image obtained in the decoding step with the image cut out in the image cutout step and outputting the result as an output image; an output image coordinate designation step of designating the coordinate position of the next output image; and a transmission request step of requesting transmission of M video streams, based on the coordinate position of the next output image and the region video position information, so that the largest number of pixels of the next output image is obtained.
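One way to read the transmission request step for M video streams is to rank the regions by how many pixels of the next output image each could supply and request the top M. This ranking is an interpretation, not something the source spells out; maximizing the combined coverage of the M regions would be another reading. Rectangles are plain `(x, y, w, h)` tuples relative to the reference position, and the function names are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def overlap_pixels(a: Rect, b: Rect) -> int:
    """Pixel count of the intersection of two rectangles (0 if disjoint)."""
    width = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    height = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    return max(width, 0) * max(height, 0)


def streams_to_request(next_output: Rect, regions: List[Rect], m: int) -> List[int]:
    """Return the 1-based indices of the m regions with the largest overlap."""
    ranked = sorted(range(len(regions)),
                    key=lambda k: overlap_pixels(next_output, regions[k]),
                    reverse=True)
    return [k + 1 for k in ranked[:m]]


regions = [(0, 0, 100, 100), (80, 0, 100, 100), (160, 0, 100, 100)]
requested = streams_to_request((90, 0, 100, 50), regions, 2)
```

Requesting more than one stream gives the reception device candidates to choose from when the output position moves before the next request completes.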

The present invention is a video transmission/reception method performed by a video transmission/reception system comprising a video transmission apparatus and a video reception apparatus, the method comprising: a region video position information transmission step of transmitting region video position information, which is the coordinate positions of N (N is a natural number) partial region videos relative to a reference position of an original video defined on a plane expressed in two dimensions; a stream transmission step of transmitting a video stream obtained by encoding an i-th (1 ≦ i ≦ N) region video, which is a part of the original video, and a common video; a region video position information reception step of receiving the region video position information; a stream reception step of receiving the video stream; a reading step of reading the received video stream; a decoding step of decoding the video stream read in the reading step; a cutout position determination step of determining a cutout position of an image, based on the i-th region video position information, so that the coordinate position of the output image relative to the reference position is the same; an image cutout step of cutting out a part of the region video in the decoded image obtained in the decoding step, based on the cutout position determined in the cutout position determination step; an output image synthesis step of combining the common video in the decoded image obtained in the decoding step with the image cut out in the image cutout step and outputting an output image; an output image coordinate designation step of designating the coordinate position of the next output image; and a transmission request step of requesting, from the coordinate position of the next output image and the region video position information, transmission of the j-th (1 ≦ j ≦ N) video stream that yields the largest number of pixels of the next output image.

The present invention is a video transmission/reception method performed by a video transmission/reception system comprising a video transmission apparatus and a video reception apparatus, the method comprising: a region video position information transmission step of transmitting region video position information, which is the coordinate positions of N (N is a natural number) partial region videos relative to a reference position of an original video defined on a plane expressed in two dimensions; a stream transmission step of transmitting encoded data obtained by multiplexing and encoding N video streams, composed of M (M ≦ N) region videos that are part of the original video and a common video, together with ID information that distinguishes the video streams; a region video position information reception step of receiving the region video position information; a stream reception step of receiving the encoded data; a read determination step of determining, from the coordinate position of an output image relative to the reference position and the region video position information, to read the i-th (1 ≦ i ≦ M) video stream so that the largest number of pixels of the output image is obtained; a search reading step of searching for and reading the i-th video stream using the ID information in the encoded data; a decoding step of decoding the video stream read in the search reading step; a cutout position determination step of determining a cutout position of an image, based on the i-th region video position information, so that the coordinate position of the output image relative to the reference position is the same; an image cutout step of cutting out a part of the region video in the decoded image obtained in the decoding step, based on the cutout position determined in the cutout position determination step; an output image synthesis step of combining the common video in the decoded image obtained in the decoding step with the image cut out in the image cutout step and outputting an output image; an output image coordinate designation step of designating the coordinate position of the next output image; and a transmission request step of requesting, from the coordinate position of the next output image and the region video position information, transmission of M video streams so that the largest number of pixels of the next output image is obtained.

The present invention is a video decoding method that receives, for L (L ≧ 2) planes each expressed in two dimensions, region video position information indicating K (1 ≦ K ≦ L) mutually different coordinate positions relative to a reference position on each plane, and L video streams obtained by encoding region videos corresponding to the region video position information and a common video, and that decodes a part of the video streams to obtain an output image, the method comprising: a read determination step of determining, from the coordinate position of the output image relative to the reference position on a plane C (1 ≦ C ≦ L) from which the output image is cut out and the region video position information, to read the i-th (1 ≦ i ≦ C) video stream so that the largest number of pixels of the output image is obtained; a reading step of reading the i-th video stream; a decoding step of decoding the video stream read in the reading step; a cutout position determination step of determining a cutout position of an image, based on the i-th region video position information, so that the coordinate position of the output image relative to the reference position is the same; an image cutout step of cutting out a part of the region video in the decoded image obtained in the decoding step, based on the cutout position determined in the cutout position determination step; and an output image synthesis step of combining the common video in the decoded image obtained in the decoding step with the image cut out in the image cutout step and outputting an output image.

The present invention is a video decoding method that receives, for L (L ≧ 2) planes each expressed in two dimensions, region video position information indicating K (1 ≦ K ≦ L) mutually different coordinate positions relative to a reference position on each plane, and encoded data obtained by multiplexing and encoding N video streams, composed of region videos corresponding to the region video position information and a common video, together with ID information that distinguishes the video streams, and that decodes a part of the encoded data to obtain an output image, the method comprising: a read determination step of determining, from the coordinate position of the output image relative to the reference position on a plane C (1 ≦ C ≦ L) from which the output image is cut out and the region video position information, to read the i-th (1 ≦ i ≦ C) video stream so that the largest number of pixels of the output image is obtained; a search reading step of searching for and reading the i-th video stream using the ID information in the encoded data; a decoding step of decoding the video stream read in the search reading step; a cutout position determination step of determining a cutout position of an image, based on the i-th region video position information, so that the coordinate position of the output image relative to the reference position is the same; an image cutout step of cutting out a part of the region video in the decoded image obtained in the decoding step, based on the cutout position determined in the cutout position determination step; and an output image synthesis step of combining the common video in the decoded image obtained in the decoding step with the image cut out in the image cutout step and outputting an output image.

The present invention is a video transmission/reception method performed by a video transmission/reception system comprising a video transmission apparatus and a video reception apparatus, the method comprising: a region video position information transmission step of transmitting, for L (L ≧ 2) planes each expressed in two dimensions, all region video position information, which is the coordinate positions of K (1 ≦ K ≦ L) partial region videos relative to a reference position defined on each plane; a stream transmission step of transmitting the requested i-th (1 ≦ i ≦ K) video stream out of K video streams obtained by encoding the region videos and a common video on a plane C (1 ≦ C ≦ L) from which an output image is cut out; a region video position information reception step of receiving the region video position information; a stream reception step of receiving the transmitted video stream; a reading step of reading the received video stream; a decoding step of decoding the video stream read in the reading step; a cutout position determination step of determining a cutout position of an image, based on the i-th region video position information, so that the coordinate position of the output image relative to the reference position is the same; an image cutout step of cutting out a part of the region video in the decoded image obtained in the decoding step, based on the cutout position determined in the cutout position determination step; an output image synthesis step of combining the common video in the decoded image obtained in the decoding step with the image cut out in the image cutout step and outputting an output image; an output image plane designation step of designating a plane D (1 ≦ D ≦ L) from which the next output image is cut out; an output image coordinate designation step of designating the coordinate position of the next output image; and a transmission request step of requesting, from the coordinate position of the next output image relative to the reference position on the plane D from which the output image is cut out and the region video position information, transmission of the j-th (1 ≦ j ≦ D) video stream that yields the largest number of pixels of the next output image.

The present invention is a video transmission/reception method performed by a video transmission/reception system comprising a video transmission apparatus and a video reception apparatus, the method comprising: a region video position information transmission step of transmitting, for L (L ≧ 2) planes each expressed in two dimensions, all region video position information, which is the coordinate positions of K (1 ≦ K ≦ L) partial region videos relative to a reference position defined on each plane; a stream transmission step of extracting and transmitting M (M ≦ K) pieces of encoded data from encoded data obtained by multiplexing and encoding N video streams, composed of the region videos and a common video, together with ID information that distinguishes the video streams; a region video position information reception step of receiving the region video position information; a stream reception step of receiving the encoded data; a read determination step of determining, from the coordinate position of an output image relative to the reference position on a plane C (1 ≦ C ≦ L) from which the output image is cut out and the region video position information, to read the i-th (1 ≦ i ≦ M) video stream so that the largest number of pixels of the output image is obtained; a search reading step of searching for and reading the i-th video stream using the ID information in the encoded data; a decoding step of decoding the video stream read in the search reading step; a cutout position determination step of determining a cutout position of an image, based on the i-th region video position information, so that the coordinate position of the output image relative to the reference position is the same; an image cutout step of cutting out a part of the region video in the decoded image obtained in the decoding step, based on the cutout position determined in the cutout position determination step; an output image synthesis step of combining the common video in the decoded image obtained in the decoding step with the image cut out in the image cutout step and outputting an output image; an output image plane designation step of designating a plane D (1 ≦ D ≦ L) from which the next output image is cut out; an output image coordinate designation step of designating the coordinate position of the next output image; and a transmission request step of requesting, from the coordinate position of the next output image relative to the reference position on the plane D designated in the output image plane designation step and the region video position information, transmission of M video streams so that the largest number of pixels of the next output image is obtained.

The present invention is a video transmission/reception method performed by a video transmission/reception system comprising a video transmission apparatus and a video reception apparatus, the method comprising: a region video position information transmission step of transmitting, for L (L ≧ 2) planes each expressed in two dimensions, all region video position information, which is the coordinate positions of K (1 ≦ K ≦ L) partial region videos relative to a reference position defined on each plane; a stream transmission step of transmitting a video stream obtained by encoding an i-th (1 ≦ i ≦ K) region video, which is a part of the original video on one or more planes C (1 ≦ C ≦ L), and a common video; a region video position information reception step of receiving the region video position information; a stream reception step of receiving the video stream; a reading step of reading the received video stream; a decoding step of decoding the video stream read in the reading step; a cutout position determination step of determining a cutout position of an image, based on the i-th region video position information, so that the coordinate position of the output image relative to the reference position is the same; an image cutout step of cutting out a part of the region video in the decoded image obtained in the decoding step, based on the cutout position determined in the cutout position determination step; an output image synthesis step of combining the common video in the decoded image obtained in the decoding step with the image cut out in the image cutout step and outputting an output image; an output image plane designation step of designating a plane D (1 ≦ D ≦ L) from which the next output image is cut out; an output image coordinate designation step of designating the coordinate position of the next output image; and a transmission request step of requesting, from the coordinate position of the next output image relative to the reference position on the plane D designated in the output image plane designation step and the region video position information, transmission of the j-th (1 ≦ j ≦ M) video stream that yields the largest number of pixels of the next output image.

The present invention is a video transmission/reception method performed by a video transmission/reception system comprising a video transmission apparatus and a video reception apparatus, the method comprising: a region video position information transmission step of transmitting, for L (L ≧ 2) planes each expressed in two dimensions, all region video position information, which is the coordinate positions of K (1 ≦ K ≦ L) partial region videos relative to a reference position defined on each plane; a stream transmission step of transmitting encoded data obtained by multiplexing and encoding N video streams, composed of M (M ≦ K) region videos that are part of the original video on one or more planes C (1 ≦ C ≦ L) and a common video, together with ID information that distinguishes the video streams; a region video position information reception step of receiving the region video position information; a stream reception step of receiving the video stream; a read determination step of determining, from the coordinate position of an output image relative to the reference position on the plane C (1 ≦ C ≦ L) from which the output image is cut out and the region video position information, to read the i-th (1 ≦ i ≦ M) video stream so that the largest number of pixels of the output image is obtained; a search reading step of searching for and reading the i-th video stream using the ID information in the encoded data; a decoding step of decoding the video stream read in the search reading step; a cutout position determination step of determining a cutout position of an image, based on the i-th region video position information, so that the coordinate position of the output image relative to the reference position is the same; an image cutout step of cutting out a part of the region video in the decoded image obtained in the decoding step, based on the cutout position determined in the cutout position determination step; an output image synthesis step of combining the common video in the decoded image obtained in the decoding step with the image cut out in the image cutout step and outputting an output image; an output image plane designation step of designating a plane D (1 ≦ D ≦ L) from which the next output image is cut out; an output image coordinate designation step of designating the coordinate position of the next output image; and a transmission request step of requesting, from the coordinate position of the next output image relative to the reference position on the plane D designated in the output image plane designation step and the region video position information, transmission of M video streams so that a large number of pixels of the next output image is obtained.

According to the present invention, when only a partial region of a video is viewed, the video stream can be extracted so that the number of displayed pixels is largest for the output image position specified by the user. This avoids the situation in which an image cannot be displayed, because no image to be displayed exists, when a user operation moves the viewing region by a large amount.

A block diagram showing the configuration of the video decoding apparatus in the first embodiment of the present invention.
An explanatory diagram showing the relationship between the common video and the region images.
An explanatory diagram showing the spatial arrangement of the common video and the region images.
An explanatory diagram showing the positional relationship between the region videos and the output image.
An explanatory diagram showing the format of the multiplexed encoded data.
A block diagram showing the configuration of the video decoding apparatus in the second embodiment of the present invention.
A block diagram showing the configuration of the video transmission apparatus constituting the video transmission/reception system in the third embodiment of the present invention.
A block diagram showing the configuration of the video reception apparatus constituting the video transmission/reception system in the third embodiment of the present invention.
A block diagram showing the configuration of a modification of the video transmission apparatus shown in FIG. 7.
A block diagram showing the configuration of the video transmission apparatus in the fourth embodiment of the present invention.
A block diagram showing the configuration of the video reception apparatus in the fourth embodiment of the present invention.
A block diagram showing the configuration of the video decoding apparatus according to the fifth embodiment of the present invention.
An explanatory diagram showing the positional relationship between the region videos and the output image.
An explanatory diagram showing the format of the multiplexed encoded data.
A block diagram showing the configuration of the video decoding apparatus according to the sixth embodiment of the present invention.
A block diagram showing the configuration of the video transmission apparatus according to the seventh embodiment of the present invention.
A block diagram showing the configuration of the video reception apparatus according to the seventh embodiment of the present invention.
A block diagram showing the configuration of the video reception apparatus according to the eighth embodiment of the present invention.
An explanatory diagram showing an example of the temporal arrangement of the common video and the region videos.
A diagram showing an example of the positional relationship between a conventional tile configuration and an output image.
A diagram showing an example of a method for generating video of another viewpoint from a color signal and depth information.

<First Embodiment>
Hereinafter, a video decoding apparatus according to a first embodiment of the present invention will be described with reference to the drawings. The first embodiment is a video decoding apparatus that obtains image information by decoding one stream out of the encoded data of video information composed of three region video streams (A, B, C) for a two-dimensional plane. FIG. 1 is a block diagram showing the configuration of the video decoding apparatus in this embodiment. The video decoding apparatus includes a region video stream selection unit 100 that selects one of the three region video streams A, B, and C; a reading unit 101 that reads the stream; a read determination unit 102 that determines which one of the three streams to read; a decoding unit 103 that decodes the read stream; an image cutout unit 104 that cuts out a part of the decoded image; a cutout position determination unit 105 that determines the cutout position; and an output image synthesis unit 107 that synthesizes an output image from the region video and the common video.

FIG. 2 is an explanatory diagram showing the relationship between the common video and the region images. As shown in FIG. 2, the common video is a reduced version of the original video from which the region videos are taken. FIG. 3 is an explanatory diagram showing the spatial arrangement of the common video and the region images. Each of the region videos A, B, and C shown in FIG. 2 and the common video are assumed to be arranged one above the other spatially, as shown in FIG. 3(a). FIG. 4 is an explanatory diagram showing the positional relationship between the region videos and the output image. The positions of the three region videos A, B, and C on the plane and the position of the output image are assumed to be as shown in FIG. 4. The three video streams A, B, and C and the region video position information are assumed to have been given to the video decoding apparatus in advance.
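The vertical arrangement of a region video and the common video within one decoded picture can be illustrated with a minimal sketch. It assumes, purely for illustration, that the region video occupies the top rows and the common video the remaining rows; the text does not state which part is on top, and the name `split_decoded_frame` is hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]  # rows of pixel values


def split_decoded_frame(frame: Frame, region_height: int) -> Tuple[Frame, Frame]:
    """Split a decoded frame into (region video, common video).

    Assumption: the region video fills the top `region_height` rows and the
    common video fills the rest. The actual layout may be the reverse.
    """
    if not 0 < region_height < len(frame):
        raise ValueError("region_height must leave rows for both parts")
    return frame[:region_height], frame[region_height:]


decoded = [[1, 1], [2, 2], [3, 3]]
region_part, common_part = split_decoded_frame(decoded, 2)
```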

Next, the operation in which the video decoding apparatus shown in FIG. 1 decodes a region video to obtain an output image will be described with reference to FIG. 1. First, when the user sets the position of the output image, the read determination unit 102 uses the output image coordinate position, which indicates the position of the output image on the two-dimensional plane, and the region video position information, which indicates the positions of the region videos, to determine the region video that yields the largest number of output image pixels. In the example shown in FIG. 4, region video B yields the most pixels, so region video stream B is determined as the stream to read. The reading unit 101 then reads region video stream B via the region video stream selection unit 100, and the decoding unit 103 decodes the read stream and outputs a decoded image.
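The read determination described above can be sketched as a rectangle-overlap computation. This is only an illustration under stated assumptions: regions and the output image are axis-aligned rectangles measured from the reference position, and the names `Rect`, `overlap_pixels`, and `choose_stream`, as well as the coordinates in the example, are hypothetical and not taken from the source.

```python
from typing import Dict, NamedTuple


class Rect(NamedTuple):
    x: int  # left edge, relative to the reference position
    y: int  # top edge, relative to the reference position
    w: int  # width in pixels
    h: int  # height in pixels


def overlap_pixels(a: Rect, b: Rect) -> int:
    """Number of pixels shared by two axis-aligned rectangles."""
    dx = min(a.x + a.w, b.x + b.w) - max(a.x, b.x)
    dy = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    return dx * dy if dx > 0 and dy > 0 else 0


def choose_stream(output: Rect, regions: Dict[str, Rect]) -> str:
    """Return the name of the region video covering the most output pixels."""
    return max(regions, key=lambda name: overlap_pixels(output, regions[name]))


# Illustrative layout: three overlapping regions side by side.
regions = {
    "A": Rect(0, 0, 100, 100),
    "B": Rect(50, 0, 100, 100),
    "C": Rect(100, 0, 100, 100),
}
selected = choose_stream(Rect(70, 10, 60, 60), regions)
```

With these illustrative coordinates the output image lies entirely inside region B, so B is selected. When two regions cover the same number of pixels, `max` returns the first one in dictionary order; the source does not specify a tie-breaking rule.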

Next, the cutout position determination unit 105 determines, from the coordinate position of region video B and the coordinate position of the output image, the position at which the output image is cut out of the decoded image. The image cutout unit 104 cuts out from the decoded image the image at the position determined by the cutout position determination unit 105 and outputs it. The output image synthesis unit 107 extracts the common video from the decoded image output by the decoding unit 103 and outputs an output image obtained by combining this common video with the cutout image output by the image cutout unit 104. At this time, the common video is enlarged and the region video is superimposed on it to obtain the output image.
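The cutout and synthesis steps can be sketched as follows. The sketch assumes that the reference position is the top-left corner (0, 0) of the original video, that the common video is the whole original video reduced by an integer factor `scale`, and that enlargement uses nearest-neighbour sampling; none of these details is stated in the source, and the function names are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]  # rows of pixel values


def cutout_position(region_xy: Tuple[int, int],
                    output_xy: Tuple[int, int]) -> Tuple[int, int]:
    """Offset of the output image inside the decoded region video."""
    return output_xy[0] - region_xy[0], output_xy[1] - region_xy[1]


def compose(common: Frame, scale: int, region: Frame,
            region_xy: Tuple[int, int], output_xy: Tuple[int, int],
            out_w: int, out_h: int) -> Frame:
    """Superimpose the region video on the enlarged common video."""
    off_x, off_y = cutout_position(region_xy, output_xy)
    out: Frame = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            rx, ry = x + off_x, y + off_y
            if 0 <= ry < len(region) and 0 <= rx < len(region[0]):
                # Pixel is covered by the full-resolution region video.
                row.append(region[ry][rx])
            else:
                # Otherwise fall back to the enlarged common video.
                gx, gy = output_xy[0] + x, output_xy[1] + y
                row.append(common[gy // scale][gx // scale])
        out.append(row)
    return out


# 4x4 original plane: 2x2 common video (scale 2), 2x2 region video at (0, 0).
common = [[10, 20], [30, 40]]
region = [[1, 2], [3, 4]]
result = compose(common, 2, region, (0, 0), (1, 1), 2, 2)
```

In this small example only the top-left output pixel is covered by the region video; the other three come from the enlarged common video, which is how the output stays displayable even where no region pixels exist.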

<Second Embodiment>
Next, a video decoding apparatus according to the second embodiment of the present invention will be described. The video decoding apparatus of the second embodiment reads the stream of a specific area video from encoded data in which the streams of three area videos (A, B, and C) defined on a two-dimensional plane are multiplexed together with stream IDs that identify the streams, and obtains image information from it. FIG. 5 is an explanatory diagram showing the format of the multiplexed encoded data. The stream of each area video is divided into multiple segments at suitable points, and each segment is multiplexed with start code information and stream ID information attached at its head. The start code information is a code word with a unique code sequence, so its position can be found by searching the stream for that code word.
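A minimal sketch of searching such multiplexed data is given below. The three-byte start code, the one-byte stream ID, and the assumption that a payload never contains the start code are choices made for this sketch only; the specification does not fix the byte layout, and `mux` and `extract_stream` are hypothetical names.

```python
START_CODE = b"\x00\x00\x01"  # assumed start code; the actual code word is not specified


def mux(segments):
    """Multiplex (stream_id, payload) segments: start code, one ID byte, then the payload."""
    return b"".join(START_CODE + bytes([sid]) + payload for sid, payload in segments)


def extract_stream(data: bytes, wanted_id: int) -> bytes:
    """Concatenate the payloads of all segments whose stream ID equals wanted_id."""
    out = bytearray()
    pos = data.find(START_CODE)
    while pos != -1:
        id_pos = pos + len(START_CODE)
        if id_pos >= len(data):
            break  # truncated segment: start code without an ID byte
        nxt = data.find(START_CODE, id_pos + 1)
        end = nxt if nxt != -1 else len(data)
        if data[id_pos] == wanted_id:
            out += data[id_pos + 1:end]
        pos = nxt
    return bytes(out)


# Interleaved segments of area video streams A, B, and C.
data = mux([
    (ord("A"), b"a1"), (ord("B"), b"b1"), (ord("A"), b"a2"),
    (ord("C"), b"c1"), (ord("B"), b"b2"),
])
stream_b = extract_stream(data, ord("B"))
```

In this example `stream_b` is `b"b1b2"`. A real bitstream would need some form of start code emulation prevention inside the payloads, which this sketch leaves out.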

FIG. 6 is a block diagram showing the configuration of the video decoding apparatus according to the second embodiment of the present invention. In this figure, parts identical to those of the apparatus shown in FIG. 1 are given the same reference numerals and are described only briefly. The video decoding apparatus shown in FIG. 6 includes a search reading unit 106 that searches the multiplexed encoded data for the stream of a specific area video and reads it, a read determination unit 102 that determines the area video to be read, a decoding unit 103 that decodes the stream, an image cutout unit 104 that cuts out part of the decoded image, a cutout position determination unit 105 that determines the cutout position, and an output image synthesis unit 107 that synthesizes the output image from the area video and the common video.

As shown in FIG. 2, the common video is a reduced version of the original video of the area videos. The area video and the common video are assumed to be arranged spatially one above the other, as in FIG. 3(a). The positions of the three area videos and of the output image on the plane are assumed to be the same as in FIG. 4. The encoded data and the area video position information are assumed to be given to the video decoding apparatus in advance.

Next, with reference to FIG. 6, the operation by which the video decoding apparatus shown in FIG. 6 decodes an area video to obtain an output image will be described. First, when the user sets the position of the output image, the read determination unit 102 uses the output image coordinate position, which indicates the position of the output image on the two-dimensional plane, and the area video position information, which indicates the positions of the area videos, to determine which area video contains the largest number of pixels of the output image. In the case of FIG. 4, area video B contains the most, so the stream to be read is determined to be that of area video B. The search reading unit 106 then searches the encoded data for start code information and reads the stream ID information that follows it, thereby locating the stream of area video B, and reads that stream.

Next, the decoding unit 103 decodes the stream to obtain a decoded image. The cutout position determination unit 105 determines, from the coordinate position of area video B and the coordinate position of the output image, the position at which the output image is to be cut out of the decoded image. The image cutout unit 104 cuts the image out of the decoded image. The output image synthesis unit 107 extracts the common video from the decoded image and outputs an output image obtained by combining the extracted common video with the image output by the image cutout unit 104. In doing so, it enlarges the common video and superimposes the area video on it to obtain the output image.

<Third Embodiment>
Next, a video transmission/reception system according to the third embodiment of the present invention will be described. The third embodiment is a system in which a video transmission apparatus transmits the one stream requested by a video reception apparatus, out of encoded video data consisting of the streams of three area videos (A, B, and C) defined on a two-dimensional plane, and the video reception apparatus decodes that stream to obtain image information. FIG. 7 is a block diagram showing the configuration of the video transmission apparatus that forms part of the video transmission/reception system according to the third embodiment of the present invention. FIG. 8 is a block diagram showing the configuration of the video reception apparatus that forms part of the same system.

The video transmission apparatus shown in FIG. 7 includes an area video stream selection unit 200 that selects one of the three streams, a stream transmission unit 201 that transmits the selected area video stream, and an area video position information transmission unit 202 that transmits the area video position information. The video reception apparatus shown in FIG. 8 includes an area video position information receiving unit 301 that receives the area video position information, a stream receiving unit 302 that receives a stream, a reading unit 303 that reads the received stream, a decoding unit 304 that decodes the stream, an image cutout unit 305 that cuts out part of the decoded image, a cutout position determination unit 306 that determines the cutout position, an output image synthesis unit 307 that combines the area video and the common video and outputs an output image, an output image coordinate designating unit 308 that designates the coordinate position of the next output image, a transmission request determination unit 309 that determines the area video to request from the area video position information and the coordinate position of the next output image, and a transmission request unit 310 that requests the area video.

As shown in FIG. 2, the common video is a reduced version of the original video of the area videos. The area video and the common video are assumed to be arranged spatially one above the other, as in FIG. 3(a). The positions of the three area videos, of the output image, and of the next output image on the plane are assumed to be the same as in FIG. 4. The three video streams and the area video position information are assumed to be given to the video transmission apparatus in advance. An area video ID (one of A, B, and C) serves as the information that distinguishes the area videos; the transmission request unit 310 of the video reception apparatus transmits an area video ID, and the stream transmission unit 201 of the video transmission apparatus receives it. When no request has arrived from the video reception apparatus, the stream transmission unit 201 of the video transmission apparatus transmits the stream of area video A.

Next, with reference to FIGS. 7 and 8, the operation by which the video transmission apparatus and the video reception apparatus exchange an area video stream and decode the area video to obtain an output image will be described. First, the area video position information transmission unit 202 of the video transmission apparatus transmits the area video position information, and the area video position information receiving unit 301 of the video reception apparatus receives it. Then, because no request has yet arrived from the video reception apparatus, the stream transmission unit 201 of the video transmission apparatus has the area video stream selection unit 200 select area video stream A and transmits area video stream A.

Next, the stream receiving unit 302 of the video reception apparatus receives the stream from the video transmission apparatus, and the reading unit 303 reads the received stream. The decoding unit 304 then decodes the stream and outputs a decoded image, and the cutout position determination unit 306 determines, from the coordinate position of area video A and the coordinate position of the output image, the position at which the output image is to be cut out of the decoded image. The image cutout unit 305 cuts the image out of the decoded image at the determined cutout position. The output image synthesis unit 307 extracts the common video from the decoded image output by the decoding unit 304, combines this common video with the image obtained by the image cutout unit 305, and outputs the output image. In doing so, it enlarges the common video and superimposes the area video on it to obtain the output image.

Next, the output image coordinate designating unit 308 designates the coordinate position of the next output image. The transmission request determination unit 309 determines area video C from the coordinate position of the next output image. The transmission request unit 310 transmits the area video ID of area video C. The stream transmission unit 201 of the video transmission apparatus then receives C as the area video ID and therefore transmits area video stream C.

Next, the stream receiving unit 302 of the video reception apparatus receives the stream from the video transmission apparatus, the reading unit 303 reads the received stream, and the decoding unit 304 decodes the stream and outputs a decoded image. The cutout position determination unit 306 determines, from the coordinate position of area video C and the coordinate position of the output image, the position at which the output image is to be cut out of the decoded image. The image cutout unit 305 cuts the image out of the decoded image, and the output image synthesis unit 307 extracts the common video from the decoded image, combines it with the image obtained by the image cutout unit 305, and outputs the output image. In doing so, it enlarges the common video and superimposes the area video on it to obtain the output image. The above processing is repeated each time the position of the output image is designated.
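The request and response exchange of this embodiment can be sketched as below. Only the behaviour stated in the text is modelled: the transmission side sends stream A until a request arrives, and the reception side requests the area video that contains the most pixels of the next output image. The class `VideoSender`, the function `area_to_request`, the tuple layout `(x, y, w, h)`, and the coordinates are assumptions for illustration, and the network transport is omitted.

```python
def overlap_pixels(a, b):
    """Shared pixel count of two rectangles given as (x, y, w, h) tuples."""
    width = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    height = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    return max(width, 0) * max(height, 0)


def area_to_request(areas, next_output):
    """Reception side: pick the area video ID to request for the next output image."""
    return max(areas, key=lambda area_id: overlap_pixels(areas[area_id], next_output))


class VideoSender:
    """Transmission side: sends the requested stream, or stream A if nothing was requested."""

    DEFAULT_AREA_ID = "A"

    def __init__(self, streams):
        self.streams = streams      # area video ID -> encoded stream bytes
        self.requested_id = None    # no request received yet

    def receive_request(self, area_id):
        if area_id not in self.streams:
            raise KeyError(area_id)
        self.requested_id = area_id

    def send(self):
        area_id = self.requested_id if self.requested_id is not None else self.DEFAULT_AREA_ID
        return area_id, self.streams[area_id]


# Hypothetical layout and placeholder stream contents.
areas = {"A": (0, 0, 100, 60), "B": (50, 0, 100, 60), "C": (100, 0, 100, 60)}
sender = VideoSender({"A": b"stream-A", "B": b"stream-B", "C": b"stream-C"})

first = sender.send()                                   # no request yet
requested = area_to_request(areas, (110, 10, 80, 40))   # next output image position
sender.receive_request(requested)
second = sender.send()
```

With these assumed coordinates, `first` is `("A", b"stream-A")`, `requested` is `"C"`, and `second` is `("C", b"stream-C")`, matching the sequence of area video A followed by area video C described above.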

Note that the video transmission apparatus shown in FIG. 7 need not prepare the video streams in advance; it may instead cut out and encode the area video at transmission time and transmit the resulting stream. FIG. 9 shows the configuration of the video transmission apparatus in this case. FIG. 9 is a block diagram showing the configuration of a modification of the video transmission apparatus shown in FIG. 7. In FIG. 9, at transmission time the stream transmission unit 201 reads the original video 203, cuts out and encodes the area video, and transmits the resulting stream. In this case, the video transmission apparatus can encode the area video stream and then transmit it.

<Fourth Embodiment>
Next, a video transmission/reception system according to the fourth embodiment of the present invention will be described. The fourth embodiment is a system in which a video transmission apparatus transmits the streams of the multiple area videos requested by a video reception apparatus, out of encoded data in which the streams of three area videos (A, B, and C) defined on a two-dimensional plane are multiplexed with stream IDs, and the video reception apparatus decodes a stream to obtain image information. FIG. 10 is a block diagram showing the configuration of the video transmission apparatus according to the fourth embodiment of the present invention. FIG. 11 is a block diagram showing the configuration of the video reception apparatus according to the fourth embodiment of the present invention. The format of the multiplexed encoded data is the same as that shown in FIG. 5. In FIGS. 10 and 11, parts identical to those of the apparatuses shown in FIGS. 7 and 8 are given the same reference numerals and are described only briefly.

The video transmission apparatus shown in FIG. 10 includes a stream transmission unit 201a that searches the multiplexed encoded data for the stream of a specific area video and transmits it, and an area video position information transmission unit 202 that transmits the area video position information.

The video reception apparatus shown in FIG. 11 includes an area video position information receiving unit 301 that receives the area video position information, a stream receiving unit 302 that receives streams, a read determination unit 311 that determines the stream to be read, a search reading unit 312 that searches the received streams for the stream determined by the read determination unit 311 and reads it, a decoding unit 304 that decodes the stream, an image cutout unit 305 that cuts out part of the decoded image, a cutout position determination unit 306 that determines the cutout position, an output image synthesis unit 307 that combines the area video and the common video and outputs an output image, an output image coordinate designating unit 308 that designates the coordinate position of the next output image, a transmission request determination unit 309 that determines the area videos to request from the area video position information and the coordinate position of the next output image, and a transmission request unit 310 that requests the area videos.

As shown in FIG. 2, the common video is a reduced version of the original video of the area videos. The area video and the common video are assumed to be arranged spatially one above the other, as in FIG. 3(a). The positions of the three area videos, of the output image, and of the next output image on the plane are assumed to be the same as in FIG. 4. The encoded data and the area video position information are assumed to be given to the video transmission apparatus in advance. When no request has arrived from the video reception apparatus, the stream transmission unit 201a of the video transmission apparatus transmits the streams of area video A and area video B. The transmission request determination unit 309 of the video reception apparatus decides to request the two area videos that contain the most pixels of the output image.

Next, with reference to FIGS. 10 and 11, the operation by which the video transmission/reception system exchanges area video streams and decodes an area video to obtain an output image will be described. First, the area video position information transmission unit 202 of the video transmission apparatus transmits the area video position information, and the area video position information receiving unit 301 of the video reception apparatus receives it.

Next, because no request has yet arrived from the video reception apparatus, the stream transmission unit 201a of the video transmission apparatus transmits the streams of area video A and area video B. The stream receiving unit 302 of the video reception apparatus receives the streams from the video transmission apparatus. The read determination unit 311 then uses the position of the output image on the two-dimensional plane and the position information of the area videos to determine which area video contains the largest number of pixels of the output image. In the example shown in FIG. 4, area video B contains the most, so the stream to be read is determined to be that of area video B. The search reading unit 312 then searches the encoded data for start code information and reads the stream ID information that follows it, thereby locating the stream of area video B, and reads that stream.

Next, the decoding unit 304 decodes the stream and outputs a decoded image. The cutout position determination unit 306 determines, from the coordinate position of area video B and the coordinate position of the output image, the position at which the output image is to be cut out of the decoded image. The image cutout unit 305 cuts the image out of the decoded image at that cutout position. The output image synthesis unit 307 extracts the common video from the decoded image and outputs an output image obtained by combining this common video with the image output by the image cutout unit 305. In doing so, it enlarges the common video and superimposes the area video on it to obtain the output image.

Next, the output image coordinate designating unit 308 designates the coordinate position of the next output image. The transmission request determination unit 309 determines area video B and area video C from the coordinate position of the next output image. The transmission request unit 310 transmits the stream IDs of area video B and area video C. The stream transmission unit 201a of the video transmission apparatus receives the stream IDs of area video B and area video C and therefore transmits the streams of area video B and area video C.
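The decision to request two area videos can be sketched as a ranking by overlap. The function `two_best_areas`, the tuple layout `(x, y, w, h)`, and the coordinates are assumptions for illustration; the specification only states that the two area videos containing the most output-image pixels are requested.

```python
def overlap_pixels(a, b):
    """Shared pixel count of two rectangles given as (x, y, w, h) tuples."""
    width = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    height = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    return max(width, 0) * max(height, 0)


def two_best_areas(areas, output):
    """Return the IDs of the two area videos that contain the most output-image pixels."""
    ranked = sorted(areas, key=lambda area_id: overlap_pixels(areas[area_id], output), reverse=True)
    return ranked[:2]


# Hypothetical layout: three overlapping area videos placed side by side.
areas = {"A": (0, 0, 100, 60), "B": (50, 0, 100, 60), "C": (100, 0, 100, 60)}

current_request = two_best_areas(areas, (55, 10, 80, 40))   # current output image
next_request = two_best_areas(areas, (110, 10, 80, 40))     # next output image
shared = set(current_request) & set(next_request)
```

With these assumed coordinates `current_request` is `["B", "A"]`, `next_request` is `["C", "B"]`, and `shared` is `{"B"}`. Because area video B appears in both sets, a usable stream remains available on the reception side even if the output position has not actually moved when the newly requested streams arrive.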

Next, the stream receiving unit 302 of the video reception apparatus receives the streams from the video transmission apparatus, and the read determination unit 311 determines that the stream to be read is that of area video C, because area video C contains the largest number of pixels of the output image. The search reading unit 312 then searches the encoded data for start code information and reads the stream ID information that follows it, thereby locating the stream of area video C, and reads that stream. The decoding unit 304 decodes the stream and outputs a decoded image.

Next, the cutout position determination unit 306 determines, from the coordinate position of area video C and the coordinate position of the output image, the position at which the output image is to be cut out of the decoded image. The image cutout unit 305 cuts the image out of the decoded image at that cutout position. The output image synthesis unit 307 extracts the common video from the decoded image, combines this common video with the image output by the image cutout unit 305, and outputs the output image. In doing so, it enlarges the common video and superimposes the area video on it to obtain the output image. The above processing is repeated each time the position of the output image is designated.

Also, suppose that, between the time the video reception apparatus transmits the stream IDs of area video B and area video C and the time it receives the corresponding streams from the video transmission apparatus, the position of the next output image becomes the same as that of the previous output image (that is, the position of the output image does not change). In that case, for the streams received by the stream receiving unit 302 of the video reception apparatus, the read determination unit 311 determines that the stream to be read is that of area video B, because area video B contains the largest number of pixels of the output image; the search reading unit 312 reads the stream of area video B, and the decoding unit 304 decodes the stream and outputs a decoded image.

Next, the cutout position determination unit 306 determines, from the coordinate position of area video B and the coordinate position of the output image, the position at which the output image is to be cut out of the decoded image. The image cutout unit 305 cuts the image out of the decoded image. The output image synthesis unit 307 extracts the common video from the decoded image, combines this common video with the image output by the image cutout unit 305, and outputs the output image. In doing so, it enlarges the common video and superimposes the area video on it to obtain the output image.

Note that the video transmission apparatus need not prepare the video streams in advance; it may instead cut out and encode the area video at transmission time and transmit the resulting stream.

<Fifth Embodiment>
Next, a video decoding apparatus according to the fifth embodiment of the present invention will be described. The video decoding apparatus of the fifth embodiment decodes one stream out of encoded video data consisting of the streams of three area videos (A, B, and C) defined on a two-dimensional plane of resolution P and the streams of two area videos (D and E) defined on a two-dimensional plane of resolution Q, and obtains image information from it. FIG. 12 is a block diagram showing the configuration of the video decoding apparatus according to the fifth embodiment of the present invention. In FIG. 12, parts identical to those of the apparatus shown in FIG. 1 are given the same reference numerals and are described only briefly.

The video decoding apparatus shown in FIG. 12 includes an area video stream selection unit 100a that selects one of the five area video streams A, B, C, D, and E, a read determination unit 102a that decides which one of the streams of the designated resolution is to be read, a reading unit 101 that reads the determined stream, a decoding unit 103 that decodes the stream, an image cutout unit 104 that cuts out part of the decoded image, a cutout position determination unit 105 that determines the cutout position, and an output image synthesis unit 107 that combines the area video and the common video and outputs an output image.

As shown in FIG. 2, the common video is a reduced version of the original video of the area videos. The area video and the common video are assumed to be arranged spatially one above the other, as in FIG. 3(a). The positions of the area videos and of the output image on the planes of resolution P and resolution Q are assumed to be the same as those shown in FIG. 13. FIG. 13 is an explanatory diagram showing the positional relationship between the area videos and the output image. The position of the output image is assumed to be set on the plane of resolution P. All the video streams and the area video position information are assumed to be given to the video decoding apparatus in advance.

Next, with reference to FIG. 12, the operation by which the video decoding apparatus shown in FIG. 12 decodes an area video to obtain an output image will be described. First, when the user sets resolution P and the position of the output image, the read determination unit 102a uses the position of the output image on the plane of resolution P and the position information of the area videos to determine which area video contains the largest number of pixels of the output image. In the example shown in FIG. 13, area video B contains the most, so the stream to be read is determined to be that of area video B. The reading unit 101 then reads the stream of area video B via the area video stream selection unit 100a, and the decoding unit 103 decodes the stream and outputs a decoded image.
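One way to picture the role of the read determination unit 102a is to restrict the candidates to the plane of the designated resolution before comparing overlaps. The `Area` record, the plane labels, the function `choose_area_on_plane`, and all coordinates below are assumptions for illustration and are not taken from FIG. 13.

```python
from typing import Dict, NamedTuple, Tuple


class Area(NamedTuple):
    plane: str  # label of the resolution plane the area video is defined on ("P" or "Q")
    x: int
    y: int
    w: int
    h: int


def overlap_pixels(area: Area, output: Tuple[int, int, int, int]) -> int:
    """Shared pixel count between an area video and an output rectangle (x, y, w, h)."""
    ox, oy, ow, oh = output
    width = min(area.x + area.w, ox + ow) - max(area.x, ox)
    height = min(area.y + area.h, oy + oh) - max(area.y, oy)
    return max(width, 0) * max(height, 0)


def choose_area_on_plane(areas: Dict[str, Area], plane: str, output: Tuple[int, int, int, int]) -> str:
    """Among the area videos defined on `plane`, return the one containing the most output pixels."""
    candidates = {area_id: a for area_id, a in areas.items() if a.plane == plane}
    if not candidates:
        raise ValueError(f"no area video is defined on plane {plane!r}")
    return max(candidates, key=lambda area_id: overlap_pixels(candidates[area_id], output))


# Hypothetical layout: A, B, C on the plane of resolution P; D, E on the plane of resolution Q.
areas = {
    "A": Area("P", 0, 0, 100, 60),
    "B": Area("P", 50, 0, 100, 60),
    "C": Area("P", 100, 0, 100, 60),
    "D": Area("Q", 0, 0, 200, 120),
    "E": Area("Q", 100, 0, 200, 120),
}
selected_p = choose_area_on_plane(areas, "P", (60, 10, 80, 40))
selected_q = choose_area_on_plane(areas, "Q", (60, 10, 80, 40))
```

With this assumed layout `selected_p` is `"B"` and `selected_q` is `"D"`; streams on the other plane are never considered, whatever their overlap with the output rectangle.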

Next, the cutout position determination unit 105 determines, from the coordinate position of area video B and the coordinate position of the output image, the position at which the output image is to be cut out of the decoded image. The image cutout unit 104 cuts the image out of the decoded image at that cutout position. The output image synthesis unit 107 extracts the common video from the decoded image, combines this common video with the image output by the image cutout unit 104, and outputs the output image. In doing so, it enlarges the common video and superimposes the area video on it to obtain the output image.

Although in this embodiment the area videos are defined on two planes of different resolution, they may instead be defined on two planes that are spatially separated. In that case, the same processing as in this embodiment may be performed.

<Sixth Embodiment>
Next, a video decoding apparatus according to the sixth embodiment of the present invention will be described. The video decoding apparatus of the sixth embodiment reads the stream of a specific area video from encoded data in which the streams of three area videos (A, B, and C) defined on a two-dimensional plane of resolution P and the streams of two area videos (D and E) defined on a two-dimensional plane of resolution Q are multiplexed with stream IDs, and obtains image information from it. FIG. 14 is an explanatory diagram showing the format of the multiplexed encoded data. The stream of each area video is divided into multiple segments at suitable points, and each segment is multiplexed with start code information and stream ID information attached at its head. The start code information is a code word with a unique code sequence, so its position can be found by searching the stream for that code word.

FIG. 15 is a block diagram showing the configuration of the video decoding apparatus according to the sixth embodiment of the present invention. In FIG. 15, parts identical to those of the apparatus shown in FIG. 6 are given the same reference numerals and are described only briefly. The video decoding apparatus shown in FIG. 15 includes: a search/read unit 106 that searches the multiplexed encoded data for the stream of a specific region video and reads it; a read determination unit 102a that determines which region video to read; a decoding unit 103 that decodes the stream; an image cutout unit 104 that cuts out part of the decoded image; a cutout position determination unit 105 that determines the cutout position; and an output image synthesis unit 107 that combines the region video with the common video and outputs the output image.

As shown in FIG. 2, the common video is a reduced version of the original video from which the region videos are taken. The region video and the common video are assumed to be arranged spatially one above the other, as in FIG. 3(a). The positions of the region videos and of the output image on the planes of resolution P and resolution Q are the same as those shown in FIG. 13. The position of the output image is set on the plane of resolution P. All video streams and the region video position information are assumed to be supplied to the video decoding apparatus in advance.

Next, the operation by which the video decoding apparatus shown in FIG. 15 decodes a region video and obtains an output image will be described with reference to FIG. 15. First, when the user sets the resolution P and the position of the output image, the read determination unit 102a uses the position of the output image on the plane of resolution P and the region video position information to determine the region video that contains the largest number of pixels of the output image. In the example shown in FIG. 13, that is region video B, so the stream to be read is determined to be that of region video B. The search/read unit 106 then searches the encoded data for start code information, reads the stream ID information that follows it to locate the stream of region video B, and reads that stream. The decoding unit 103 decodes the stream that was read and outputs a decoded image.
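The read determination step can be sketched as a rectangle-overlap comparison. This is only an illustration: FIG. 13 is not reproduced here, so the coordinates below are hypothetical, as are the names `Rect`, `overlap_pixels`, and `select_region`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: int  # left edge relative to the reference position of the plane
    y: int  # top edge relative to the reference position of the plane
    w: int  # width in pixels
    h: int  # height in pixels


def overlap_pixels(a, b):
    """Number of pixels shared by two axis-aligned rectangles."""
    dx = min(a.x + a.w, b.x + b.w) - max(a.x, b.x)
    dy = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    return max(dx, 0) * max(dy, 0)


def select_region(regions, output):
    """Return the ID of the region video covering the most output pixels."""
    return max(regions, key=lambda name: overlap_pixels(regions[name], output))


# Hypothetical layout on the plane of resolution P (not taken from FIG. 13).
regions_p = {
    "A": Rect(0, 0, 640, 480),
    "B": Rect(480, 0, 640, 480),
    "C": Rect(960, 0, 640, 480),
}
output = Rect(600, 100, 320, 240)
selected = select_region(regions_p, output)
```

With this layout the output image lies entirely inside region video B and only partly inside A, so `selected` is `"B"`.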

Next, the cutout position determination unit 105 determines, based on the coordinate position of region video B and the coordinate position of the output image, the position at which the output image is cut out of the decoded image. The image cutout unit 104 cuts an image out of the decoded image. The output image synthesis unit 107 extracts the common video from the decoded image, combines it with the image output by the image cutout unit 104, and outputs the result as the output image. In doing so, it enlarges the common video and superimposes the region video on it to obtain the output image.
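The cutout and synthesis steps can be sketched on tiny single-channel images held as nested lists. Several details here are assumptions rather than statements from the source: the common video is taken to cover the whole plane at a `1/scale` reduction, enlargement is nearest-neighbour, the output image is assumed to lie inside the plane, and the function names `cutout_position` and `compose` are hypothetical.

```python
def cutout_position(region_pos, output_pos):
    """Offset of the output image inside the decoded region video.

    Both positions are (x, y) relative to the same reference position.
    The offset may be negative when the output starts outside the region.
    """
    return output_pos[0] - region_pos[0], output_pos[1] - region_pos[1]


def compose(region, region_pos, common, scale, output_pos, output_size):
    """Enlarge the common video and superimpose the cut-out region video."""
    off_x, off_y = cutout_position(region_pos, output_pos)
    out_w, out_h = output_size
    region_h, region_w = len(region), len(region[0])
    out = []
    for v in range(out_h):
        row = []
        for u in range(out_w):
            rx, ry = off_x + u, off_y + v
            if 0 <= rx < region_w and 0 <= ry < region_h:
                # Pixel is available in the region video: use it.
                row.append(region[ry][rx])
            else:
                # Otherwise fall back to the enlarged common video
                # (nearest-neighbour enlargement, an assumption).
                px, py = output_pos[0] + u, output_pos[1] + v
                row.append(common[py // scale][px // scale])
        out.append(row)
    return out


# Plane of 8x4 pixels; the common video is the plane reduced by a factor of 2.
region = [[9] * 4 for _ in range(4)]       # region video at (2, 0), 4x4 pixels
common = [[0, 1, 2, 3], [4, 5, 6, 7]]      # 4x2 common video
result = compose(region, (2, 0), common, 2, (4, 1), (4, 2))
```

In this example the left half of the output comes from the region video and the right half, which lies outside the region video, is filled from the enlarged common video.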

<Seventh Embodiment>
Next, a video transmission/reception system according to a seventh embodiment of the present invention will be described. In the seventh embodiment, the video transmission apparatus holds encoded data of video information consisting of three region video streams (A, B, C) for a two-dimensional plane of resolution P and two region video streams (D, E) for a two-dimensional plane of resolution Q, and transmits the one stream requested by the video reception apparatus; the video reception apparatus decodes that stream to obtain image information. FIG. 16 is a block diagram showing the configuration of the video transmission apparatus according to the seventh embodiment of the present invention. FIG. 17 is a block diagram showing the configuration of the video reception apparatus according to the seventh embodiment of the present invention. In FIGS. 16 and 17, parts identical to those of the apparatuses shown in FIGS. 7 and 8 are given the same reference numerals and are described only briefly.

The video transmission apparatus shown in FIG. 16 includes a region video stream selection unit 200a that selects one of the five region video streams A, B, C, D, and E; a stream transmission unit 201 that transmits the one requested stream; and a region video position information transmission unit 202 that transmits the region video position information.

The video reception apparatus shown in FIG. 17 includes: a region video position information reception unit 301 that receives the region video position information; a stream reception unit 302 that receives a stream; a read unit 303 that reads the received stream; a decoding unit 304 that decodes the stream; an image cutout unit 305 that cuts out part of the decoded image; a cutout position determination unit 306 that determines the cutout position; an output image synthesis unit 307 that combines the region video with the common video and outputs the output image; an output image plane designation unit 313 that designates the resolution of the next output image; an output image coordinate designation unit 308 that designates the coordinate position of the next output image; a transmission request determination unit 309 that determines the region video to request from the region video position information and the resolution and coordinate position of the next output image; and a transmission request unit 310 that requests the region video.

As shown in FIG. 2, the common video is a reduced version of the original video from which the region videos are taken. The region video and the common video are assumed to be arranged spatially one above the other, as in FIG. 3(a). The positions of the region videos and of the output image on the planes of resolution P and resolution Q are the same as those shown in FIG. 13. The position of the output image is set on the plane of resolution P, and the position of the next output image is set on the plane of resolution Q. All video streams and the region video position information are assumed to be supplied to the video transmission apparatus in advance. The region video position information gives positions relative to a reference position that differs for each resolution, and consists of resolution information and position information. A region video ID (one of A, B, C, D, and E) distinguishes the region videos: the transmission request unit 310 of the video reception apparatus transmits a region video ID, and the stream transmission unit 201 of the video transmission apparatus receives it. When no request has arrived from the video reception apparatus, the stream transmission unit 201 of the video transmission apparatus transmits the stream of region video A of resolution P.

Next, the operation by which the video transmission apparatus and the video reception apparatus exchange a region video stream, decode the region video, and obtain an output image will be described with reference to FIGS. 16 and 17. First, the region video position information transmission unit 202 of the video transmission apparatus transmits the region video position information, and the region video position information reception unit 301 of the video reception apparatus receives it. Then, because no request has yet arrived from the video reception apparatus, the stream transmission unit 201 of the video transmission apparatus transmits the stream of region video A of resolution P.

Next, the stream reception unit 302 of the video reception apparatus receives the stream from the video transmission apparatus, the read unit 303 reads the received stream, and the decoding unit 304 decodes the stream and outputs a decoded image. The cutout position determination unit 306 determines, based on the coordinate position of region video A of resolution P and the coordinate position of the output image, the position at which the output image is cut out of the decoded image. The image cutout unit 305 cuts an image out of the decoded image based on that cutout position. The output image synthesis unit 307 extracts the common video from the decoded image, combines it with the image output by the image cutout unit 305, and outputs the result as the output image. In doing so, it enlarges the common video and superimposes the region video on it to obtain the output image.

Next, the output image plane designation unit 313 designates resolution Q for the next output image. The output image coordinate designation unit 308 designates the coordinate position of the next output image. The transmission request determination unit 309 determines region video E from the coordinate position of the next output image. The transmission request unit 310 transmits the region video ID of region video E.

Next, because the stream transmission unit 201 of the video transmission apparatus receives the region video ID of region video E, it transmits the stream of region video E of resolution Q. The stream reception unit 302 of the video reception apparatus receives the stream from the video transmission apparatus, the read unit 303 reads the received stream, and the decoding unit 304 decodes the stream and outputs a decoded image. The cutout position determination unit 306 determines, based on the coordinate position of region video E of resolution Q and the coordinate position of the output image, the position at which the output image is cut out of the decoded image. The image cutout unit 305 cuts an image out of the decoded image. The output image synthesis unit 307 extracts the common video from the decoded image, combines it with the image output by the image cutout unit 305, and outputs the result as the output image. In doing so, it enlarges the common video and superimposes the region video on it to obtain the output image. This processing is repeated each time the position of the output image is designated.
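The request handling on the transmission side can be sketched as follows. The source states only that region video A of resolution P is sent while no request has arrived and that the requested region video is sent afterwards; the class name `Sender`, its methods, the byte strings standing in for encoded streams, and the choice to keep sending the last requested stream are assumptions for illustration.

```python
DEFAULT_ID = "A"  # sent while no request has arrived, as the description states


class Sender:
    """Hypothetical stand-in for the stream transmission unit 201."""

    def __init__(self, streams):
        self.streams = streams       # region video ID -> encoded stream
        self.requested_id = None     # no request received yet

    def receive_request(self, region_id):
        """Record the region video ID sent by the reception apparatus."""
        if region_id not in self.streams:
            raise KeyError(region_id)
        self.requested_id = region_id

    def send(self):
        """Return (ID, stream) for the requested region video, else the default."""
        region_id = self.requested_id or DEFAULT_ID
        return region_id, self.streams[region_id]


# Placeholder byte strings stand in for the five encoded region video streams.
sender = Sender({name: f"<stream {name}>".encode() for name in "ABCDE"})
first = sender.send()            # before any request
sender.receive_request("E")      # reception apparatus asks for region video E
second = sender.send()
```

Here `first` carries region video A and `second` carries region video E, mirroring the sequence described above.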

The video transmission apparatus need not prepare the video streams in advance; it may instead cut out and encode the region video at transmission time and transmit the resulting stream. Also, although this embodiment defines the region videos on two planes of different resolutions, they may instead be defined on two planes at spatially separate positions. In that case, the same processing as in this embodiment applies.

<Eighth Embodiment>
Next, a video transmission/reception system according to an eighth embodiment of the present invention will be described. In the eighth embodiment, the video transmission apparatus takes encoded data in which three region video streams (A, B, C) for a two-dimensional plane of resolution P and two region video streams (D, E) for a two-dimensional plane of resolution Q are multiplexed together with stream IDs, and transmits the multiple region video streams requested by the video reception apparatus; the video reception apparatus decodes a stream to obtain image information. The video transmission apparatus according to the eighth embodiment has the same configuration as that shown in FIG. 10, so its description is omitted. FIG. 18 is a block diagram showing the configuration of the video reception apparatus according to the eighth embodiment of the present invention. In FIG. 18, parts identical to those of the apparatuses shown in FIGS. 11 and 17 are given the same reference numerals and are described only briefly. The format of the multiplexed encoded data is the same as in FIG. 14.

The video reception apparatus shown in FIG. 18 includes: a region video position information reception unit 301 that receives the region video position information; a stream reception unit 302 that receives streams; a read determination unit 311 that determines which stream to read; a search/read unit 312 that searches the received streams for the stream determined by the read determination unit 311 and reads it; a decoding unit 304 that decodes the stream; an image cutout unit 305 that cuts out part of the decoded image; a cutout position determination unit 306 that determines the cutout position; an output image synthesis unit 307 that combines the region video with the common video and outputs the output image; an output image plane designation unit 313 that designates the resolution of the next output image; an output image coordinate designation unit 308 that designates the coordinate position of the next output image; a transmission request determination unit 309 that determines the region videos to request from the region video position information and the resolution and coordinate position of the next output image; and a transmission request unit 310 that requests the region videos.

As shown in FIG. 2, the common video is a reduced version of the original video from which the region videos are taken. The region video and the common video are assumed to be arranged spatially one above the other, as in FIG. 3(a). The positions of the region videos and of the output image on the planes of resolution P and resolution Q are the same as those shown in FIG. 13. The position of the output image is set on the plane of resolution P, and the position of the next output image is set on the plane of resolution Q. The encoded data and the region video position information are assumed to be supplied to the video transmission apparatus in advance. The region video position information gives positions relative to a reference position that differs for each resolution, and consists of resolution information and position information. When no request has arrived from the video reception apparatus, the video transmission apparatus transmits the streams of region video A and region video B of resolution P. In the video reception apparatus, the transmission request determination unit 309 decides to request the two region videos that contain the most pixels of the output image.

Next, the operation by which the video transmission apparatus shown in FIG. 10 and the video reception apparatus shown in FIG. 18 exchange region video streams, decode a region video, and obtain an output image will be described with reference to FIGS. 10 and 18. First, the region video position information transmission unit 202 of the video transmission apparatus transmits the region video position information, and the region video position information reception unit 301 of the video reception apparatus receives it. Then, because no request has yet arrived from the video reception apparatus, the stream transmission unit 201a of the video transmission apparatus transmits the streams of region video A and region video B of resolution P.

Next, the stream reception unit 302 of the video reception apparatus receives the streams from the video transmission apparatus. The read determination unit 311 uses the position of the output image on the plane of resolution P and the region video position information to determine the region video that contains the largest number of pixels of the output image. In the example shown in FIG. 13, that is region video B, so the stream to be read is determined to be that of region video B. The search/read unit 312 then searches the encoded data for start code information, reads the stream ID information that follows it to locate the stream of region video B, and reads that stream.

Next, the decoding unit 304 decodes the stream and outputs a decoded image. The cutout position determination unit 306 determines, based on the coordinate position of region video B and the coordinate position of the output image, the position at which the output image is cut out of the decoded image. The image cutout unit 305 cuts an image out of the decoded image based on that cutout position. The output image synthesis unit 307 extracts the common video from the decoded image, combines it with the image output by the image cutout unit 305, and outputs the result as the output image. In doing so, it enlarges the common video and superimposes the region video on it to obtain the output image.

Next, the output image plane designation unit 313 designates resolution Q for the next output image. The output image coordinate designation unit 308 designates the coordinate position of the next output image. The transmission request determination unit 309 determines region video D and region video E from the coordinate position of the next output image. The transmission request unit 310 transmits the stream IDs of region video D and region video E. The stream transmission unit 201a of the video transmission apparatus then receives the stream IDs of region video D and region video E, and so transmits the streams of region video D and region video E.
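The rule that the reception apparatus requests the two region videos containing the most output pixels can be sketched as a ranking over the region videos of the designated plane. The layout, the tuple representation, and the names `overlap` and `streams_to_request` are hypothetical; the coordinates are not taken from FIG. 13.

```python
def overlap(a, b):
    """Overlapping pixel count of two (x, y, w, h) rectangles."""
    dx = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    dy = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    return max(dx, 0) * max(dy, 0)


def streams_to_request(regions, plane, output, count=2):
    """IDs of the `count` region videos on `plane` with the most output pixels."""
    candidates = [rid for rid, (p, _) in regions.items() if p == plane]
    candidates.sort(key=lambda rid: overlap(regions[rid][1], output), reverse=True)
    return candidates[:count]


# Hypothetical region video position information: ID -> (plane, rectangle).
REGIONS = {
    "A": ("P", (0, 0, 640, 480)),
    "B": ("P", (480, 0, 640, 480)),
    "C": ("P", (960, 0, 640, 480)),
    "D": ("Q", (0, 0, 320, 240)),
    "E": ("Q", (240, 0, 320, 240)),
}

request_q = streams_to_request(REGIONS, "Q", (260, 50, 160, 120))
request_p = streams_to_request(REGIONS, "P", (600, 100, 320, 240))
to_decode = request_q[0]  # the read determination picks the best of those received
```

Requesting two streams while decoding only the best one keeps a neighbouring region video on hand, at the cost of extra transmission bandwidth.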

Next, the stream reception unit 302 of the video reception apparatus receives the streams from the video transmission apparatus. Because, at resolution Q, region video E contains the largest number of pixels of the output image, the read determination unit 311 determines that the stream to be read is that of region video E. The search/read unit 312 then searches the encoded data for start code information, reads the stream ID information that follows it to locate the stream of region video E, and reads that stream. The decoding unit 304 decodes the stream and outputs a decoded image. The cutout position determination unit 306 determines, based on the coordinate position of region video E and the coordinate position of the output image, the position at which the output image is cut out of the decoded image. The image cutout unit 305 cuts an image out of the decoded image based on that cutout position. The output image synthesis unit 307 extracts the common video from the decoded image, combines it with the image output by the image cutout unit 305, and outputs the result as the output image. In doing so, it enlarges the common video and superimposes the region video on it to obtain the output image. This processing is repeated each time the position of the output image is designated.

Although this embodiment defines the region videos on two planes of different resolutions, they may instead be defined on two planes at spatially separate positions. In that case, the same processing as in this embodiment applies. Also, the video transmission apparatus need not prepare the video streams in advance; it may instead cut out and encode the region video at transmission time and transmit the resulting stream.

In the description above, the region video and the common video are arranged spatially one above the other, as in FIG. 3(a), but they may instead be arranged side by side, as in FIG. 3(b). In either case the relative order of the region video and the common video may be reversed vertically or horizontally. When they are arranged spatially, any area left uncovered by the common video or the region video may be filled with a suitable image, and the fill may always be an image of a single constant color.

FIG. 19 is an explanatory diagram showing an example of a temporal arrangement of the common video and the region video. The region video and the common video may be arranged in time, as shown in FIG. 19. The processing that yields the output image in this case will be described with reference to the video decoding apparatus shown in FIG. 1. First, when the user sets the position of the output image, the read determination unit 102 uses the position of the output image on the two-dimensional plane and the region video position information to determine the region video that contains the largest number of pixels of the output image. In the example shown in FIG. 4, that is region video B, so the stream to be read is determined to be that of region video B. The read unit 101 then reads the stream of region video B, and the decoding unit 103 decodes the stream and outputs a decoded image. The cutout position determination unit 105 determines, based on the coordinate position of region video B and the coordinate position of the output image, the position at which the output image is cut out of the decoded image. The image cutout unit 104 cuts an image out of the decoded image.

The read unit 101 then reads the stream of region video B again, and the decoding unit 103 decodes the stream to obtain a decoded image. This time the decoded image is the common video. The output image synthesis unit 107 combines the image obtained by the image cutout unit 104 with the common video and outputs the result as the output image. In doing so, it enlarges the common video and superimposes the region video on it to obtain the output image. In other words, the region video and the common video are both obtained by performing twice the process in which the read unit 101 reads the video stream and the decoding unit 103 decodes it. This processing can be applied in the same way to the other embodiments described above.

As described above, when only a partial region of a video is viewed, even on a computer with low processing power that can decode only one video stream, the video stream can be decoded and cut out so that the number of displayed pixels is maximized for the output image position specified by the user. The cutout image is then combined with the common video, which does not depend on the position of the region video, to obtain the output image. Because the combined result is used as the output image, when the viewer changes the viewing position during playback and it moves beyond the region video, the video prepared as the common video can be displayed there.

The image information may include not only color signals but also information that can be handled as grayscale images, such as transparency information and depth information. In that case, the video stream contains transparency information and depth information in addition to the color signals, and the decoding unit decodes all of them. The image information cut out by the image cutout unit then also contains transparency information and depth information, as does the image information obtained as the common video. The video decoding apparatus or the video reception apparatus may also process the color signals using the transparency information or depth information before display. For example, when depth information is available, the video may be displayed as a stereoscopic video or as a video from a different viewpoint.

A program for realizing the functions of the apparatuses shown in FIGS. 1, 6, 7, 8, 9, 10, 11, 12, 15, 16, 17, and 18 may be recorded on a computer-readable recording medium, and the video decoding process, the video transmission process, and the video reception process may be performed by having a computer system read and execute the program recorded on that medium. The term "computer system" here includes an OS and hardware such as peripheral devices. The "computer system" also includes a WWW system equipped with a web page providing environment (or display environment). The term "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into a computer system. The "computer-readable recording medium" further includes media that hold the program for a certain period of time, such as the volatile memory (RAM) inside a computer system that acts as a server or client when the program is transmitted over a network such as the Internet or over a communication line such as a telephone line.

The program may be transmitted from a computer system that stores it in a storage device or the like to another computer system via a transmission medium, or by a transmission wave in the transmission medium. Here, the "transmission medium" that transmits the program refers to a medium having the function of transmitting information, such as a network (communication network) like the Internet or a communication line such as a telephone line. The program may realize only a part of the functions described above. It may also be a so-called difference file (difference program), which realizes the functions described above in combination with a program already recorded in the computer system.

Embodiments of the present invention have been described above with reference to the drawings, but these embodiments are merely examples of the present invention, and it is clear that the present invention is not limited to them. Accordingly, components may be added, omitted, replaced, or otherwise changed without departing from the technical idea and scope of the present invention.

The present invention is applicable to uses in which, when only a partial region of a video is viewed, it is essential to avoid the situation where no image can be displayed because the image to be displayed does not exist when the user's operation moves the viewing region by a large amount.

Description of reference signs: 100 ... region video stream selection unit, 101 ... reading unit, 102 ... read determination unit, 103 ... decoding unit, 104 ... image cutout unit, 105 ... cutout position determination unit, 106 ... search reading unit, 107 ... output image synthesis unit, 200 ... region video stream selection unit, 201 ... stream transmission unit, 202 ... region video position information transmission unit, 301 ... region video position information receiving unit, 302 ... stream receiving unit, 303 ... reading unit, 304 ... decoding unit, 305 ... image cutout unit, 306 ... cutout position determination unit, 307 ... output image synthesis unit, 308 ... output image coordinate designation unit, 309 ... transmission request determination unit, 310 ... transmission request unit, 311 ... read determination unit, 312 ... search reading unit, 313 ... output image plane designation unit

Claims

A video decoding device that receives area video position information indicating N (N is a natural number) different coordinate positions with respect to a reference position on a plane expressed in two dimensions, and N video streams obtained by encoding an area video corresponding to the area video position information and a common video, and that decodes a part of the video streams to obtain an output image, the video decoding device comprising:
A read determination unit that determines, from the coordinate position of the output image with respect to the reference position and the area video position information, to read the i-th (1 ≦ i ≦ N) video stream from which the largest number of pixels of the output image is obtained;
A reading unit that reads the i-th video stream;
A decoding unit that decodes the video stream read by the reading unit;
A cutout position determination unit that determines, based on the i-th area video position information, the cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same;
An image cutout unit that cuts out a part of the area video in the decoded image obtained by the decoding unit, based on the cutout position determined by the cutout position determination unit; and
An output image synthesis unit that synthesizes the common video in the decoded image obtained by the decoding unit with the image cut out by the image cutout unit and outputs the result as the output image.
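The read determination and cutout position determination recited above can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes that each area video and the output image are axis-aligned rectangles given as (x, y, width, height) relative to the reference position, and the names `overlap_pixels`, `choose_stream`, and `cutout_position` are hypothetical.

```python
from typing import List, Tuple

# (x, y, width, height) relative to the reference position (assumed layout)
Rect = Tuple[int, int, int, int]


def overlap_pixels(a: Rect, b: Rect) -> int:
    """Number of pixels shared by two axis-aligned rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    w = min(ax + aw, bx + bw) - max(ax, bx)
    h = min(ay + ah, by + bh) - max(ay, by)
    return max(w, 0) * max(h, 0)


def choose_stream(output: Rect, regions: List[Rect]) -> int:
    """Return the 1-based index i of the area video covering the most output pixels."""
    best = max(range(len(regions)), key=lambda k: overlap_pixels(output, regions[k]))
    return best + 1


def cutout_position(output: Rect, region: Rect) -> Rect:
    """Part of the output that lies inside the region, in the region's own pixel coordinates."""
    ox, oy, ow, oh = output
    rx, ry, rw, rh = region
    left, top = max(ox, rx), max(oy, ry)
    right, bottom = min(ox + ow, rx + rw), min(oy + oh, ry + rh)
    return (left - rx, top - ry, max(right - left, 0), max(bottom - top, 0))
```

When two area videos cover the same number of output pixels, this sketch picks the one with the lower index; the claims do not specify a tie-breaking rule.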

A video decoding device that receives area video position information indicating N (N is a natural number) different coordinate positions with respect to a reference position on a plane expressed in two dimensions, and encoded data obtained by multiplexing and encoding N video streams, each composed of an area video corresponding to the area video position information and a common video, together with ID information for distinguishing the video streams, and that decodes a part of the encoded data to obtain an output image, the video decoding device comprising:
A read determination unit that determines, from the coordinate position of the output image with respect to the reference position and the area video position information, to read the i-th (1 ≦ i ≦ N) video stream from which the largest number of pixels of the output image is obtained;
A search reading unit that searches the encoded data for the i-th video stream using the ID information and reads it;
A decoding unit that decodes the video stream read by the search reading unit;
A cutout position determination unit that determines, based on the i-th area video position information, the cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same;
An image cutout unit that cuts out a part of the area video in the decoded image obtained by the decoding unit, based on the cutout position determined by the cutout position determination unit; and
An output image synthesis unit that synthesizes the common video in the decoded image obtained by the decoding unit with the image cut out by the image cutout unit and outputs the result as the output image.
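The search reading unit locates one video stream inside multiplexed encoded data by its ID information. The claims do not define the multiplexing format, so the sketch below assumes a purely hypothetical container in which each stream is stored as a 1-byte ID, a 4-byte big-endian payload length, and then the payload. The names `mux` and `search_read` are likewise illustrative.

```python
import struct
from typing import Dict, Optional

HEADER = ">BI"  # assumed record header: 1-byte stream ID, 4-byte big-endian length
HEADER_SIZE = struct.calcsize(HEADER)


def mux(streams: Dict[int, bytes]) -> bytes:
    """Concatenate streams into one byte string, each prefixed by its ID and length."""
    out = bytearray()
    for stream_id, payload in streams.items():
        out += struct.pack(HEADER, stream_id, len(payload)) + payload
    return bytes(out)


def search_read(data: bytes, wanted_id: int) -> Optional[bytes]:
    """Walk the records and return the payload whose ID matches, or None if absent."""
    pos = 0
    while pos + HEADER_SIZE <= len(data):
        stream_id, length = struct.unpack_from(HEADER, data, pos)
        pos += HEADER_SIZE
        if stream_id == wanted_id:
            return data[pos:pos + length]
        pos += length  # skip the payload of a non-matching stream without decoding it
    return None
```

Only the selected payload is handed to the decoder; the other streams are skipped by length, which is the point of carrying ID information alongside the streams.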

A video transmission / reception system comprising a video transmission device and a video reception device,
The video transmission device comprising:
An area video position information transmission unit that transmits area video position information, which is the coordinate positions of N (N is a natural number) partial area videos with respect to a reference position of an original video defined on a plane expressed in two dimensions; and
A stream transmission unit that transmits a requested i-th (1 ≦ i ≦ N) video stream among N video streams obtained by encoding the area video and the common video;
The video reception device comprising:
An area video position information receiving unit that receives the area video position information;
A stream receiving unit that receives the transmitted video stream;
A reading unit that reads the received video stream;
A decoding unit that decodes the video stream read by the reading unit;
A cutout position determination unit that determines, based on the i-th area video position information, the cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same;
An image cutout unit that cuts out a part of the area video in the decoded image obtained by the decoding unit, based on the cutout position determined by the cutout position determination unit;
An output image synthesis unit that synthesizes the common video in the decoded image obtained by the decoding unit with the image cut out by the image cutout unit and outputs the result as the output image;
An output image coordinate designation unit that designates the coordinate position of the next output image; and
A transmission request unit that requests, from the coordinate position of the next output image and the area video position information, transmission of the j-th (1 ≦ j ≦ N) video stream from which the largest number of pixels of the next output image is obtained.

A video transmission / reception system comprising a video transmission device and a video reception device,
The video transmission device comprising:
An area video position information transmission unit that transmits area video position information, which is the coordinate positions of N (N is a natural number) partial area videos with respect to a reference position of an original video defined on a plane expressed in two dimensions; and
A stream transmission unit that extracts and transmits M (M ≦ N) pieces of encoded data from encoded data obtained by multiplexing and encoding N video streams, each composed of the area video and the common video, together with ID information for distinguishing the video streams;
The video reception device comprising:
An area video position information receiving unit that receives the area video position information;
A stream receiving unit that receives the encoded data;
A read determination unit that determines, from the coordinate position of the output image with respect to the reference position and the area video position information, to read the i-th (1 ≦ i ≦ M) video stream from which the largest number of pixels of the output image is obtained;
A search reading unit that searches the encoded data for the i-th video stream using the ID information and reads it;
A decoding unit that decodes the video stream read by the search reading unit;
A cutout position determination unit that determines, based on the i-th area video position information, the cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same;
An image cutout unit that cuts out a part of the area video in the decoded image obtained by the decoding unit, based on the cutout position determined by the cutout position determination unit;
An output image synthesis unit that synthesizes the common video in the decoded image obtained by the decoding unit with the image cut out by the image cutout unit and outputs the result as the output image;
An output image coordinate designation unit that designates the coordinate position of the next output image; and
A transmission request unit that requests, from the coordinate position of the next output image and the area video position information, transmission of M video streams so that the largest number of pixels of the next output image is obtained.
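One plausible reading of a transmission request for M video streams is to rank the area videos by how many pixels of the next output image each one covers and to request the top M. The sketch below makes that assumption explicit; the rectangle layout (x, y, width, height) and the name `streams_to_request` are hypothetical, and the claims do not state how ties are resolved.

```python
from typing import List, Tuple

# (x, y, width, height) relative to the reference position (assumed layout)
Rect = Tuple[int, int, int, int]


def overlap_pixels(a: Rect, b: Rect) -> int:
    """Number of pixels shared by two axis-aligned rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    w = min(ax + aw, bx + bw) - max(ax, bx)
    h = min(ay + ah, by + bh) - max(ay, by)
    return max(w, 0) * max(h, 0)


def streams_to_request(next_output: Rect, regions: List[Rect], m: int) -> List[int]:
    """Return the 1-based indices of the m area videos covering the most pixels."""
    ranked = sorted(
        range(len(regions)),
        key=lambda k: overlap_pixels(next_output, regions[k]),
        reverse=True,  # largest overlap first; equal overlaps keep index order
    )
    return [k + 1 for k in ranked[:m]]
```

With M = 1 this reduces to requesting the single j-th stream with the largest coverage.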

An area image position information transmitting unit for transmitting area image position information, which is the coordinate position of N (N is a natural number) partial area images with respect to a reference position of an original image defined on a plane expressed in two dimensions;
A video transmission device comprising: a stream transmission unit that transmits a video stream obtained by encoding an i-th (1 ≦ i ≦ N) region video that is a part of the original video and a common video;
An area image position information receiving unit for receiving the area image position information;
A stream receiver for receiving the video stream;
A reading unit for reading the received video stream;
A decoding unit for decoding the video stream read by the reading unit;
A cutout position determination unit that determines the cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same based on the i-th region video position information;
Based on the cutout position of the image determined by the cutout position determination unit, an image cutout unit that cuts out a part of the area video in the decoded image obtained in the decoding unit;
An output image combining unit that combines the common video of the decoded images obtained in the decoding unit and the image cut out in the image cutout unit to output an output image;
An output image coordinate designating unit for designating the coordinate position of the next output image;
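A transmission requesting unit that requests, from the coordinate position of the next output image and the region video position information, transmission of the j-th (1 ≦ j ≦ N) video stream from which the largest number of pixels of the next output image is obtained, the units from the area image position information receiving unit through the transmission requesting unit being comprised in a video reception device, and the video transmission device and the video reception device constituting a video transmission / reception system.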

An area image position information transmitting unit for transmitting area image position information, which is the coordinate position of N (N is a natural number) partial area images with respect to a reference position of an original image defined on a plane expressed in two dimensions;
A stream transmission unit that transmits encoded data obtained by multiplexing and encoding N video streams, composed of M (M ≦ N) area videos that are parts of the original video and the common video, together with ID information for distinguishing the video streams, the two transmission units being comprised in a video transmission device;
An area image position information receiving unit for receiving the area image position information;
A stream receiver for receiving the encoded data;
A read determination unit that determines, from the coordinate position of the output image with respect to the reference position and the region video position information, to read the i-th (1 ≦ i ≦ M) video stream from which the largest number of pixels of the output image is obtained;
A search readout unit that searches for and reads out the i-th video stream from the ID information in the encoded data;
A decoding unit for decoding the video stream read by the search reading unit;
A cutout position determination unit that determines the cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same based on the i-th region video position information;
Based on the cutout position of the image determined by the cutout position determination unit, an image cutout unit that cuts out a part of the area video in the decoded image obtained in the decoding unit;
An output image combining unit that combines the common video of the decoded images obtained in the decoding unit and the image cut out in the image cutout unit to output an output image;
An output image coordinate designating unit for designating the coordinate position of the next output image;
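A transmission requesting unit that requests, from the coordinate position of the next output image and the region video position information, transmission of M video streams so that the largest number of pixels of the next output image is obtained, the units from the area image position information receiving unit through the transmission requesting unit being comprised in a video receiving device, and the video transmission device and the video receiving device constituting a video transmission / reception system.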

A video decoding apparatus that receives, for L (L ≧ 2) planes expressed in two dimensions, region video position information indicating K (1 ≦ K ≦ L) different coordinate positions with respect to a reference position on each plane, and L video streams obtained by encoding a region video corresponding to the region video position information and a common video, and that decodes a part of the video streams to obtain an output image,
A read determination unit that determines, from the coordinate position of the output image with respect to the reference position on the plane C (1 ≦ C ≦ L) from which the output image is cut out and the region video position information, to read the i-th (1 ≦ i ≦ C) video stream;
A reading unit for reading the i-th video stream;
A decoding unit for decoding the video stream read by the reading unit;
A cutout position determination unit that determines the cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same based on the i-th region video position information;
Based on the cutout position of the image determined by the cutout position determination unit, an image cutout unit that cuts out a part of the area video in the decoded image obtained in the decoding unit;
A video decoding apparatus comprising: an output image synthesis unit that outputs an output image by synthesizing the common video among the decoded images obtained in the decoding unit and the image cut out in the image cutout unit.
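When several planes exist, the read determination first restricts the candidates to the plane from which the output image is cut out. The sketch below assumes, for illustration only, that the region video position information is held as a dictionary from plane number to a list of (x, y, width, height) rectangles, and that the stream with the largest pixel coverage on that plane is chosen (the selection criterion is borrowed from the single-plane claims). The name `choose_stream_on_plane` is hypothetical.

```python
from typing import Dict, List, Tuple

# (x, y, width, height) relative to the plane's reference position (assumed layout)
Rect = Tuple[int, int, int, int]


def overlap_pixels(a: Rect, b: Rect) -> int:
    """Number of pixels shared by two axis-aligned rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    w = min(ax + aw, bx + bw) - max(ax, bx)
    h = min(ay + ah, by + bh) - max(ay, by)
    return max(w, 0) * max(h, 0)


def choose_stream_on_plane(
    plane: int, output: Rect, regions_by_plane: Dict[int, List[Rect]]
) -> Tuple[int, int]:
    """Return (plane, 1-based stream index) for the best-covering region on that plane."""
    regions = regions_by_plane[plane]  # only regions on the selected plane compete
    best = max(range(len(regions)), key=lambda k: overlap_pixels(output, regions[k]))
    return plane, best + 1
```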

A video decoding device that receives, for L (L ≧ 2) planes expressed in two dimensions, region video position information indicating K (1 ≦ K ≦ L) different coordinate positions with respect to a reference position on each plane, and encoded data obtained by multiplexing and encoding N video streams, composed of the region video corresponding to the region video position information and the common video, together with ID information for distinguishing the video streams, and that decodes a part of the encoded data to obtain an output image,
A read determination unit that determines, from the coordinate position of the output image with respect to the reference position on the plane C (1 ≦ C ≦ L) from which the output image is cut out and the region video position information, to read the i-th (1 ≦ i ≦ C) video stream;
A search and read unit that searches for and reads the i-th video stream from the ID information in the encoded data;
A decoding unit for decoding the video stream read by the search reading unit;
A cutout position determination unit that determines the cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same based on the i-th region video position information;
Based on the cutout position of the image determined by the cutout position determination unit, an image cutout unit that cuts out a part of the area video in the decoded image obtained in the decoding unit;
A video decoding apparatus comprising: an output image synthesis unit that outputs an output image by synthesizing the common video among the decoded images obtained in the decoding unit and the image cut out in the image cutout unit.

An area image position information transmission unit that transmits area image position information, which is all the coordinate positions of K (1 ≦ K ≦ L) partial area images with respect to a reference position defined on each of L (L ≧ 2) planes expressed in two dimensions;
A stream transmission unit that transmits the requested i-th (1 ≦ i ≦ K) video stream among K video streams obtained by encoding the region video and the common video on the plane C (1 ≦ C ≦ L) from which the output image is cut out, the two transmission units being comprised in a video transmission device;
An area image position information receiving unit for receiving the area image position information;
A stream receiver for receiving the transmitted video stream;
A reading unit for reading the received video stream;
A decoding unit for decoding the video stream read by the reading unit;
A cutout position determination unit that determines the cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same based on the i-th region video position information;
Based on the cutout position of the image determined by the cutout position determination unit, an image cutout unit that cuts out a part of the area video in the decoded image obtained in the decoding unit;
An output image combining unit that combines the common video of the decoded images obtained in the decoding unit and the image cut out in the image cutout unit to output an output image;
An output image plane designating unit for designating a plane D (1 ≦ D ≦ L) for cutting out the next output image;
An output image coordinate designating unit for designating the coordinate position of the next output image;
A transmission request unit that requests, from the coordinate position of the next output image with respect to the reference position on the plane D from which the output image is cut out and the region video position information, transmission of the j-th (1 ≦ j ≦ D) video stream from which the largest number of pixels of the next output image is obtained, the units from the area image position information receiving unit through the transmission request unit being comprised in a video reception device, and the video transmission device and the video reception device constituting a video transmission / reception system.

An area image position information transmission unit that transmits area image position information, which is all the coordinate positions of K (1 ≦ K ≦ L) partial area images with respect to a reference position defined on each of L (L ≧ 2) planes expressed in two dimensions;
A stream transmission unit that extracts and transmits M (M ≦ K) pieces of encoded data from encoded data obtained by multiplexing and encoding N video streams, composed of the area video and the common video, together with ID information for distinguishing the video streams, the two transmission units being comprised in a video transmission device;
An area image position information receiving unit for receiving the area image position information;
A stream receiver for receiving the encoded data;
A read determination unit that determines, from the coordinate position of the output image with respect to the reference position on the plane C (1 ≦ C ≦ L) from which the output image is cut out and the region video position information, to read the i-th (1 ≦ i ≦ M) video stream;
A search readout unit that searches for and reads out the i-th video stream from the ID information in the encoded data;
A decoding unit for decoding the video stream read by the search reading unit;
A cutout position determination unit that determines the cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same based on the i-th region video position information;
Based on the cutout position of the image determined by the cutout position determination unit, an image cutout unit that cuts out a part of the area video in the decoded image obtained in the decoding unit;
An output image combining unit that combines the common video of the decoded images obtained in the decoding unit and the image cut out in the image cutout unit to output an output image;
An output image plane designating unit for designating a plane D (1 ≦ D ≦ L) for cutting out the next output image;
An output image coordinate designating unit for designating the coordinate position of the next output image;
A transmission requesting unit that requests, from the coordinate position of the next output image with respect to the reference position on the plane D designated by the output image plane designating unit and the region video position information, transmission of M video streams so that the largest number of pixels of the next output image is obtained, the units from the area image position information receiving unit through the transmission requesting unit being comprised in a video receiving device, and the video transmission device and the video receiving device constituting a video transmission / reception system.

An area image position information transmission unit that transmits area image position information, which is all the coordinate positions of K (1 ≦ K ≦ L) partial area images with respect to a reference position defined on each of L (L ≧ 2) planes expressed in two dimensions;
A stream transmission unit that transmits a video stream obtained by encoding an i-th (1 ≦ i ≦ K) region video, which is a part of an original video on one or more planes C (1 ≦ C ≦ L), and a common video, the two transmission units being comprised in a video transmission device;
An area image position information receiving unit for receiving the area image position information;
A stream receiver for receiving the video stream;
A reading unit for reading the received video stream;
A decoding unit for decoding the video stream read by the reading unit;
A cutout position determination unit that determines the cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same based on the i-th region video position information;
Based on the cutout position of the image determined by the cutout position determination unit, an image cutout unit that cuts out a part of the area video in the decoded image obtained in the decoding unit;
An output image combining unit that combines the common video of the decoded images obtained in the decoding unit and the image cut out in the image cutout unit to output an output image;
An output image plane designating unit for designating a plane D (1 ≦ D ≦ L) for cutting out the next output image;
An output image coordinate designating unit for designating the coordinate position of the next output image;
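A transmission requesting unit that requests, from the coordinate position of the next output image with respect to the reference position on the plane D designated by the output image plane designating unit and the region video position information, transmission of the j-th (1 ≦ j ≦ M) video stream from which the largest number of pixels of the next output image is obtained, the units from the area image position information receiving unit through the transmission requesting unit being comprised in a video receiving device, and the video transmission device and the video receiving device constituting a video transmission / reception system.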

An area image position information transmission unit that transmits area image position information, which is all the coordinate positions of K (1 ≦ K ≦ L) partial area images with respect to a reference position defined on each of L (L ≧ 2) planes expressed in two dimensions;
A stream transmission unit that transmits encoded data obtained by multiplexing and encoding N video streams, composed of M (M ≦ K) area videos that are parts of the original video on one or more planes C (1 ≦ C ≦ L) and the common video, together with ID information for distinguishing the video streams, the two transmission units being comprised in a video transmission device;
An area image position information receiving unit for receiving the area image position information;
A stream receiver for receiving the video stream;
A read determination unit that determines, from the coordinate position of the output image with respect to the reference position on the plane C (1 ≦ C ≦ L) from which the output image is cut out and the region video position information, to read the i-th (1 ≦ i ≦ M) video stream;
A search readout unit that searches for and reads out the i-th video stream from the ID information in the encoded data;
A decoding unit for decoding the video stream read by the search reading unit;
A cutout position determination unit that determines the cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same based on the i-th region video position information;
Based on the cutout position of the image determined by the cutout position determination unit, an image cutout unit that cuts out a part of the area video in the decoded image obtained in the decoding unit;
An output image combining unit that combines the common video of the decoded images obtained in the decoding unit and the image cut out in the image cutout unit to output an output image;
An output image plane designating unit for designating a plane D (1 ≦ D ≦ L) for cutting out the next output image;
An output image coordinate designating unit for designating the coordinate position of the next output image;
A transmission requesting unit that requests, from the coordinate position of the next output image with respect to the reference position on the plane D designated by the output image plane designating unit and the region video position information, transmission of M video streams so that the largest number of pixels of the next output image is obtained, the units from the area image position information receiving unit through the transmission requesting unit being comprised in a video receiving device, and the video transmission device and the video receiving device constituting a video transmission / reception system.

A video decoding method in which area video position information indicating N (N is a natural number) different coordinate positions with respect to a reference position on a plane expressed in two dimensions, and N video streams obtained by encoding an area video corresponding to the area video position information and a common video, are input, and a part of the video streams is decoded to obtain an output image,
A read determination step of determining, from the coordinate position of the output image with respect to the reference position and the region video position information, to read the i-th (1 ≦ i ≦ N) video stream from which the largest number of pixels of the output image is obtained;
A reading step of reading the i-th video stream;
A decoding step of decoding the video stream read in the reading step;
A cutout position determination step for determining a cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same based on the i-th region video position information;
Based on the cutout position of the image determined by the cutout position determination step, an image cutout step of cutting out a part of the area video in the decoded image obtained in the decoding step;
A video decoding method comprising: an output image synthesis step of synthesizing the common video of the decoded images obtained in the decoding step and the image cut out in the image cutout step to output an output image.

A video decoding method in which area video position information indicating N (N is a natural number) different coordinate positions with respect to a reference position on a plane expressed in two dimensions, and encoded data obtained by multiplexing and encoding N video streams, composed of an area video corresponding to the area video position information and a common video, together with ID information for distinguishing the video streams, are input, and a part of the encoded data is decoded to obtain an output image,
A read determination step of determining, from the coordinate position of the output image with respect to the reference position and the region video position information, to read the i-th (1 ≦ i ≦ N) video stream from which the largest number of pixels of the output image is obtained;
A search and read step of searching for and reading out the i-th video stream from the ID information in the encoded data;
A decoding step of decoding the video stream read in the search reading step;
A cutout position determination step for determining a cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same based on the i-th region video position information;
Based on the cutout position of the image determined by the cutout position determination step, an image cutout step of cutting out a part of the area video in the decoded image obtained in the decoding step;
A video decoding method comprising: an output image synthesis step of synthesizing the common video of the decoded images obtained in the decoding step and the image cut out in the image cutout step to output an output image.

A video transmission / reception method performed by a video transmission / reception system comprising a video transmission device and a video reception device,
An area image position information transmission step for transmitting area image position information, which is the coordinate position of N (N is a natural number) partial area images with respect to a reference position of an original image defined on a plane expressed in two dimensions;
A stream transmission step of transmitting a requested i-th (1 ≦ i ≦ N) video stream among N video streams obtained by encoding the region video and the common video;
An area image position information receiving step for receiving the area image position information;
A stream receiving step of receiving the transmitted video stream;
A reading step of reading the received video stream;
A decoding step of decoding the video stream read in the reading step;
A cutout position determination step for determining a cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same based on the i-th region video position information;
Based on the cutout position of the image determined by the cutout position determination step, an image cutout step of cutting out a part of the area video in the decoded image obtained in the decoding step;
An output image synthesis step of synthesizing the common video of the decoded images obtained in the decoding step and the image cut out in the image cutout step to output an output image;
An output image coordinate designation step for designating the coordinate position of the next output image;
A video transmission / reception method comprising: a transmission requesting step of requesting transmission of the j-th (1 ≦ j ≦ N) video stream from which the largest number of pixels of the next output image is obtained, based on the coordinate position of the next output image and the region video position information.
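
The selection of the stream that yields the most pixels of the next output image might be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: each region and the output image are modeled as axis-aligned rectangles `(x, y, width, height)` relative to the reference position, and the width and height are assumptions, since the claims only mention coordinate positions. The names `overlap_pixels` and `select_stream` are hypothetical.

```python
def overlap_pixels(output_rect, region_rect):
    """Number of output-image pixels covered by a region video."""
    ox, oy, ow, oh = output_rect
    rx, ry, rw, rh = region_rect
    w = min(ox + ow, rx + rw) - max(ox, rx)
    h = min(oy + oh, ry + rh) - max(oy, ry)
    return max(w, 0) * max(h, 0)


def select_stream(output_rect, region_rects):
    """Return the 1-based index j of the region covering the most pixels."""
    best = max(range(len(region_rects)),
               key=lambda k: overlap_pixels(output_rect, region_rects[k]))
    return best + 1


# Three hypothetical, horizontally overlapping regions (N = 3).
regions = [(0, 0, 100, 100), (50, 0, 100, 100), (100, 0, 100, 100)]
next_output = (60, 10, 80, 50)
j = select_stream(next_output, regions)
```

Here the second region fully contains the next output image, so it is the one whose stream would be requested.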

A video transmission / reception method performed by a video transmission / reception system comprising a video transmission device and a video reception device,
An area image position information transmission step of transmitting area image position information, which is the coordinate positions of N (N is a natural number) partial area images with respect to a reference position of an original image defined on a plane expressed in two dimensions;
A stream transmission step of extracting and transmitting M (M ≦ N) pieces of encoded data from the encoded data obtained by multiplexing and encoding N video streams, composed of the area video and the common video, together with ID information for distinguishing the video streams;
An area image position information receiving step for receiving the area image position information;
A stream receiving step for receiving the encoded data;
A read determination step of determining to read the i-th (1 ≦ i ≦ M) video stream so as to obtain the largest number of pixels of the output image, based on the coordinate position of the output image with respect to the reference position and the region video position information;
A search and read step of searching for and reading out the i-th video stream from the ID information in the encoded data;
A decoding step of decoding the video stream read in the search reading step;
A cutout position determination step for determining a cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same based on the i-th region video position information;
Based on the cutout position of the image determined by the cutout position determination step, an image cutout step of cutting out a part of the area video in the decoded image obtained in the decoding step;
An output image synthesis step of synthesizing the common video of the decoded images obtained in the decoding step and the image cut out in the image cutout step to output an output image;
An output image coordinate designation step for designating the coordinate position of the next output image;
A video transmission / reception method comprising: a transmission requesting step of requesting transmission of M video streams so that the largest number of pixels of the next output image is obtained, based on the coordinate position of the next output image and the region video position information.
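
The multiplexed variant, in which M streams are requested and the i-th stream is then located by its ID information, could look roughly like this. It is only a sketch: the encoded data is modeled as a list of `(stream_id, payload)` pairs, regions are rectangles `(x, y, width, height)`, and the helper names `top_m_streams` and `find_stream` are hypothetical. The actual multiplexing format is not specified in the claims.

```python
def overlap_pixels(output_rect, region_rect):
    """Number of output-image pixels covered by a region video."""
    ox, oy, ow, oh = output_rect
    rx, ry, rw, rh = region_rect
    w = min(ox + ow, rx + rw) - max(ox, rx)
    h = min(oy + oh, ry + rh) - max(oy, ry)
    return max(w, 0) * max(h, 0)


def top_m_streams(output_rect, regions_by_id, m):
    """IDs of the M regions covering the most pixels of the output image."""
    ranked = sorted(regions_by_id,
                    key=lambda sid: -overlap_pixels(output_rect, regions_by_id[sid]))
    return ranked[:m]


def find_stream(encoded_data, stream_id):
    """Search the multiplexed data for the payload tagged with stream_id."""
    for sid, payload in encoded_data:
        if sid == stream_id:
            return payload
    return None


regions_by_id = {1: (0, 0, 100, 100), 2: (50, 0, 100, 100), 3: (100, 0, 100, 100)}
requested = top_m_streams((60, 10, 80, 50), regions_by_id, m=2)

# Hypothetical multiplexed data returned for the request above.
encoded_data = [(2, b"stream-2"), (1, b"stream-1")]
payload = find_stream(encoded_data, requested[0])
```

Requesting more than one stream lets the receiver switch regions without waiting for a new transmission when the output position moves slightly.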

A video transmission / reception method performed by a video transmission / reception system comprising a video transmission device and a video reception device,
An area image position information transmission step of transmitting area image position information, which is the coordinate positions of N (N is a natural number) partial area images with respect to a reference position of an original image defined on a plane expressed in two dimensions;
A stream transmission step of transmitting a video stream obtained by encoding an i-th (1 ≦ i ≦ N) region video that is a part of the original video and a common video;
An area image position information receiving step for receiving the area image position information;
A stream receiving step for receiving the video stream;
A reading step of reading the received video stream;
A decoding step of decoding the video stream read in the reading step;
A cutout position determination step for determining a cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same based on the i-th region video position information;
Based on the cutout position of the image determined by the cutout position determination step, an image cutout step of cutting out a part of the area video in the decoded image obtained in the decoding step;
An output image synthesis step of synthesizing the common video of the decoded images obtained in the decoding step and the image cut out in the image cutout step to output an output image;
An output image coordinate designation step for designating the coordinate position of the next output image;
A video transmission / reception method comprising: a transmission requesting step of requesting transmission of the j-th (1 ≦ j ≦ N) video stream from which the largest number of pixels of the next output image is obtained, based on the coordinate position of the next output image and the region video position information.

A video transmission / reception method performed by a video transmission / reception system comprising a video transmission device and a video reception device,
An area image position information transmission step of transmitting area image position information, which is the coordinate positions of N (N is a natural number) partial area images with respect to a reference position of an original image defined on a plane expressed in two dimensions;
A stream transmission step of transmitting encoded data obtained by multiplexing and encoding N video streams, composed of M (M ≦ N) area videos that are part of the original video and the common video, together with ID information for distinguishing the video streams;
An area image position information receiving step for receiving the area image position information;
A stream receiving step for receiving the encoded data;
A read determination step of determining to read the i-th (1 ≦ i ≦ M) video stream so as to obtain the largest number of pixels of the output image, based on the coordinate position of the output image with respect to the reference position and the region video position information;
A search and read step of searching for and reading out the i-th video stream from the ID information in the encoded data;
A decoding step of decoding the video stream read in the search reading step;
A cutout position determination step for determining a cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same based on the i-th region video position information;
Based on the cutout position of the image determined by the cutout position determination step, an image cutout step of cutting out a part of the area video in the decoded image obtained in the decoding step;
An output image synthesis step of synthesizing the common video of the decoded images obtained in the decoding step and the image cut out in the image cutout step to output an output image;
An output image coordinate designation step for designating the coordinate position of the next output image;
A video transmission / reception method comprising: a transmission requesting step of requesting transmission of M video streams so that the largest number of pixels of the next output image is obtained, based on the coordinate position of the next output image and the region video position information.

A video decoding method for inputting region video position information indicating K (1 ≦ K ≦ L) different coordinate positions with respect to a reference position on each surface, for L (L ≧ 2) surfaces expressed in two dimensions, and L video streams obtained by encoding a region video corresponding to the region video position information and a common video, and decoding a part of the video streams to obtain an output image,
A read determination step of determining to read the i-th (1 ≦ i ≦ C) video stream, based on the coordinate position of the output image with respect to the reference position on the plane C (1 ≦ C ≦ L) from which the output image is cut out and the region video position information;
A reading step of reading the i-th video stream;
A decoding step of decoding the video stream read in the reading step;
A cutout position determination step for determining a cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same based on the i-th region video position information;
Based on the cutout position of the image determined by the cutout position determination step, an image cutout step of cutting out a part of the area video in the decoded image obtained in the decoding step;
A video decoding method comprising: an output image synthesis step of synthesizing the common video of the decoded images obtained in the decoding step and the image cut out in the image cutout step to output an output image.

A video decoding method for inputting region video position information indicating K (1 ≦ K ≦ L) different coordinate positions with respect to a reference position on each surface, for L (L ≧ 2) surfaces expressed in two dimensions, and encoded data obtained by multiplexing and encoding N video streams, composed of the region video corresponding to the region video position information and the common video, together with ID information for distinguishing the video streams, and decoding part of the encoded data to obtain an output image,
A read determination step of determining to read the i-th (1 ≦ i ≦ C) video stream, based on the coordinate position of the output image with respect to the reference position on the plane C (1 ≦ C ≦ L) from which the output image is cut out and the region video position information;
A search and read step of searching for and reading out the i-th video stream from the ID information in the encoded data;
A decoding step of decoding the video stream read in the search reading step;
A cutout position determination step for determining a cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same based on the i-th region video position information;
Based on the cutout position of the image determined by the cutout position determination step, an image cutout step of cutting out a part of the area video in the decoded image obtained in the decoding step;
A video decoding method comprising: an output image synthesis step of synthesizing the common video of the decoded images obtained in the decoding step and the image cut out in the image cutout step to output an output image.

A video transmission / reception method performed by a video transmission / reception system comprising a video transmission device and a video reception device,
An area image position information transmission step of transmitting area image position information, which is all the coordinate positions of K (1 ≦ K ≦ L) partial area images with respect to a reference position defined on each surface, for L (L ≧ 2) surfaces expressed in two dimensions;
A stream transmission step of transmitting the requested i-th (1 ≦ i ≦ K) video stream among K video streams obtained by encoding the region video and the common video on the plane C (1 ≦ C ≦ L) from which the output image is cut out;
An area image position information receiving step for receiving the area image position information;
A stream receiving step of receiving the transmitted video stream;
A reading step of reading the received video stream;
A decoding step of decoding the video stream read in the reading step;
A cutout position determination step for determining a cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same based on the i-th region video position information;
Based on the cutout position of the image determined by the cutout position determination step, an image cutout step of cutting out a part of the area video in the decoded image obtained in the decoding step;
An output image synthesis step of synthesizing the common video of the decoded images obtained in the decoding step and the image cut out in the image cutout step to output an output image;
An output image plane designating step for designating a plane D (1 ≦ D ≦ L) for cutting out the next output image;
An output image coordinate designating step for designating a coordinate position of the next output image;
A transmission requesting step of requesting transmission of the j-th (1 ≦ j ≦ D) video stream from which the largest number of pixels of the next output image is obtained, based on the coordinate position of the next output image with respect to the reference position on the surface D from which the output image is cut out and the region video position information.
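
For the multi-surface case, the request for the next output image first fixes the surface D and then picks the stream on that surface. A minimal sketch follows, assuming the region video position information is held as a mapping from surface number to a list of rectangles `(x, y, width, height)`. The names `next_request` and `regions_by_plane`, and the rectangle sizes, are hypothetical.

```python
def overlap_pixels(output_rect, region_rect):
    """Number of output-image pixels covered by a region video."""
    ox, oy, ow, oh = output_rect
    rx, ry, rw, rh = region_rect
    w = min(ox + ow, rx + rw) - max(ox, rx)
    h = min(oy + oh, ry + rh) - max(oy, ry)
    return max(w, 0) * max(h, 0)


def next_request(plane_d, next_rect, regions_by_plane):
    """Return (surface, 1-based stream index) to request for the next output."""
    rects = regions_by_plane[plane_d]  # only regions on the designated surface
    j = max(range(len(rects)),
            key=lambda k: overlap_pixels(next_rect, rects[k]))
    return plane_d, j + 1


# Two hypothetical surfaces (L = 2), each split into two regions.
regions_by_plane = {
    1: [(0, 0, 100, 100), (100, 0, 100, 100)],
    2: [(0, 0, 200, 50), (0, 50, 200, 50)],
}
request = next_request(2, (20, 40, 60, 30), regions_by_plane)
```

On surface 2 the next output image straddles both regions, and the lower region covers more of it, so its stream is the one requested.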

A video transmission / reception method performed by a video transmission / reception system comprising a video transmission device and a video reception device,
An area image position information transmission step of transmitting area image position information, which is all the coordinate positions of K (1 ≦ K ≦ L) partial area images with respect to a reference position defined on each surface, for L (L ≧ 2) surfaces expressed in two dimensions;
A stream transmission step of extracting and transmitting M (M ≦ K) pieces of encoded data from the encoded data obtained by multiplexing and encoding N video streams, composed of the area video and the common video, together with ID information for distinguishing the video streams;
An area image position information receiving step for receiving the area image position information;
A stream receiving step for receiving the encoded data;
A read determination step of determining to read the i-th (1 ≦ i ≦ M) video stream, based on the coordinate position of the output image with respect to the reference position on the plane C (1 ≦ C ≦ L) from which the output image is cut out and the region video position information;
A search and read step of searching for and reading out the i-th video stream from the ID information in the encoded data;
A decoding step of decoding the video stream read in the search reading step;
A cutout position determination step for determining a cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same based on the i-th region video position information;
Based on the cutout position of the image determined by the cutout position determination step, an image cutout step of cutting out a part of the area video in the decoded image obtained in the decoding step;
An output image synthesis step of synthesizing the common video of the decoded images obtained in the decoding step and the image cut out in the image cutout step to output an output image;
An output image plane designating step for designating a plane D (1 ≦ D ≦ L) for cutting out the next output image;
An output image coordinate designation step for designating the coordinate position of the next output image;
A video transmission / reception method comprising: a transmission request step of requesting transmission of M video streams so that the largest number of pixels of the next output image is obtained, based on the coordinate position of the next output image with respect to the reference position on the plane D designated in the output image plane designation step and the region video position information.

A video transmission / reception method performed by a video transmission / reception system comprising a video transmission device and a video reception device,
An area image position information transmission step of transmitting area image position information, which is all the coordinate positions of K (1 ≦ K ≦ L) partial area images with respect to a reference position defined on each surface, for L (L ≧ 2) surfaces expressed in two dimensions;
A stream transmission step of transmitting a video stream obtained by encoding an i-th (1 ≦ i ≦ K) region video that is a part of an original video on one or more planes C (1 ≦ C ≦ L) and a common video;
An area image position information receiving step for receiving the area image position information;
A stream receiving step for receiving the video stream;
A reading step of reading the received video stream;
A decoding step of decoding the video stream read in the reading step;
A cutout position determination step for determining a cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same based on the i-th region video position information;
Based on the cutout position of the image determined by the cutout position determination step, an image cutout step of cutting out a part of the area video in the decoded image obtained in the decoding step;
An output image synthesis step of synthesizing the common video of the decoded images obtained in the decoding step and the image cut out in the image cutout step to output an output image;
An output image plane designating step for designating a plane D (1 ≦ D ≦ L) for cutting out the next output image;
An output image coordinate designation step for designating the coordinate position of the next output image;
A transmission request step of requesting transmission of the j-th (1 ≦ j ≦ M) video stream from which the largest number of pixels of the next output image is obtained, based on the coordinate position of the next output image with respect to the reference position on the surface D designated in the output image surface designation step and the region video position information.

A video transmission / reception method performed by a video transmission / reception system comprising a video transmission device and a video reception device,
An area image position information transmission step of transmitting area image position information, which is all the coordinate positions of K (1 ≦ K ≦ L) partial area images with respect to a reference position defined on each surface, for L (L ≧ 2) surfaces expressed in two dimensions;
A stream transmission step of transmitting encoded data obtained by multiplexing and encoding N video streams, composed of M (M ≦ K) area videos that are part of the original video on one or more planes C (1 ≦ C ≦ L) and the common video, together with ID information for distinguishing the video streams;
An area image position information receiving step for receiving the area image position information;
A stream receiving step for receiving the video stream;
A read determination step of determining to read the i-th (1 ≦ i ≦ M) video stream, based on the coordinate position of the output image with respect to the reference position on the plane C (1 ≦ C ≦ L) from which the output image is cut out and the region video position information;
A search and read step of searching for and reading out the i-th video stream from the ID information in the encoded data;
A decoding step of decoding the video stream read in the search reading step;
A cutout position determination step for determining a cutout position of the image so that the coordinate position of the output image with respect to the reference position is the same based on the i-th region video position information;
Based on the cutout position of the image determined by the cutout position determination step, an image cutout step of cutting out a part of the area video in the decoded image obtained in the decoding step;
An output image synthesis step of synthesizing the common video of the decoded images obtained in the decoding step and the image cut out in the image cutout step to output an output image;
An output image plane designating step for designating a plane D (1 ≦ D ≦ L) for cutting out the next output image;
An output image coordinate designating step for designating a coordinate position of the next output image;
A video transmission / reception method comprising: a transmission request step of requesting transmission of M video streams so that the largest number of pixels of the next output image is obtained, based on the coordinate position of the next output image with respect to the reference position on the plane D specified in the output image plane specifying step and the region video position information.