JP2014057217A

JP2014057217A - Device, method and program for encoding moving image, and moving image communication device

Info

Publication number: JP2014057217A
Application number: JP2012200832A
Authority: JP
Inventors: Hiroshi Nagaoka; 寛史長岡
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2012-09-12
Filing date: 2012-09-12
Publication date: 2014-03-27
Anticipated expiration: 2032-09-12
Also published as: JP5907016B2

Abstract

PROBLEM TO BE SOLVED: To reduce delay time in moving image data communication. SOLUTION: From among captured images obtained by imaging a subject with a plurality of imaging cameras Cam0A to Cam11A, a moving image encoding device 20A encodes each of the selected captured images captured successively by a selected camera chosen from the imaging cameras and the non-selected captured images captured by non-selected cameras other than the selected camera. The encoding uses a coding method based on inter-frame prediction, with a frame image obtained by encoding a past selected captured image by that coding method serving as a common reference image. When the selected camera is switched to another imaging camera, the moving image encoding device 20A switches the common reference image to a frame image obtained by encoding, by the same coding method, a past captured image captured by that other imaging camera. The moving image encoding device 20A also transmits the encoded data to the device of the communication partner.

Description

The disclosed technology relates to a moving image encoding device, a moving image encoding method, a moving image encoding program, and a moving image communication device.

With the spread of high-speed networks and the increased processing power of the processors in personal computers, mobile phones, and similar devices, it has become possible to communicate with distant people by video, for example through videophones and video conferences.

In moving image communication, pursuing the realism of having the other party seem to be right beside you makes several elements very important: eye contact with the other party, high-resolution and life-size image display, and communication at a high frame rate with low delay.

Conventionally, there is a technique that detects the position and line of sight of the communication partner from images of the partner captured by a camera, selects one of a plurality of cameras capturing the local user based on the detected position and line of sight, and encodes the images captured by the selected camera for transfer to the partner. This technique can eliminate the gaze misalignment caused by a person moving, giving a sense of presence as if the two parties were talking through a window.

JP 2004-193962 A; JP 2011-91614 A; JP 2008-252651 A

However, when the above technique compresses and encodes moving image data before transmission in order to reduce the amount of transferred data, it switches the capturing camera and encodes the captured images only after the position of the communication partner has changed. Communication of the moving image data is therefore delayed by the time required for encoding.

As a method of encoding multi-view moving images, ITU-T Rec. H.264 Annex H (Multiview Video Coding, hereinafter MVC) standardizes a way to transfer moving images from multiple viewpoints by encoding them efficiently using motion prediction between viewpoints.

However, this requires both the encoding side and the decoding side to support the MVC standard, so widely used moving image encoding methods such as MPEG-2 and ITU-T Rec. H.264 (non-MVC) cannot be used.

In one aspect, an object of the disclosed technology is to reduce the delay time in communication of moving image data.

In the disclosed technology, an encoding unit encodes, from among captured images obtained by a plurality of imaging cameras each capturing a subject, each of the selected captured images captured successively by a selected camera chosen from the plurality of imaging cameras and the non-selected captured images captured by non-selected cameras other than the selected camera. The encoding unit performs this encoding by a coding method using inter-frame prediction, using as a common reference image a frame image obtained by encoding a past selected captured image by that coding method. When the selected camera is switched to another imaging camera, a switching unit switches the common reference image to a frame image obtained by encoding, by the same coding method, a past captured image captured by that other imaging camera. A transmission unit transmits the encoded data produced by the encoding unit to the device of the communication partner.

In one aspect, the disclosed technology has the effect of reducing the delay time in communication of moving image data.

A configuration diagram of the moving image communication system.
An illustration of the moving image communication system applied to a videophone.
A diagram for explaining the stereo method.
A configuration diagram of the moving image encoding device.
A diagram for explaining the selected camera and the adjacent cameras.
A diagram for explaining the reference relationships among frame images obtained by encoding images captured by the selected camera and the adjacent cameras.
A block diagram of a computer that functions as the moving image encoding device.
A flowchart showing the flow of the moving image encoding process.
A flowchart showing the flow of the input image capture process.
A flowchart showing the flow of the encoding process.
A flowchart showing the flow of the stream output process.
An illustration of the moving image communication system applied to a videophone.
A diagram for explaining the processing timing of each process.

An example of an embodiment of the disclosed technology is described in detail below with reference to the drawings.

FIG. 1 shows a moving image communication system 10 according to this embodiment. In the moving image communication system 10, a moving image communication device 12A and a moving image communication device 12B are connected via a network 14.

The moving image communication device 12A includes a selected camera position specifying unit 16A, detection cameras 18A1 and 18A2, a moving image encoding device 20A, a plurality of imaging cameras Cam0A to Cam11A (twelve in this embodiment, as an example), a decoding device 22A, and a display 24A.

The moving image communication device 12B has the same configuration as the moving image communication device 12A and includes a selected camera position specifying unit 16B, detection cameras 18B1 and 18B2, a moving image encoding device 20B, imaging cameras Cam0B to Cam11B, a decoding device 22B, and a display 24B.

This embodiment describes, as an example, the case where the moving image communication system 10 is applied to a videophone system. The moving image communication system 10 is not limited to videophone systems and can be applied to any system that communicates moving images, such as a video conference system.

FIG. 2 shows an example in which the moving image communication system 10 is applied to a videophone system. In this example, a person 26A in room A and a person 26B in room B can converse while looking at the image of the other party shown on a display 24A provided on a wall of room A and a display 24B provided on a wall of room B, respectively. For convenience, FIG. 2 depicts room A and room B as adjoining, but in practice they are located apart from each other.

Twelve imaging cameras Cam0A to Cam11A are mounted in a grid on the wall of room A, and each imaging camera captures the person 26A in room A. Likewise, twelve imaging cameras Cam0B to Cam11B are mounted on the wall of room B in the same arrangement as the imaging cameras Cam0A to Cam11A on the wall of room A. The imaging cameras Cam0B to Cam11B capture the person 26B in room B.

Room A is provided with detection cameras 18A1 and 18A2 for detecting the position of the person 26A. The detection cameras 18A1 and 18A2 are placed so that they capture the person 26A in room A from different directions.

Similarly, room B is provided with detection cameras 18B1 and 18B2 for detecting the position of the person 26B. The detection cameras 18B1 and 18B2 are placed so that they capture the person 26B in room B from different directions.

The selected camera position specifying unit 16A calculates the three-dimensional position of, for example, the face of the person 26A based on the images of the person 26A captured by the detection cameras 18A1 and 18A2. It then identifies, as the selected camera position, the position of the imaging camera in room B that corresponds to the imaging camera closest to the calculated face position of the person 26A, and transmits it to the moving image encoding device 20B. For example, if the imaging camera closest to the calculated face position of the person 26A is the imaging camera Cam5A, the position of the imaging camera mounted on the wall of room B at the position corresponding to Cam5A is identified as the selected camera position and transmitted to the moving image communication device 12B.

Specifically, the selected camera position specifying unit 16A calculates the three-dimensional position of the face of the person 26A using the stereo method. The stereo method is briefly described below.

As shown in FIG. 3, a target point P in three-dimensional space captured from camera viewpoint A is observed as a point i where the straight line LA connecting viewpoint A and point P intersects the imaging plane DA. Similarly, the target point P captured from camera viewpoint B is observed as a point j where the straight line LB connecting viewpoint B and point P intersects the imaging plane DB. The target point P is thus observed as points i and j in the two imaging planes DA and DB, and the straight line LA connecting viewpoint A and point i and the straight line LB connecting viewpoint B and point j intersect at point P. In other words, when the positional relationship between viewpoints A and B is known, identifying the positions on the two imaging planes DA and DB that correspond to the target point P uniquely determines the three-dimensional position of P.
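The intersection of lines LA and LB can be sketched numerically as follows. This is only an illustrative sketch, not the computation used by the selected camera position specifying unit 16A: the function name `triangulate` and the choice to return the midpoint of the shortest segment between the two lines (so that small measurement errors that keep the lines from meeting exactly are tolerated) are assumptions introduced here.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def triangulate(view_a: Vec3, point_i: Vec3, view_b: Vec3, point_j: Vec3) -> Vec3:
    """Estimate P from line LA (view_a -> point_i) and line LB (view_b -> point_j).

    All inputs are 3D coordinates in one shared frame, which presumes the
    positional relationship between viewpoints A and B is known.
    """
    d1 = _sub(point_i, view_a)          # direction of LA
    d2 = _sub(point_j, view_b)          # direction of LB
    w0 = _sub(view_a, view_b)
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w0), _dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("lines LA and LB are parallel; P is not determined")
    s = (b * e - c * d) / denom         # parameter of the closest point on LA
    t = (a * e - b * d) / denom         # parameter of the closest point on LB
    on_la = tuple(view_a[k] + s * d1[k] for k in range(3))
    on_lb = tuple(view_b[k] + t * d2[k] for k in range(3))
    # Midpoint of the shortest segment; equals P when the lines truly intersect.
    return tuple((on_la[k] + on_lb[k]) / 2.0 for k in range(3))
```

For example, with viewpoints at (0, 0, 0) and (2, 0, 0) and image-plane points (0.2, 0.2, 1.0) and (1.8, 0.2, 1.0), the two lines meet at (1, 1, 5).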

Accordingly, the positions of the facial feature points of the person 26A are first calculated in each of the images captured by the detection cameras 18A1 and 18A2, using a method such as SIFT (Scale-Invariant Feature Transform) or SURF (Speeded-Up Robust Features), a faster variant of SIFT. Next, the three-dimensional position of the face of the person 26A is calculated from the position of the face in each captured image. The imaging camera closest to the calculated face position of the person 26A is then identified, and the position of the imaging camera in room B that corresponds to the identified camera is transmitted to the moving image communication device 12B as the selected camera position.
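The final step, picking the imaging camera nearest to the calculated face position, can be sketched as below. The wall geometry is hypothetical: a 4-column by 3-row grid with 0.5 m pitch in the plane z = 0 is assumed purely for illustration, and `camera_positions` and `nearest_camera` are names introduced here. Because both rooms use the same camera arrangement, the returned index can stand in for the selected camera position sent to the other side.

```python
import math
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]

# Hypothetical wall layout (assumption): 4 columns x 3 rows, 0.5 m pitch, z = 0.
COLS, ROWS, PITCH_M = 4, 3, 0.5


def camera_positions() -> Dict[int, Vec3]:
    """Map camera index (0..11) to an assumed 3D mounting position on the wall."""
    return {
        idx: ((idx % COLS) * PITCH_M, (idx // COLS) * PITCH_M, 0.0)
        for idx in range(COLS * ROWS)
    }


def nearest_camera(face_pos: Vec3, cameras: Dict[int, Vec3]) -> int:
    """Return the index of the camera with the smallest Euclidean distance to the face."""
    return min(cameras, key=lambda idx: math.dist(face_pos, cameras[idx]))
```

With this assumed layout, a face at (1.1, 0.4, 1.0) is closest to index 6, so the camera at the corresponding position in the other room would be the selected camera.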

Similarly, the selected camera position specifying unit 16B of the moving image communication device 12B calculates the three-dimensional position of the face of the person 26B based on the images of the person 26B captured by the detection cameras 18B1 and 18B2. It then identifies, as the selected camera position, the position of the imaging camera in room A that corresponds to the imaging camera closest to the calculated face position of the person 26B, and transmits it to the moving image encoding device 20A. For example, when the person 26B in room B is closest to the imaging camera Cam6B as shown in FIG. 2, the selected camera position specifying unit 16B identifies the imaging camera Cam6A, which corresponds to Cam6B, as the selected camera position and transmits it to the moving image communication device 12A.

The moving image encoding device 20A encodes the image of the person 26A captured by the selected camera, which is selected from the imaging cameras Cam0A to Cam11A based on the position of the person 26B in room B, using a past image captured by the selected camera as the common reference image. Any coding method that uses inter-frame prediction, such as MPEG-2 or H.264 (non-MVC), can be used; a multi-view coding method such as H.264 (MVC) is not required.

The moving image encoding device 20A also encodes the images captured by the non-selected cameras, that is, the cameras other than the selected camera chosen based on the position of the person 26B in room B, using a past image captured by the selected camera as the common reference image.

In addition, when the selected camera is switched, the moving image encoding device 20A switches the common reference image to an image captured by the newly selected camera.

FIG. 4 shows a specific configuration of the moving image encoding device 20A. The moving image encoding device 20A includes an overall control unit 30, an input image capturing unit 32, an encoding unit 34, a stream output unit 36, a memory controller 38, and an SDRAM 40.

The input image capturing unit 32 includes a camera selection unit 42, a selected camera selector 44X, an adjacent camera A selector 44A through an adjacent camera K selector 44K, a selected camera image input unit 46X, and an adjacent camera A image input unit 46A through an adjacent camera K image input unit 46K.

The encoding unit 34 includes a camera selection unit 48, a selected camera image encoding unit 52X, and an adjacent camera A encoding unit 52A through an adjacent camera K encoding unit 52K.

The stream output unit 36 includes a camera selection unit 54 and a stream selection unit 56.

As shown in FIG. 4, the overall control unit 30 outputs the selected camera position transmitted from the moving image encoding device 20B to the camera selection units 42, 48, and 54. When the user instructs the start of moving image communication, the overall control unit 30 instructs the camera selection unit 42 to start capturing images.

The camera selection unit 42 instructs the selected camera selector 44X to select, as the selected camera, the imaging camera corresponding to the selected camera position input from the overall control unit 30. The imaging cameras Cam0A to Cam11A are connected to the selected camera selector 44X.

The selected camera selector 44X selects, from the imaging cameras Cam0A to Cam11A, the imaging camera corresponding to the selected camera position indicated by the camera selection unit 42, and outputs the image captured by that camera to the selected camera image input unit 46X.

The camera selection unit 42 also treats the imaging cameras adjacent to the one at the selected camera position as adjacent cameras, and instructs each of the adjacent camera A selector 44A through the adjacent camera K selector 44K to select one of them. The imaging cameras Cam0A to Cam11A are connected to each of the adjacent camera A selector 44A through the adjacent camera K selector 44K.

When the imaging cameras are arranged in a grid as shown in FIG. 2, for example, the four imaging cameras located above, below, to the left of, and to the right of the selected camera can be set as adjacent cameras A to D. In this case, the camera selection unit 42 instructs each of the adjacent camera A selector 44A through the adjacent camera D selector 44D to select an imaging camera adjacent to the selected camera. The following description covers the case where the four imaging cameras adjacent to the selected camera above, below, left, and right are set as adjacent cameras A to D.

The adjacent camera A selector 44A selects, from the imaging cameras Cam0A to Cam11A, the imaging camera designated as adjacent camera A by the camera selection unit 42, and outputs the image captured by that camera to the adjacent camera A image input unit 46A. The same applies to the adjacent camera B selector 44B through the adjacent camera D selector 44D.

Although this embodiment is configured with eleven selectors, the adjacent camera A selector 44A through the adjacent camera K selector 44K, it is also possible to provide only as many selectors as there are adjacent cameras. The same applies to the adjacent camera A image input unit 46A through the adjacent camera K image input unit 46K: only as many image input units as there are adjacent cameras may be provided. Likewise, the storage areas of the SDRAM 40 for the adjacent cameras, described later, may be provided only in the number of adjacent cameras.

The selected camera image input unit 46X applies predetermined filter processing or the like to the input captured image and outputs the result to the memory controller 38. The same applies to the adjacent camera A image input unit 46A through the adjacent camera K image input unit 46K.

For example, when the imaging camera Cam6A is selected as the selected camera, the overall control unit 30 instructs the selected camera selector 44X to select the imaging camera Cam6A. The selected camera selector 44X then outputs the image captured by Cam6A to the selected camera image input unit 46X.

In the state before the person 26B in room B moves, shown in FIG. 5, when the imaging camera Cam6A is selected as the selected camera, the imaging cameras Cam5A, Cam7A, Cam2A, and Cam10A, which adjoin Cam6A on the left, right, top, and bottom, become adjacent cameras A to D. The camera selection unit 42 instructs the adjacent camera A selector 44A to select the imaging camera Cam5A. The adjacent camera A selector 44A then outputs the image captured by Cam5A to the adjacent camera A image input unit 46A.

Similarly, the camera selection unit 42 instructs the adjacent camera B selector 44B to select the imaging camera Cam7A. The adjacent camera B selector 44B then outputs the image captured by Cam7A to the adjacent camera B image input unit 46B. The same applies to adjacent cameras C and D.
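The mapping from a selected camera to its adjacent cameras A to D can be sketched as follows. The 4-column by 3-row numbering (Cam0 to Cam3 on the top row, Cam4 to Cam7 in the middle, Cam8 to Cam11 on the bottom) is inferred from the Cam6A example above and is an assumption, as is the function name `adjacent_cameras`. The text does not say what happens when the selected camera sits on an edge of the grid; this sketch simply omits neighbours that fall outside it.

```python
from typing import Dict

COLS, ROWS = 4, 3  # assumed grid shape, inferred from the Cam6A example


def adjacent_cameras(selected: int) -> Dict[str, int]:
    """Return adjacent cameras keyed by slot: A=left, B=right, C=up, D=down."""
    if not 0 <= selected < COLS * ROWS:
        raise ValueError("camera index out of range")
    row, col = divmod(selected, COLS)
    candidates = {
        "A": (row, col - 1),  # left
        "B": (row, col + 1),  # right
        "C": (row - 1, col),  # up
        "D": (row + 1, col),  # down
    }
    # Keep only neighbours that exist on the wall (edge handling is an assumption).
    return {
        slot: r * COLS + c
        for slot, (r, c) in candidates.items()
        if 0 <= r < ROWS and 0 <= c < COLS
    }
```

With Cam6 selected this gives A=Cam5, B=Cam7, C=Cam2, D=Cam10, matching the example in the text.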

The memory controller 38 is connected to the selected camera image input unit 46X and to the adjacent camera A image input unit 46A through the adjacent camera K image input unit 46K. The memory controller 38 stores the captured image input from the selected camera image input unit 46X in the selected camera original image storage area 40X1 of the SDRAM as the selected camera original image. The memory controller 38 also stores the captured image input from the adjacent camera A image input unit 46A in the adjacent camera A original image storage area 40A1 of the SDRAM as the adjacent camera A original image. The same applies to adjacent cameras B to D.

When the overall control unit 30 instructs it to start encoding, the selected camera image encoding unit 52X encodes the captured image based on the selected camera original image stored in the selected camera original image storage area 40X1 and the common reference image stored in the selected camera reference image area 40X2. The common reference image is described later.

The coding method is one that uses inter-frame prediction, such as MPEG-2, MPEG-4, or H.264 (non-MVC). This embodiment uses an IPPP structure as the GOP (Group Of Pictures) structure, that is, a structure made up of I pictures, which can be decoded on their own, and P pictures, which encode the difference from a past frame image used as the reference image. Because no B pictures are used, which would be encoded using future frame images as well as past ones as reference images, the time required for encoding can be shortened.
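One way to see why dropping B pictures helps latency is to count how long a picture must wait for its future reference to be captured. The sketch below is a simplified model introduced here, not part of the described device: `structural_delay` takes a GOP written in display order (for example "IPPP" or "IBBP") and reports the worst-case wait in frames, assuming each B picture waits only for the next I or P picture.

```python
def structural_delay(gop: str) -> int:
    """Worst-case number of frames a picture must wait for a future reference.

    `gop` lists picture types in display (capture) order using I, P and B.
    I and P pictures reference only the past, so they never wait.
    """
    worst = 0
    for idx, kind in enumerate(gop):
        if kind not in "IPB":
            raise ValueError(f"unknown picture type: {kind!r}")
        if kind == "B":
            # A B picture cannot be encoded until its next I/P reference exists.
            following = [k for k in range(idx + 1, len(gop)) if gop[k] in "IP"]
            if not following:
                raise ValueError("B picture has no following I or P picture")
            worst = max(worst, following[0] - idx)
    return worst
```

Under this model an IPPP sequence never waits on a future frame, whereas a sequence such as IBBP makes the first B picture wait two frame periods.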

The selected camera image encoding unit 52X also generates bitstream data containing the encoded data, motion vectors, and so on, and stores it in the selected camera stream storage area 40X3 of the SDRAM 40.

Like the selected camera image encoding unit 52X, the adjacent camera A encoding unit 52A through the adjacent camera D encoding unit 52D each encode the captured image based on the original image stored in the corresponding original image area and the common reference image stored in the corresponding reference image area. Each then generates bitstream data and stores it in the corresponding stream storage area of the SDRAM 40.

FIG. 6 shows an example of frame images obtained by encoding images captured by the selected camera X and the adjacent cameras A to D. As shown in the figure, the frame image 60X0 of the selected camera X and the frame images 60A0 to 60D0 of the adjacent cameras A to D, encoded at the timing of picture 0, are I pictures. From picture 1 through picture 6, the frame images of the selected camera X and the adjacent cameras A to D are all P pictures.

At the timing of picture 0, the selected camera X is the imaging camera Cam6A, adjacent camera A is Cam5A, adjacent camera B is Cam7A, adjacent camera C is Cam2A, and adjacent camera D is Cam10A.

At the timing of the next picture, picture 1, the frame image 60X1 of the selected camera X is a P picture and is encoded using, as the common reference image, the frame image 60X0 (an I picture) of the selected camera X encoded at the timing of picture 0.

The frame images 60A1 to 60D1 of the adjacent cameras A to D at the timing of picture 1 are also P pictures and are encoded using the frame image 60X0 of the selected camera X, encoded at the timing of picture 0, as the common reference image.

Thus, at the timing of picture 1, all images captured by the selected camera X and the adjacent cameras A to D are encoded using the frame image 60X0 of the selected camera X, encoded at the timing of the immediately preceding picture 0, as the common reference image.

Similarly, at the timing of picture 2, all images captured by the selected camera X and the adjacent cameras A to D are encoded using the frame image 60X1 of the selected camera X, encoded at the timing of the immediately preceding picture 1, as the common reference image.

From then on, until the selected camera X is switched to a different imaging camera, the images captured by the selected camera X and the adjacent cameras A to D are encoded using the frame image of the selected camera X encoded at the immediately preceding timing as the common reference image.

図６の例では、ピクチャ３の時点で選択カメラＸが撮影カメラＣａｍ６から撮影カメラＣａｍ５に切り替えられるため、ピクチャ３までは、撮影カメラＣａｍ６の撮影画像を符号化したフレーム画像が共通参照画像となる。 In the example of FIG. 6, since the selected camera X is switched from the shooting camera Cam6 to the shooting camera Cam5 at the time of the picture 3, the frame image obtained by encoding the shot image of the shooting camera Cam6 is the common reference image up to the picture 3. .

そして、ピクチャ３の時点で選択カメラＸが撮影カメラＣａｍ６から撮影カメラＣａｍ５に切り替えられると、ピクチャ４以降は、選択カメラＸが撮影カメラＣａｍ５、隣接カメラＡ〜Ｄは、それぞれ撮影カメラＣａｍ４、Ｃａｍ６、Ｃａｍ１、Ｃａｍ９となる。 When the selected camera X is switched from the photographing camera Cam6 to the photographing camera Cam5 at the time of picture 3, from picture 4 onward the selected camera X is the photographing camera Cam5, and the adjacent cameras A to D are the photographing cameras Cam4, Cam6, Cam1, and Cam9, respectively.

そして、ピクチャ４のタイミングでは、選択カメラＸが撮影カメラＣａｍ５に切り替えられているため、直前のピクチャ３のタイミングでは隣接カメラＡであった撮影カメラＣａｍ５の撮影画像を符号化したフレーム画像６０Ａ３を共通参照画像とする。 At the timing of picture 4, the selected camera X has been switched to the photographing camera Cam5, so the common reference image is the frame image 60A3 obtained by encoding the captured image of the photographing camera Cam5, which was the adjacent camera A at the timing of the immediately preceding picture 3.

ピクチャ５以降は、選択カメラＸは撮影カメラＣａｍ５に設定されているので、直前のタイミングで選択カメラＸにより撮影された撮影画像を符号化したフレーム画像を共通参照画像とする。このように、選択カメラＸが切り替えられると、共通参照画像も切り替えられる。 From picture 5 onward, the selected camera X remains set to the photographing camera Cam5, so the frame image obtained by encoding the image captured by the selected camera X at the immediately preceding timing is used as the common reference image. In this way, when the selected camera X is switched, the common reference image is switched as well.
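The rule traced through FIG. 6 above can be summarized in a few lines of Python. This is only an illustrative sketch, not the patent's implementation: the function name `common_reference_schedule` and the list-of-camera-numbers input are assumptions made for the example, and picture 0 is left out because the text describes the selected camera's frame there as an I picture.

```python
from typing import Dict, List, Tuple


def common_reference_schedule(selected: List[int]) -> Dict[int, Tuple[int, int]]:
    """Map each picture index t >= 1 to (camera, t - 1).

    The pair names the camera whose frame, encoded at picture t - 1,
    serves as the common reference image when picture t is encoded.
    """
    schedule = {}
    for t in range(1, len(selected)):
        # The reference comes from the camera that is selected at picture t,
        # even if that camera was only an adjacent camera at picture t - 1.
        schedule[t] = (selected[t], t - 1)
    return schedule


# Selected camera per picture in the FIG. 6 example:
# Cam6 for pictures 0 to 3, Cam5 from picture 4 onward.
fig6 = common_reference_schedule([6, 6, 6, 6, 5, 5, 5])
```

With this input, picture 3 still refers to Cam6's frame from picture 2, while picture 4 refers to Cam5's frame from picture 3, which corresponds to frame image 60A3 in the description.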

図４に示すように、全体制御部３０には、直前の符号化タイミングにおける直前選択カメラ位置が直前選択カメラ位置記憶領域３０Ａに記憶され、符号化タイミングが到来する毎に更新される。 As shown in FIG. 4, the overall control unit 30 stores the immediately preceding selected camera position, that is, the selected camera position at the immediately preceding encoding timing, in the immediately preceding selected camera position storage area 30A, and updates it each time an encoding timing arrives.

カメラ選択部４８は、入力された現在の選択カメラ位置と、直前選択カメラ位置記憶領域３０Ａに記憶された直前選択カメラ位置とを比較する。そして、入力された現在の選択カメラ位置が直前選択カメラ位置と同一の場合は、選択カメラＸの撮影画像を符号化したフレーム画像を共通参照画像に設定する。 The camera selection unit 48 compares the input current selected camera position with the immediately preceding selected camera position stored in the immediately preceding selected camera position storage area 30A. If the input current selected camera position is the same as the previous selected camera position, a frame image obtained by encoding the captured image of the selected camera X is set as a common reference image.

この場合、メモリコントローラ３８は、選択カメラ画像符号化部５２Ｘにより符号化されたフレーム画像（ローカルデコード画像）を共通参照画像としてＳＤＲＡＭ４０の選択カメラ参照画像領域４０Ｘ２に記憶させる。また、メモリコントローラ３８は、選択カメラ画像符号化部５２Ｘにより符号化されたフレーム画像を共通参照画像としてＳＤＲＡＭ４０の隣接カメラＡ参照画像領域４０Ａ２〜隣接カメラＤ参照画像領域４０Ｄ２に記憶させる。 In this case, the memory controller 38 stores the frame image (local decoded image) encoded by the selected camera image encoding unit 52X in the selected camera reference image area 40X2 of the SDRAM 40 as a common reference image. Further, the memory controller 38 stores the frame image encoded by the selected camera image encoding unit 52X in the adjacent camera A reference image area 40A2 to the adjacent camera D reference image area 40D2 of the SDRAM 40 as a common reference image.

一方、入力された現在の選択カメラ位置が直前選択カメラ位置と同一でない場合、すなわち選択カメラＸが切り替えられた場合、切り替えられた撮影カメラが直前のタイミングで撮影した撮影画像を符号化したフレーム画像を共通参照画像に設定する。なお、カメラ選択部４８は、開示の技術における切替部の一例である。 On the other hand, when the input current selected camera position is not the same as the immediately preceding selected camera position, that is, when the selected camera X has been switched, the frame image obtained by encoding the image that the newly selected photographing camera captured at the immediately preceding timing is set as the common reference image. The camera selection unit 48 is an example of a switching unit in the disclosed technology.

この場合、メモリコントローラ３８は、隣接カメラＡ符号化部５２Ａ〜隣接カメラＤ符号化部５２Ｄのうち、切り替えられた撮影カメラに対応する符号化部により符号化されたフレーム画像を共通参照画像としてＳＤＲＡＭ４０の各参照画像領域に記憶させる。すなわち、ＳＤＲＡＭ４０の選択カメラ参照画像領域４０Ｘ２、隣接カメラＡ参照画像領域４０Ａ２〜隣接カメラＤ参照画像領域４０Ｄ２に記憶させる。 In this case, the memory controller 38 takes the frame image encoded by whichever of the adjacent camera A encoding unit 52A to the adjacent camera D encoding unit 52D corresponds to the newly selected photographing camera and stores it, as the common reference image, in each reference image area of the SDRAM 40, that is, in the selected camera reference image area 40X2 and the adjacent camera A reference image area 40A2 to the adjacent camera D reference image area 40D2.
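The two cases handled by the memory controller 38 can be sketched as follows. This is a hedged illustration under stated assumptions: the function `store_common_reference`, the role labels "X" and "A" to "D", and the string frame labels are hypothetical, and the area names 40B2 and 40C2 are inferred from the range 40A2 to 40D2 given in the text. The case where the newly selected camera was not one of the adjacent cameras is not covered by this passage and is not handled here.

```python
from typing import Dict

# Reference image areas in the SDRAM 40 (40B2 and 40C2 inferred from the range).
REFERENCE_AREAS = ("40X2", "40A2", "40B2", "40C2", "40D2")


def store_common_reference(current_pos: int, previous_pos: int,
                           role_to_camera: Dict[str, int],
                           local_decoded: Dict[str, str]) -> Dict[str, str]:
    """Return the contents of the five reference image areas for the next encode.

    role_to_camera maps the encoder roles "X", "A".."D" to the camera each
    role handled at the immediately preceding timing; local_decoded maps the
    same roles to the locally decoded frame each encoder produced then.
    """
    if current_pos == previous_pos:
        # No switch: keep using the selected camera's own frame.
        source_role = "X"
    else:
        # Switch: use the frame from the encoder that handled the new camera.
        source_role = next(role for role, cam in role_to_camera.items()
                           if cam == current_pos)
    frame = local_decoded[source_role]
    # The same frame is written to every reference area, which is what
    # makes it a common reference image.
    return {area: frame for area in REFERENCE_AREAS}


roles_at_picture3 = {"X": 6, "A": 5, "B": 7, "C": 2, "D": 10}
frames_at_picture3 = {"X": "60X3", "A": "60A3", "B": "60B3",
                      "C": "60C3", "D": "60D3"}
after_switch = store_common_reference(5, 6, roles_at_picture3, frames_at_picture3)
no_switch = store_common_reference(6, 6, roles_at_picture3, frames_at_picture3)
```

In the FIG. 6 example, switching from Cam6 to Cam5 fills all five areas with frame 60A3, whereas without a switch they would all hold 60X3.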

カメラ選択部５４は、入力された現在の選択カメラ位置と、直前選択カメラ位置とを比較する。そして、入力された現在の選択カメラ位置が直前選択カメラ位置と同一の場合は、選択カメラＸのビットストリームデータを選択して出力するようストリーム選択部５６に指示する。 The camera selection unit 54 compares the input current selected camera position with the immediately preceding selected camera position. If the input current selected camera position is the same as the previous selected camera position, the stream selection unit 56 is instructed to select and output the bit stream data of the selected camera X.

これにより、ストリーム選択部５６は、ＳＤＲＡＭ４０の選択カメラストリーム記憶領域４０Ｘ３からビットストリームデータを読み出し、出力ストリームとして動画像通信装置１２Ｂに送信する。一方、入力された現在の選択カメラ位置が直前選択カメラ位置と同一でない、すなわち選択カメラＸが切り替えられた場合は、切り替え後の撮影カメラに対応したストリーム記憶領域からビットストリームデータを読み出す。そして、出力ストリームとして動画像通信装置１２Ｂに送信する。なお、ストリーム選択部５６は、開示の技術における符号化データ送信部の一例である。 Thereby, the stream selection unit 56 reads the bit stream data from the selected camera stream storage area 40X3 of the SDRAM 40, and transmits it to the moving image communication apparatus 12B as an output stream. On the other hand, if the input current selected camera position is not the same as the previous selected camera position, that is, if the selected camera X is switched, the bit stream data is read from the stream storage area corresponding to the switched camera. And it transmits to the moving image communication apparatus 12B as an output stream. The stream selection unit 56 is an example of an encoded data transmission unit in the disclosed technology.

復号化装置２２Ａは、動画像通信装置１２Ｂから送信されたビットストリームデータを復号して、ディスプレイ２４Ａに表示する。 The decoding device 22A decodes the bit stream data transmitted from the video communication device 12B and displays it on the display 24A.

動画像符号化装置２０Ａは、例えば図７に示すコンピュータ７０で実現することができる。コンピュータ７０はＣＰＵ７２、メモリ７４、及び不揮発性の記憶部７６を備え、これらはバス７８を介して互いに接続されている。 The moving image encoding apparatus 20A can be realized by a computer 70 shown in FIG. 7, for example. The computer 70 includes a CPU 72, a memory 74, and a nonvolatile storage unit 76, which are connected to each other via a bus 78.

また、記憶部７６はＨＤＤ(Hard Disk Drive)やフラッシュメモリ等によって実現できる。記録媒体としての記憶部７６には、コンピュータ７０を動画像符号化装置２０Ａとして機能させるための動画像符号化プログラム８０が記憶されている。ＣＰＵ７２は、動画像符号化プログラム８０を記憶部７６から読み出してメモリ７４に展開し、動画像符号化プログラム８０が有するプロセスを順次実行する。 The storage unit 76 can be realized by an HDD (Hard Disk Drive), a flash memory, or the like. The storage unit 76 as a recording medium stores a moving image encoding program 80 for causing the computer 70 to function as the moving image encoding device 20A. The CPU 72 reads out the moving image encoding program 80 from the storage unit 76 and expands it in the memory 74, and sequentially executes processes included in the moving image encoding program 80.

動画像符号化プログラム８０は、入力画像取り込みプロセス８２、符号化プロセス８４、及びストリーム出力プロセス８６を有する。 The moving image encoding program 80 includes an input image capturing process 82, an encoding process 84, and a stream output process 86.

ＣＰＵ７２は、入力画像取り込みプロセス８２を実行することで、図４に示す入力画像取り込み部３２として動作する。また、ＣＰＵ７２は、符号化プロセス８４を実行することで、図４に示す符号化部３４として動作する。また、ＣＰＵ７２は、ストリーム出力プロセス８６を実行することで、図４に示すストリーム出力部３６として動作する。 The CPU 72 operates as the input image capturing unit 32 illustrated in FIG. 4 by executing the input image capturing process 82. The CPU 72 operates as the encoding unit 34 illustrated in FIG. 4 by executing the encoding process 84. The CPU 72 operates as the stream output unit 36 illustrated in FIG. 4 by executing the stream output process 86.

これにより、動画像符号化プログラム８０を実行したコンピュータ７０が、動画像符号化装置２０Ａとして機能することになる。なお、動画像符号化プログラム８０は開示の技術における動画像符号化プログラムの一例である。 As a result, the computer 70 that executes the moving image encoding program 80 functions as the moving image encoding apparatus 20A. The moving image encoding program 80 is an example of a moving image encoding program in the disclosed technology.

なお、動画像符号化装置２０Ａは、例えば半導体集積回路、より詳しくはＡＳＩＣ(Application Specific Integrated Circuit)等で実現することも可能である。 Note that the moving image encoding apparatus 20A can also be realized by, for example, a semiconductor integrated circuit, more specifically an ASIC (Application Specific Integrated Circuit) or the like.

動画像通信装置１２Ｂの構成は、動画像通信装置１２Ａと同様であるので説明は省略する。 Since the configuration of the moving image communication device 12B is the same as that of the moving image communication device 12A, description thereof is omitted.

次に本実施形態の作用を説明する。本実施形態に係る動画像符号化装置２０Ａでは、ユーザーにより動画像通信を実行するように指示されると、図８に示す動画像符号化処理を実行する。なお、本実施形態では、図５に示すように、選択カメラＸの上下左右に隣接するカメラを隣接カメラＡ〜Ｄとして設定する場合について説明する。 Next, the operation of this embodiment will be described. In the moving image encoding device 20A according to the present embodiment, when the user instructs to execute moving image communication, the moving image encoding process shown in FIG. 8 is executed. In the present embodiment, as shown in FIG. 5, a case will be described in which cameras adjacent to the selected camera X in the vertical and horizontal directions are set as adjacent cameras A to D.

ステップ１００では、入力画像取り込み部３２が、図９に示すような入力画像取り込み処理を実行する。 In step 100, the input image capturing unit 32 executes the input image capturing process shown in FIG. 9.

ステップ１０２では、符号化部３４が、図１０に示すような符号化処理を実行する。 In step 102, the encoding unit 34 executes the encoding process shown in FIG. 10.

ステップ１０４では、ストリーム出力部３６が、図１１に示すようなストリーム出力処理を実行する。 In step 104, the stream output unit 36 executes the stream output process shown in FIG. 11.

ステップ１０６では、全体制御部３０が、ユーザーにより動画像通信の終了が指示されたか否かを判断し、動画像通信の終了が指示された場合には本ルーチンを終了し、動画像通信の終了が指示されていない場合には、ステップ１００へ戻って上記の処理を繰り返す。 In step 106, the overall control unit 30 determines whether the user has instructed the end of the moving image communication. If the end has been instructed, this routine ends; if it has not, the process returns to step 100 and the above processing is repeated.
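The control flow of steps 100 to 106 amounts to a simple per-picture loop. The sketch below is an assumption-laden illustration rather than the actual routine of FIG. 8: `run_encoding_loop` and its four callable parameters are hypothetical stand-ins for the input image capturing unit 32, the encoding unit 34, the stream output unit 36, and the end check by the overall control unit 30.

```python
from typing import Callable


def run_encoding_loop(capture: Callable[[], None],
                      encode: Callable[[], None],
                      output: Callable[[], None],
                      end_requested: Callable[[], bool]) -> int:
    """Repeat steps 100 to 104 until step 106 sees an end request.

    Returns the number of pictures processed.
    """
    pictures = 0
    while True:
        capture()            # step 100: input image capturing process
        encode()             # step 102: encoding process
        output()             # step 104: stream output process
        pictures += 1
        if end_requested():  # step 106: has the user asked to stop?
            return pictures


# Example: the user ends the communication after the third picture.
log = []
answers = iter([False, False, True])
count = run_encoding_loop(lambda: log.append("100"),
                          lambda: log.append("102"),
                          lambda: log.append("104"),
                          lambda: next(answers))
```

Because the end check comes after the stream output, at least one picture is always captured, encoded, and sent once the routine starts.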

図９のステップ２００では、カメラ選択部４２が、全体制御部３０から入力された選択カメラ位置に対応する撮影カメラを選択するように選択カメラセレクタ４４Ｘに対して指示する。これにより、選択カメラセレクタ４４Ｘは、撮影カメラＣａｍ０Ａ〜Ｃａｍ１１Ａのうちカメラ選択部４２により指示された選択カメラ位置に対応した撮影カメラを選択する。 In step 200 of FIG. 9, the camera selection unit 42 instructs the selected camera selector 44X to select the photographing camera corresponding to the selected camera position input from the overall control unit 30. Accordingly, the selected camera selector 44X selects, from among the photographing cameras Cam0A to Cam11A, the photographing camera corresponding to the selected camera position indicated by the camera selection unit 42.

また、カメラ選択部４２は、選択カメラ位置に対応した撮影カメラに隣接する撮影カメラを隣接カメラとして、各隣接カメラを選択するように隣接カメラＡセレクタ４４Ａ〜隣接カメラＤセレクタ４４Ｄに対して各々指示する。これにより、隣接カメラＡセレクタ４４Ａ〜隣接カメラＤセレクタ４４Ｄは、カメラ選択部４２により指示された隣接カメラに対応した撮影カメラを各々選択する。 In addition, the camera selection unit 42 instructs each of the adjacent camera A selector 44A to the adjacent camera D selector 44D to select each adjacent camera using the shooting camera adjacent to the shooting camera corresponding to the selected camera position as the adjacent camera. To do. As a result, the adjacent camera A selector 44A to the adjacent camera D selector 44D each select a photographing camera corresponding to the adjacent camera instructed by the camera selection unit 42.

ステップ２０２では、カメラ選択部４２が、全体制御部３０から撮影画像の取り込み開始が指示されると、各セレクタに対して撮影画像の取り込み開始を指示する。これにより、選択カメラセレクタ４４Ｘは、選択した撮影カメラの撮影画像を選択カメラ画像入力部４６Ｘに出力する。選択カメラ画像入力部４６Ｘは、入力された撮影画像に予め定めたフィルタ処理等を施して、メモリコントローラ３８に出力する。メモリコントローラ３８は、入力された選択カメラＸの撮影画像をＳＤＲＡＭ４０の選択カメラ原画像記憶領域４０Ｘ１に記憶させる。 In step 202, when the overall control unit 30 instructs the camera selection unit 42 to start capturing images, the camera selection unit 42 instructs each selector to start capturing images. As a result, the selected camera selector 44X outputs the captured image of the selected photographing camera to the selected camera image input unit 46X. The selected camera image input unit 46X applies predetermined filter processing or the like to the input captured image and outputs the result to the memory controller 38. The memory controller 38 stores the input captured image of the selected camera X in the selected camera original image storage area 40X1 of the SDRAM 40.

また、隣接カメラＡセレクタ４４Ａ〜隣接カメラＤセレクタ４４Ｄは、選択した撮影カメラの撮影画像を隣接カメラＡ画像入力部４６Ａ〜隣接カメラＤ画像入力部４６Ｄに各々出力する。隣接カメラＡ画像入力部４６Ａ〜隣接カメラＤ画像入力部４６Ｄは、入力された撮影画像に予め定めたフィルタ処理等を施して、メモリコントローラ３８に出力する。メモリコントローラ３８は、入力された隣接カメラＡ〜隣接カメラＤの撮影画像をＳＤＲＡＭ４０の対応する原画像領域に各々記憶させる。 The adjacent camera A selector 44A to the adjacent camera D selector 44D output the captured images of the photographing cameras they have selected to the adjacent camera A image input unit 46A to the adjacent camera D image input unit 46D, respectively. The adjacent camera A image input unit 46A to the adjacent camera D image input unit 46D apply predetermined filter processing or the like to the input captured images and output the results to the memory controller 38. The memory controller 38 stores the input captured images of the adjacent cameras A to D in the corresponding original image areas of the SDRAM 40.

ステップ２０４では、各画像入力部が、１フレーム分の撮影画像の取り込みが完了したか否かを判断し、取り込みが終了した場合には本ルーチンを終了し、取り込みが終了していない場合には、撮影画像の取り込みを継続する。 In step 204, each image input unit determines whether capturing of one frame of the captured image has been completed. If it has, this routine ends; if it has not, capturing of the image continues.

図６の例の場合、ピクチャ３のタイミングで選択カメラが切り替わる。具体的には、図１２に示すように、部屋Ｂの人物２６Ｂが移動し、選択カメラが撮影カメラＣａｍ６ＡからＣａｍ５Ａに切り替わる。この場合、図５に示すように、隣接カメラＡ〜Ｄも撮影カメラＣａｍ４Ａ、Ｃａｍ６Ａ、Ｃａｍ１Ａ、Ｃａｍ９Ａに各々切り替わる。このため、図１３に示すように、入力画像取り込み処理では、ピクチャ０〜３までは、選択カメラＸとして撮影カメラＣａｍ６Ａの撮影画像が取り込まれる。また、隣接カメラＡとして撮影カメラＣａｍ５Ａの撮影画像が、隣接カメラＢとして撮影カメラＣａｍ７Ａの撮影画像が、隣接カメラＣとして撮影カメラＣａｍ２Ａの撮影画像が、隣接カメラＤとして撮影カメラＣａｍ１０Ａの撮影画像が取り込まれる。 In the example of FIG. 6, the selected camera switches at the timing of picture 3. Specifically, as shown in FIG. 12, the person 26B in room B moves, and the selected camera switches from the photographing camera Cam6A to Cam5A. In this case, as shown in FIG. 5, the adjacent cameras A to D also switch to the photographing cameras Cam4A, Cam6A, Cam1A, and Cam9A, respectively. Therefore, as shown in FIG. 13, in the input image capturing process, the images captured by the photographing camera Cam6A are taken in as those of the selected camera X for pictures 0 to 3. Likewise, the images captured by Cam5A are taken in as those of the adjacent camera A, those of Cam7A as the adjacent camera B, those of Cam2A as the adjacent camera C, and those of Cam10A as the adjacent camera D.

そして、図１３に示すように、選択カメラが切り替えられた後のピクチャ４〜６では、選択カメラＸとして撮影カメラＣａｍ５Ａの撮影画像が取り込まれる。また、隣接カメラＡとして撮影カメラＣａｍ４Ａの撮影画像が、隣接カメラＢとして撮影カメラＣａｍ６Ａの撮影画像が、隣接カメラＣとして撮影カメラＣａｍ１Ａの撮影画像が、隣接カメラＤとして撮影カメラＣａｍ９Ａの撮影画像が取り込まれる。 Then, as shown in FIG. 13, in pictures 4 to 6 after the selected camera has been switched, the images captured by the photographing camera Cam5A are taken in as those of the selected camera X. Likewise, the images captured by Cam4A are taken in as those of the adjacent camera A, those of Cam6A as the adjacent camera B, those of Cam1A as the adjacent camera C, and those of Cam9A as the adjacent camera D.
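The camera numbers listed above are consistent with twelve cameras laid out in a grid four wide and three high, numbered row by row. The following sketch derives the adjacent cameras A to D from the selected camera number under that assumption. The grid width, the direction labels in the comments, and the use of `None` at the grid edge are all assumptions made for illustration; this passage does not say how edge cameras are handled.

```python
from typing import Dict, Optional

COLUMNS = 4  # assumed layout: Cam0-Cam3 / Cam4-Cam7 / Cam8-Cam11
ROWS = 3


def adjacent_cameras(selected: int) -> Dict[str, Optional[int]]:
    """Return the camera numbers for adjacent cameras A to D.

    A and B are the horizontal neighbours, C and D the vertical ones.
    None marks a neighbour that would fall outside the assumed grid.
    """
    row, col = divmod(selected, COLUMNS)
    return {
        "A": selected - 1 if col > 0 else None,               # one number lower, same row
        "B": selected + 1 if col < COLUMNS - 1 else None,     # one number higher, same row
        "C": selected - COLUMNS if row > 0 else None,         # previous row
        "D": selected + COLUMNS if row < ROWS - 1 else None,  # next row
    }


before_switch = adjacent_cameras(6)
after_switch = adjacent_cameras(5)
```

For Cam6 this yields Cam5, Cam7, Cam2, and Cam10, and for Cam5 it yields Cam4, Cam6, Cam1, and Cam9, matching the assignments described for pictures 0 to 3 and pictures 4 to 6.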

図１０のステップ３００では、カメラ選択部４８が、入力された現在の選択カメラ位置と、直前選択カメラ位置とを比較する。すなわち、選択カメラが切り替えられたか否かを判断する。そして、入力された現在の選択カメラ位置が直前選択カメラ位置と同一でない場合、すなわち選択カメラが切り替えられた場合はステップ３０２へ移行する。一方、入力された現在の選択カメラ位置が直前選択カメラ位置と同一の場合、すなわち選択カメラが切り替えられていない場合はステップ３０４へ移行する。 In step 300 of FIG. 10, the camera selection unit 48 compares the input current selected camera position with the immediately preceding selected camera position. That is, it is determined whether the selected camera has been switched. If the input current selected camera position is not the same as the previous selected camera position, that is, if the selected camera is switched, the process proceeds to step 302. On the other hand, if the input current selected camera position is the same as the previous selected camera position, that is, if the selected camera has not been switched, the process proceeds to step 304.

ステップ３０２では、カメラ選択部４８が、切り替え後の選択カメラが直前のタイミングで撮影した撮影画像を符号化したフレーム画像を共通参照画像に設定する。 In step 302, the camera selection unit 48 sets, as a common reference image, a frame image obtained by encoding a photographed image photographed at the immediately preceding timing by the selected camera after switching.

一方、ステップ３０４では、カメラ選択部４８が、選択カメラＸは切り替えられていないので、引き続き選択カメラＸの撮影画像を符号化したフレーム画像を共通参照画像に設定する。 On the other hand, in step 304, since the selected camera X has not been switched, the camera selection unit 48 continues to set a frame image obtained by encoding the captured image of the selected camera X as a common reference image.

ステップ３０６では、全体制御部３０が、直前選択カメラ位置を更新する。なお、選択カメラＸが切り替えられていない場合は、直前選択カメラ位置は同じ値で上書きされる。 In step 306, the overall control unit 30 updates the last selected camera position. If the selected camera X has not been switched, the immediately preceding selected camera position is overwritten with the same value.
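The ordering of steps 300 and 306 matters: the comparison uses the stored position first, and only then is the stored position overwritten. A minimal sketch of that bookkeeping follows; the class name `SwitchDetector` and its interface are hypothetical and stand in for the camera selection unit 48 together with the storage area 30A.

```python
class SwitchDetector:
    """Tracks the immediately preceding selected camera position."""

    def __init__(self, initial_position: int) -> None:
        self.previous = initial_position

    def step(self, current_position: int) -> bool:
        """Return True if the selected camera changed since the last call."""
        switched = current_position != self.previous  # step 300: compare
        self.previous = current_position              # step 306: update
        return switched


detector = SwitchDetector(6)
flags = [detector.step(pos) for pos in [6, 6, 6, 5, 5]]
```

A switch is reported exactly once, at the first encoding timing where the new position arrives; afterwards the stored value already equals the new position, so later pictures follow the no-switch path of step 304.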

ステップ３０８では、全体制御部３０が、符号化を開始するように各符号化部に指示する。これにより、選択カメラ画像符号化部５２Ｘは、選択カメラ原画像記憶領域４０Ｘ１に記憶された選択カメラ原画像と選択カメラ参照画像領域４０Ｘ２に記憶された共通参照画像とに基づいて、撮影画像を符号化する。 In step 308, the overall control unit 30 instructs each encoding unit to start encoding. The selected camera image encoding unit 52X then encodes the captured image based on the selected camera original image stored in the selected camera original image storage area 40X1 and the common reference image stored in the selected camera reference image area 40X2.

また、隣接カメラＡ符号化部５２Ａ〜隣接カメラＤ符号化部５２Ｄも選択カメラ画像符号化部５２Ｘと同様に、対応する原画像領域に記憶された原画像と、対応する参照画像領域に記憶された共通参照画像と、に基づいて、撮影画像を符号化する。そして、ビットストリームデータを生成してＳＤＲＡＭ４０の対応するストリーム記憶領域に記憶させる。 In the same way as the selected camera image encoding unit 52X, the adjacent camera A encoding unit 52A to the adjacent camera D encoding unit 52D each encode the captured image based on the original image stored in the corresponding original image area and the common reference image stored in the corresponding reference image area. Each unit then generates bit stream data and stores it in the corresponding stream storage area of the SDRAM 40.

ステップ３１０では、各符号化部が、１フレーム分の撮影画像の符号化が完了したか否かを判断し、符号化が完了した場合は本ルーチンを終了し、符号化が完了していない場合は符号化処理を継続する。 In step 310, each encoding unit determines whether encoding of one frame of the captured image has been completed. If it has, this routine ends; if it has not, the encoding process continues.

これにより、図１３に示すように、符号化処理では、ピクチャ３の時点で共通参照画像が撮影カメラＣａｍ６Ａのフレーム画像から撮影カメラＣａｍ５Ａのフレーム画像に切り替わる。 Accordingly, as illustrated in FIG. 13, in the encoding process, the common reference image is switched from the frame image of the photographing camera Cam6A to the frame image of the photographing camera Cam5A at the time of the picture 3.

図１１のステップ４００では、カメラ選択部５４が、入力された現在の選択カメラ位置と、直前選択カメラ位置とを比較する。すなわち、選択カメラが切り替えられたか否かを判断する。そして、入力された現在の選択カメラ位置が直前選択カメラ位置と同一でない場合、すなわち選択カメラが切り替えられた場合はステップ４０２へ移行する。一方、入力された現在の選択カメラ位置が直前選択カメラ位置と同一の場合、すなわち選択カメラが切り替えられていない場合はステップ４０４へ移行する。 In step 400 of FIG. 11, the camera selection unit 54 compares the input current selected camera position with the immediately preceding selected camera position. That is, it determines whether the selected camera has been switched. If the input current selected camera position is not the same as the immediately preceding selected camera position, that is, if the selected camera has been switched, the process proceeds to step 402. On the other hand, if the input current selected camera position is the same as the immediately preceding selected camera position, that is, if the selected camera has not been switched, the process proceeds to step 404.

ステップ４０２では、カメラ選択部５４が、切り替え後の撮影カメラのビットストリームデータを出力ストリームとして選択するようにストリーム選択部５６に指示する。 In step 402, the camera selection unit 54 instructs the stream selection unit 56 to select the bit stream data of the photographing camera after switching as an output stream.

一方、ステップ４０４では、カメラ選択部５４が、引き続き選択カメラＸのビットストリームデータを出力ストリームとして選択するようにストリーム選択部５６に指示する。 On the other hand, in step 404, the camera selection unit 54 instructs the stream selection unit 56 to continue to select the bit stream data of the selected camera X as an output stream.

ステップ４０６では、全体制御部３０が、ビットストリームデータの出力を開始するようにストリーム選択部５６に指示する。これにより、ストリーム選択部５６は、ＳＤＲＡＭ４０の選択カメラストリーム記憶領域４０Ｘ３からビットストリームデータを読み出し、出力ストリームとして動画像通信装置１２Ｂに送信する。 In step 406, the overall control unit 30 instructs the stream selection unit 56 to start outputting the bit stream data. Thereby, the stream selection unit 56 reads the bit stream data from the selected camera stream storage area 40X3 of the SDRAM 40, and transmits it to the moving image communication apparatus 12B as an output stream.

ステップ４０８では、ストリーム選択部５６が、１フレーム分のストリームデータの出力が完了したか否かを判断し、出力が完了した場合は本ルーチンを終了し、出力が完了していない場合は出力を継続する。 In step 408, the stream selection unit 56 determines whether output of one frame of stream data has been completed. If it has, this routine ends; if it has not, the output continues.

これにより、図１３に示すように、ストリーム出力処理では、ピクチャ３の時点で選択カメラＸの出力ストリームが撮影カメラＣａｍ６ＡのストリームデータからＣａｍ５Ａのストリームデータに切り替わる。 Accordingly, as shown in FIG. 13, in the stream output process, the output stream of the selected camera X switches from the stream data of the photographing camera Cam6A to the stream data of Cam5A at the time of picture 3.

このように、本実施形態では、選択カメラにより撮影された撮影画像だけでなく、選択カメラに隣接する隣接カメラの撮影画像も符号化しておく。このため、選択カメラが切り替わった後に撮影を開始して符号化する場合と比較して、動画像データの通信の遅延時間を削減することができる。また、複数の撮影カメラにより撮影された撮影画像を符号化する際の参照画像を、過去に選択カメラで撮影された撮影画像を符号化したフレーム画像に設定し、共通とするため、選択カメラが切り替わった際に違和感が発生するのを防ぐことができる。また、多視点符号化方法を用いる必要もない。 As described above, in this embodiment, not only the image captured by the selected camera but also the images captured by the adjacent cameras next to the selected camera are encoded in advance. Compared with starting capture and encoding only after the selected camera has switched, this reduces the delay in moving image data communication. In addition, the reference image used when encoding the images captured by the plurality of photographing cameras is set to a frame image obtained by encoding an image previously captured by the selected camera and is shared among them, which prevents a sense of incongruity when the selected camera switches. Further, there is no need to use a multi-view coding method.

なお、本実施形態では、１２個の撮影カメラを格子状に設けた場合について説明したが、撮影カメラの数及び配置の仕方はこれに限られるものではない。 In the present embodiment, the case where twelve photographing cameras are provided in a lattice shape has been described. However, the number and arrangement of the photographing cameras are not limited to this.

また、本実施形態では、選択カメラの上下左右の４個の撮影カメラを隣接カメラＡ〜Ｄとして設定した場合について説明したが、設定可能な隣接カメラの数及び位置はこれに限られるものではない。例えば、選択カメラの斜め方向に位置する撮影カメラも隣接カメラとして設定してもよい。 Further, in the present embodiment, the case has been described in which four shooting cameras on the top, bottom, left, and right of the selected camera are set as the adjacent cameras A to D, but the number and positions of the adjacent cameras that can be set are not limited thereto. . For example, a shooting camera positioned in the oblique direction of the selected camera may be set as the adjacent camera.

また、本実施形態では、選択カメラが相手の位置に応じて切り替えられる場合について説明したが、これに限らず、選択カメラの切り替えをユーザーが指示するようにしてもよい。 In this embodiment, the case where the selected camera is switched according to the position of the other party has been described. However, the present invention is not limited to this, and the user may instruct switching of the selected camera.

また、本実施形態では被写体が人物の場合について説明したが、これに限らず、人間以外の動物等でもよい。 In the present embodiment, the case where the subject is a person has been described. However, the subject is not limited to this and may be an animal other than a human or the like.

また、上記では開示の技術に係る動画像符号化プログラムの一例である動画像符号化プログラム８０が記憶部７６に予め記憶（インストール）されている態様を説明したが、これに限定されるものではない。開示の技術に係る画像処理プログラムは、ＣＤ−ＲＯＭやＤＶＤ−ＲＯＭ等の記録媒体に記録されている形態で提供することも可能である。 In the above, a mode has been described in which the moving image encoding program 80, which is an example of the moving image encoding program according to the disclosed technology, is stored (installed) in advance in the storage unit 76, but the disclosed technology is not limited to this. The image processing program according to the disclosed technology can also be provided in a form recorded on a recording medium such as a CD-ROM or a DVD-ROM.

本明細書に記載された全ての文献、特許出願及び技術規格は、個々の文献、特許出願及び技術規格が参照により取り込まれることが具体的かつ個々に記された場合と同程度に、本明細書中に参照により取り込まれる。 All documents, patent applications, and technical standards mentioned in this specification are incorporated herein by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually stated to be incorporated by reference.

以上の実施形態に関し、更に以下の付記を開示する。 Regarding the above embodiment, the following additional notes are disclosed.

（付記１）
複数の撮影カメラにより撮影対象を各々撮影した撮影画像のうち、前記複数の撮影カメラの中から選択された選択カメラにより順次撮影された選択撮影画像及び前記選択カメラ以外の非選択カメラにより撮影された非選択撮影画像の各々を、過去の前記選択撮影画像をフレーム間予測を用いた符号化方法により符号化したフレーム画像を共通参照画像として前記符号化方法により符号化する符号化部と、
前記選択カメラが他の撮影カメラに切り替えられた場合に、前記共通参照画像を、過去の前記他の撮影カメラにより撮影された撮影画像を前記符号化方法により符号化したフレーム画像に切り替える切替部と、
前記符号化部により符号化された符号化データを通信相手側の装置に送信する符号化データ送信部と、
を含む動画像符号化装置。
(Appendix 1)
A moving image encoding apparatus including:
an encoding unit that encodes, among captured images obtained by photographing a photographing target with each of a plurality of photographing cameras, each of selected captured images sequentially captured by a selected camera selected from among the plurality of photographing cameras and non-selected captured images captured by a non-selected camera other than the selected camera, by an encoding method using inter-frame prediction, using as a common reference image a frame image obtained by encoding a past selected captured image by the encoding method;
a switching unit that, when the selected camera is switched to another photographing camera, switches the common reference image to a frame image obtained by encoding, by the encoding method, a past captured image captured by the other photographing camera; and
an encoded data transmission unit that transmits the encoded data encoded by the encoding unit to a device on the communication partner side.

（付記２）
前記非選択カメラは、前記複数の撮影カメラのうち、前記選択カメラに隣接する撮影カメラである
付記１記載の動画像符号化装置。
(Appendix 2)
The moving image encoding apparatus according to Appendix 1, wherein the non-selected camera is a photographing camera adjacent to the selected camera among the plurality of photographing cameras.

（付記３）
前記撮影対象の位置を検出するための検出用カメラと、
前記検出用カメラにより撮影された撮影画像に基づいて算出した前記撮影対象の位置に基づいて、前記通信相手側に設けられた複数の撮影カメラの中から前記通信相手側の撮影対象を撮影する撮影カメラの位置を通信相手選択カメラ位置として特定する選択カメラ位置特定部と、
前記通信相手選択カメラ位置を前記通信相手側の装置に送信する選択カメラ位置送信部と、
を備えた付記１又は付記２記載の動画像符号化装置。
(Appendix 3)
The moving image encoding apparatus according to Appendix 1 or Appendix 2, further including:
a detection camera for detecting the position of the photographing target;
a selected camera position specifying unit that specifies, based on the position of the photographing target calculated from a captured image captured by the detection camera, the position of the photographing camera that is to photograph the photographing target on the communication partner side, from among a plurality of photographing cameras provided on the communication partner side, as a communication partner selected camera position; and
a selected camera position transmission unit that transmits the communication partner selected camera position to the device on the communication partner side.

（付記４）
前記通信相手側の装置は、
前記通信相手側の撮影対象の位置を検出するための通信相手側検出用カメラと、
前記通信相手側検出用カメラにより撮影された撮影画像に基づいて算出した前記通信相手側の撮影対象の位置に基づいて、前記複数の撮影カメラの中から前記撮影対象を撮影する撮影カメラの位置を選択カメラ位置として特定する通信相手側選択カメラ位置特定部と、
前記選択カメラ位置を本動画像符号化装置に送信する通信相手側選択カメラ位置送信部と、
を備え、
前記選択カメラは、前記通信相手側選択カメラ位置送信部から送信された前記選択カメラ位置に対応した撮影カメラである
付記１〜３の何れかに記載の動画像符号化装置。
(Appendix 4)
The moving image encoding apparatus according to any one of Appendices 1 to 3, wherein
the device on the communication partner side includes:
a communication partner side detection camera for detecting the position of the photographing target on the communication partner side;
a communication partner side selected camera position specifying unit that specifies, based on the position of the photographing target on the communication partner side calculated from a captured image captured by the communication partner side detection camera, the position of the photographing camera that is to photograph the photographing target, from among the plurality of photographing cameras, as a selected camera position; and
a communication partner side selected camera position transmission unit that transmits the selected camera position to the present moving image encoding apparatus, and
the selected camera is the photographing camera corresponding to the selected camera position transmitted from the communication partner side selected camera position transmission unit.

（付記５）
複数の撮影カメラにより順次撮影された被写体の撮影画像のうち、前記複数の撮影カメラから選択された選択カメラにより順次撮影された選択撮影画像及び前記選択カメラ以外の非選択カメラにより撮影された非選択撮影画像の各々を、過去の前記選択撮影画像をフレーム間予測を用いた符号化方法により符号化したフレーム画像を共通参照画像として前記符号化方法により符号化し、
前記選択カメラが他の撮影カメラに切り替えられた場合に、前記共通参照画像を、過去の前記他の撮影カメラにより撮影された撮影画像を前記符号化方法により符号化したフレーム画像に切り替える
ことを含む動画像符号化方法。
(Appendix 5)
A moving image encoding method including:
encoding, among captured images of a subject sequentially captured by a plurality of photographing cameras, each of selected captured images sequentially captured by a selected camera selected from the plurality of photographing cameras and non-selected captured images captured by a non-selected camera other than the selected camera, by an encoding method using inter-frame prediction, using as a common reference image a frame image obtained by encoding a past selected captured image by the encoding method; and
switching, when the selected camera is switched to another photographing camera, the common reference image to a frame image obtained by encoding, by the encoding method, a past captured image captured by the other photographing camera.

（付記６）
前記非選択カメラは、前記複数の撮影カメラのうち、前記選択カメラに隣接する撮影カメラである
付記５記載の動画像符号化方法。
(Appendix 6)
The moving image encoding method according to Appendix 5, wherein the non-selected camera is a photographing camera adjacent to the selected camera among the plurality of photographing cameras.

（付記７）
前記撮影対象の位置を検出するための検出用カメラにより撮影された撮影画像に基づいて算出した前記撮影対象の位置に基づいて、前記通信相手側に設けられた複数の撮影カメラの中から前記通信相手側の撮影対象を撮影する撮影カメラの位置を通信相手選択カメラ位置として特定し、
前記通信相手選択カメラ位置を前記通信相手側の装置に送信する
付記５又は付記６記載の動画像符号化方法。
(Appendix 7)
The moving image encoding method according to Appendix 5 or Appendix 6, further including:
specifying, based on the position of the photographing target calculated from a captured image captured by a detection camera for detecting the position of the photographing target, the position of the photographing camera that is to photograph the photographing target on the communication partner side, from among a plurality of photographing cameras provided on the communication partner side, as a communication partner selected camera position; and
transmitting the communication partner selected camera position to the device on the communication partner side.

（付記８）
前記通信相手側の装置は、
前記通信相手側の撮影対象の位置を検出するための通信相手側検出用カメラと、
前記通信相手側検出用カメラにより撮影された撮影画像に基づいて算出した前記通信相手側の撮影対象の位置に基づいて、前記複数の撮影カメラの中から前記撮影対象を撮影する撮影カメラの位置を選択カメラ位置として特定する通信相手側選択カメラ位置特定部と、
前記選択カメラ位置を本動画像符号化装置に送信する通信相手側選択カメラ位置送信部と、
を備え、
前記選択カメラは、前記通信相手側選択カメラ位置送信部から送信された前記選択カメラ位置に対応した撮影カメラである
付記５〜７の何れか１項に記載の動画像符号化方法。
(Appendix 8)
The moving image encoding method according to any one of Appendices 5 to 7, wherein
the device on the communication partner side includes:
a communication partner side detection camera for detecting the position of the photographing target on the communication partner side;
a communication partner side selected camera position specifying unit that specifies, based on the position of the photographing target on the communication partner side calculated from a captured image captured by the communication partner side detection camera, the position of the photographing camera that is to photograph the photographing target, from among the plurality of photographing cameras, as a selected camera position; and
a communication partner side selected camera position transmission unit that transmits the selected camera position to the present moving image encoding apparatus, and
the selected camera is the photographing camera corresponding to the selected camera position transmitted from the communication partner side selected camera position transmission unit.

(Appendix 9)
A moving image encoding program for causing a computer to execute processing comprising:
encoding each of selected captured images, sequentially captured by a selected camera selected from among a plurality of shooting cameras, and non-selected captured images, captured by a non-selected camera other than the selected camera, among captured images of a subject sequentially captured by the plurality of shooting cameras, by an encoding method using inter-frame prediction, with a frame image obtained by encoding a past selected captured image by the encoding method used as a common reference image; and
when the selected camera is switched to another shooting camera, switching the common reference image to a frame image obtained by encoding, by the encoding method, a past captured image captured by the other shooting camera.
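The switching of the common reference image described above can be sketched as follows. This is a toy model under stated assumptions, not the encoder of this document: frames are short lists of integers, and "inter-frame prediction" is stood in for by a lossless element-wise difference against the common reference. The class name `CommonReferenceEncoder` and its methods are hypothetical.

```python
from typing import Dict, List, Optional


class CommonReferenceEncoder:
    """Toy encoder in which every camera is predicted from one common reference."""

    def __init__(self, selected: int) -> None:
        self.selected = selected
        # Last encoded (here: losslessly reconstructed) frame per camera.
        self.reconstructed: Dict[int, List[int]] = {}
        # Common reference image used for all cameras in the next step.
        self.reference: Optional[List[int]] = None

    def switch_selected(self, camera: int) -> None:
        """Switch the selected camera and move the common reference with it."""
        self.selected = camera
        # Use the past encoded frame of the newly selected camera, if any.
        self.reference = self.reconstructed.get(camera)

    def encode_step(self, frames: Dict[int, List[int]]) -> Dict[int, List[int]]:
        """Encode one frame per camera against the common reference."""
        residuals: Dict[int, List[int]] = {}
        for camera, frame in frames.items():
            if self.reference is None:
                # No reference yet: send the frame as-is (intra-like).
                residuals[camera] = list(frame)
            else:
                residuals[camera] = [p - r for p, r in zip(frame, self.reference)]
            self.reconstructed[camera] = list(frame)
        # The selected camera's newly encoded frame becomes the next reference.
        self.reference = self.reconstructed.get(self.selected, self.reference)
        return residuals


encoder = CommonReferenceEncoder(selected=0)
first = encoder.encode_step({0: [10, 10], 1: [12, 9]})
second = encoder.encode_step({0: [11, 10], 1: [13, 9]})
encoder.switch_selected(1)
third = encoder.encode_step({0: [11, 11], 1: [13, 10]})
```

Because the non-selected camera's frames have already been encoded, a past encoded frame of that camera is available to serve as the new common reference as soon as the selection changes, in this sketch without first sending a standalone frame for the newly selected camera.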

(Appendix 10)
A moving image communication apparatus comprising:
a moving image encoding apparatus including: an encoding unit that encodes each of selected captured images, sequentially captured by a selected camera selected from among a plurality of shooting cameras, and non-selected captured images, captured by a non-selected camera other than the selected camera, among captured images obtained by shooting a photographing target with each of the plurality of shooting cameras, by an encoding method using inter-frame prediction, with a frame image obtained by encoding a past selected captured image by the encoding method used as a common reference image; a switching unit that, when the selected camera is switched to another shooting camera, switches the common reference image to a frame image obtained by encoding, by the encoding method, a past captured image captured by the other shooting camera; and an encoded data transmission unit that transmits the encoded data encoded by the encoding unit to a device on the communication partner side;
a decoding unit that receives and decodes encoded data transmitted from the device on the communication partner side; and
a display unit that displays a moving image based on the decoded data decoded by the decoding unit.

DESCRIPTION OF SYMBOLS
10 Moving image communication system
12A, 12B Moving image communication apparatus
14 Network
16A, 16B Selected camera position specifying unit
18A1, 18A2, 18B1, 18B2 Detection camera
20A, 20B Moving image encoding apparatus
22A, 22B Decoding apparatus
24A, 24B Display
30 Overall control unit
32 Input image capturing unit
34 Encoding unit
36 Stream output unit
38 Memory controller

Claims

A moving image encoding apparatus comprising:
an encoding unit that encodes each of selected captured images, sequentially captured by a selected camera selected from among a plurality of shooting cameras, and non-selected captured images, captured by a non-selected camera other than the selected camera, among captured images captured by the plurality of shooting cameras, by an encoding method using inter-frame prediction, with a frame image obtained by encoding a past selected captured image by the encoding method used as a common reference image;
a switching unit that, when the selected camera is switched to another shooting camera, switches the common reference image to a frame image obtained by encoding, by the encoding method, a past captured image captured by the other shooting camera; and
an encoded data transmission unit that transmits the encoded data encoded by the encoding unit to a device on the communication partner side.

The moving image encoding apparatus according to claim 1, wherein the non-selected camera is a shooting camera adjacent to the selected camera among the plurality of shooting cameras.

The moving image encoding apparatus according to claim 1 or 2, further comprising:
a detection camera for detecting the position of the photographing target;
a selected camera position specifying unit that specifies, as a communication partner selected camera position, the position of the shooting camera that shoots the photographing target on the communication partner side, from among a plurality of shooting cameras provided on the communication partner side, based on the position of the photographing target calculated from a captured image taken by the detection camera; and
a selected camera position transmitting unit that transmits the communication partner selected camera position to the device on the communication partner side.

The moving image encoding apparatus according to claim 1, wherein
the device on the communication partner side comprises:
a communication partner side detection camera for detecting the position of the photographing target on the communication partner side;
a communication partner side selected camera position specifying unit that specifies, as a selected camera position, the position of the shooting camera that shoots the photographing target from among the plurality of shooting cameras, based on the position of the photographing target on the communication partner side calculated from a captured image taken by the communication partner side detection camera; and
a communication partner side selected camera position transmitting unit that transmits the selected camera position to the moving image encoding apparatus, and
the selected camera is the shooting camera corresponding to the selected camera position transmitted from the communication partner side selected camera position transmitting unit.

A moving image encoding method comprising:
encoding each of selected captured images, sequentially captured by a selected camera selected from among a plurality of shooting cameras, and non-selected captured images, captured by a non-selected camera other than the selected camera, among captured images captured by the plurality of shooting cameras, by an encoding method using inter-frame prediction, with a frame image obtained by encoding a past selected captured image by the encoding method used as a common reference image;
when the selected camera is switched to another shooting camera, switching the common reference image to a frame image obtained by encoding, by the encoding method, a past captured image captured by the other shooting camera; and
transmitting the encoded data to a device on the communication partner side.

A moving image encoding program for causing a computer to execute processing comprising:
encoding each of selected captured images, sequentially captured by a selected camera selected from among a plurality of shooting cameras, and non-selected captured images, captured by a non-selected camera other than the selected camera, among captured images captured by the plurality of shooting cameras, by an encoding method using inter-frame prediction, with a frame image obtained by encoding a past selected captured image by the encoding method used as a common reference image;
when the selected camera is switched to another shooting camera, switching the common reference image to a frame image obtained by encoding, by the encoding method, a past captured image captured by the other shooting camera; and
transmitting the encoded data to a device on the communication partner side.

A moving image communication apparatus comprising:
a moving image encoding apparatus including: an encoding unit that encodes each of selected captured images, sequentially captured by a selected camera selected from among a plurality of shooting cameras, and non-selected captured images, captured by a non-selected camera other than the selected camera, among captured images captured by the plurality of shooting cameras, by an encoding method using inter-frame prediction, with a frame image obtained by encoding a past selected captured image by the encoding method used as a common reference image; a switching unit that, when the selected camera is switched to another shooting camera, switches the common reference image to a frame image obtained by encoding, by the encoding method, a past captured image captured by the other shooting camera; and an encoded data transmission unit that transmits the encoded data encoded by the encoding unit to a device on the communication partner side;
a decoding unit that receives and decodes encoded data transmitted from the device on the communication partner side; and
a display unit that displays a moving image based on the decoded data decoded by the decoding unit.