JP2024058941A

JP2024058941A - Control device, control method, and computer program

Info

Publication number: JP2024058941A
Application number: JP2022166369A
Authority: JP
Inventors: 航須▲崎▼
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2022-10-17
Filing date: 2022-10-17
Publication date: 2024-04-30

Abstract

[Problem] When the operator of a virtual camera wants to manually operate a camera parameter that is under automatic control, the operator may be unable to move the virtual camera as intended. [Solution] Change information for changing the position or the attitude of a virtual camera is acquired; it is determined whether the change information is information for changing the position of the virtual camera or information for changing its attitude; and, based on the determination result, the position and attitude of the virtual camera are changed so that a specific object is positioned on the optical axis of the virtual camera. [Selected Figure] FIG. 4

Description

The present disclosure relates to a control device, a control method, and a computer program for a virtual camera that generates a virtual viewpoint video.

In recent years, attention has been drawn to a technology in which multiple cameras are installed at different positions to capture images synchronously from multiple viewpoints, and the resulting multi-viewpoint images are used to generate not only images from the camera installation positions but also virtual viewpoint images from arbitrary viewpoints. When this technology is applied to sports broadcasts on television and the like, video content with a higher degree of freedom and a greater sense of presence than conventionally captured images can be generated.

In this case, the operator of the virtual camera must specify the position and orientation of the virtual camera so that the video suits the scene of the match, specifically the movements of the players and the ball. The virtual camera has three degrees of freedom each for translation and rotation, and its position and orientation can be operated freely to suit the subject. When shooting a scene in which the attitude of the virtual camera is changed so that it always faces a specific player or the ball, the operator must operate multiple degrees of freedom to keep the moving subject at a desired position within the angle of view. There is therefore a demand to simplify the movement operation, for example by manually operating only the position of the virtual camera while automatically controlling its attitude so that it always faces the subject.

Patent Document 1 discloses a technology in which the camera parameters to be automatically controlled and the camera parameters to be manually operated are set in advance, making it possible to keep a desired part of the subject within the angle of view of the virtual camera even in a scene in which the subject moves quickly and in a complex manner.

Patent Document 1: JP 2021-177351 A

However, because the automatically controlled and manually operated camera parameters are set in advance, if the camera parameter that the operator (user) wants to operate manually changes, the operator may be unable to move the virtual camera as intended.

The present disclosure has been made in view of the above problem, and aims to enable the operator of a virtual camera to capture the intended virtual viewpoint image with simple operations.

An information processing device according to one aspect of the present disclosure includes:
an acquisition means for acquiring change information for changing the position or the attitude of a virtual camera corresponding to a virtual viewpoint image generated based on a plurality of captured images captured by a plurality of imaging devices;
a setting means for setting a specific object;
a determination means for determining whether the change information is information for changing the position of the virtual camera or information for changing the attitude of the virtual camera; and
a changing means for changing the position and attitude of the virtual camera, based on the change information and the determination result of the determination means, so that the specific object is positioned on the optical axis of the virtual camera.

According to the present disclosure, the operator of the virtual camera can capture the intended virtual viewpoint image with simple operations.

FIG. 1 is a diagram showing the configuration of the entire image processing system.
FIG. 2 is a configuration diagram of the information processing device.
FIG. 3 is a diagram showing the world coordinate system and the camera parameters and movement operations of the virtual camera.
FIG. 4 is a diagram showing an example of the functional configuration of the image processing device.
FIG. 5 is a diagram showing an example of the correspondence between movement operations of the virtual camera and the camera parameters that are automatically controlled during gazing.
FIG. 6 is a diagram showing an example of correction of the camera parameters of the virtual camera according to Embodiment 1.
FIG. 7 is a flowchart showing the processing of the image processing device according to Embodiment 1.

Embodiments of the present disclosure are described below with reference to the drawings. However, the present disclosure is not limited to the following embodiments. In the drawings, identical members or elements are given identical reference numbers, and duplicate descriptions are omitted or simplified.

<Embodiment 1>
FIG. 1 is a diagram showing the overall configuration of an image processing system.

The image processing system 10 consists of an imaging system 101, an image processing device 102, and an information processing device 103. The image processing system 10 is capable of generating a virtual viewpoint image.

In the imaging system 101, multiple physical cameras are installed at different positions so as to surround the imaging area and capture images in time synchronization. The multiple images captured synchronously from multiple viewpoints are transmitted to the image processing device 102. The imaging area is, for example, a studio where imaging for generating virtual viewpoint images is performed, a stadium where a sports competition is held, or a stage where a performance is given.

The image processing device 102 generates a virtual viewpoint image as seen from a virtual camera, based on the multiple images captured synchronously from multiple viewpoints. The virtual camera is a hypothetical camera distinct from the imaging devices actually installed around the imaging area, and is a concept used to describe conveniently the virtual viewpoint involved in generating the virtual viewpoint image. That is, the virtual viewpoint image can be regarded as an image captured from a virtual viewpoint set in a virtual space associated with the imaging area. The position and orientation of the viewpoint in this virtual imaging can be expressed as the position and orientation of the virtual camera. In other words, the virtual viewpoint image simulates the image that a camera would capture if it existed at the position of the virtual viewpoint set in the space. In this embodiment, the transition of the virtual viewpoint over time is referred to as a virtual camera path. However, using the concept of a virtual camera is not essential to realizing the configuration of this embodiment. It is sufficient that at least information representing a specific position in the space and information representing an orientation are set, and that the virtual viewpoint image is generated according to the set information. The viewpoint of the virtual camera is expressed by camera parameters determined by the information processing device 103, described later. Unless otherwise noted, the term "image" in the following description covers both moving images and still images. That is, the image processing system 10 can process both still images and moving images.

The image processing device 102 extracts the subject as a foreground from the multiple captured images sent from the imaging system 101, and generates a three-dimensional model from the extracted foreground images. One method of extracting the foreground uses background difference information. In the simplest case, a scene with no foreground present is captured in advance as a background image, and the difference between an image containing the foreground and the background image is calculated. If the difference value at a pixel position is greater than a threshold, that pixel position is determined to be foreground. Various other foreground extraction methods exist, such as methods using image features of the subject or machine learning, and the present disclosure does not restrict the method used. The three-dimensional model may be generated as a 3D point cloud (a set of points having three-dimensional coordinates) by the visual hull (shape-from-silhouette) method, or may be generated using depth data obtained from stereo image processing. The present disclosure does not restrict the method of generating the three-dimensional model. A virtual viewpoint image as seen from the virtual camera is generated from this three-dimensional model and a specified background model. The background model may be data of a studio set or stadium field captured in advance, or data of a fictitious space generated by CG or the like. As a method of generating the virtual viewpoint image, model-based rendering (MBR), for example, can be used. This processing generates an image of the three-dimensional model as seen from the position and attitude of the virtual camera. The method of generating the virtual viewpoint image is not limited to this.
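The simplest background-difference extraction described above can be sketched as follows. This is an illustrative sketch only, not the implementation of the disclosure: the function name `extract_foreground_mask`, the default threshold of 30, and the use of the largest per-channel difference for colour images are all assumptions made for the example.

```python
import numpy as np

def extract_foreground_mask(frame, background, threshold=30):
    """Return a boolean mask that is True where a pixel is judged foreground.

    frame, background: arrays of identical shape, (H, W) or (H, W, C).
    threshold: hypothetical difference threshold (an assumption).
    """
    # Use a signed type so the subtraction cannot wrap around for uint8 input.
    diff = np.abs(frame.astype(np.int32) - background.astype(np.int32))
    # For colour images, take the largest per-channel difference at each pixel
    # (one possible choice; the source does not specify how channels combine).
    if diff.ndim == 3:
        diff = diff.max(axis=2)
    # A pixel is foreground only if the difference is greater than the threshold.
    return diff > threshold
```

The strict comparison mirrors the text, which treats a pixel as foreground when the difference is greater than the threshold.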

The information processing device 103 controls the virtual camera and determines the camera parameters that represent the viewpoint of the virtual camera. The camera parameters of the virtual camera include parameters for specifying the position, attitude, zoom, or time. The position of the virtual camera specified by the camera parameters is expressed as three-dimensional coordinates in a three-axis Cartesian coordinate system (X, Y, and Z axes) whose origin is a predetermined position. The attitude of the virtual camera specified by the camera parameters consists of parameters for the three axes of pan, tilt, and roll. The zoom of the virtual camera specified by the camera parameters is expressed by a single axis, for example the focal length. The camera parameters may include parameters that define other elements, and need not include all of the parameters described above.
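As a rough illustration of the parameter set listed above, the camera parameters could be held in a small record type. The class name `CameraParameters`, the field names, the units (metres, degrees, millimetres, seconds), and the default values are assumptions for this sketch; the source does not define a concrete data layout.

```python
from dataclasses import dataclass

@dataclass
class CameraParameters:
    """Hypothetical container for the virtual camera parameters."""
    # Position: three axes of a Cartesian coordinate system.
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0
    # Attitude: three rotation axes (degrees are an assumption).
    pan: float = 0.0
    tilt: float = 0.0
    roll: float = 0.0
    # Zoom: a single axis, here the focal length (default is an assumption).
    focal_length: float = 35.0
    # Time of the frame being rendered (seconds are an assumption).
    time: float = 0.0

    def position(self):
        """Return the three position parameters as a tuple."""
        return (self.x, self.y, self.z)

    def attitude(self):
        """Return the three attitude parameters as a tuple."""
        return (self.pan, self.tilt, self.roll)
```

Grouping position and attitude behind separate accessors makes it straightforward to tell later which group an operator input affects.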

The information processing device 103 transmits the determined camera parameters of the virtual camera to the image processing device 102. The image processing device 102 then generates a virtual viewpoint image based on the received camera parameters and transmits it to the information processing device 103.

FIG. 2(a) is a diagram illustrating the hardware configuration of the information processing device 103.

The CPU 201 controls the entire information processing device 103 using computer programs and data stored in the RAM 202 and the ROM 203. The information processing device 103 may have one or more pieces of dedicated hardware separate from the CPU 201, and the dedicated hardware may execute at least part of the processing otherwise performed by the CPU 201. Examples of dedicated hardware include an ASIC (application-specific integrated circuit), an FPGA (field-programmable gate array), and a DSP (digital signal processor).

The RAM 202 temporarily stores computer programs read from the ROM 203, intermediate calculation results, data supplied from outside via the communication unit 204, and the like.

The ROM 203 holds computer programs and data that do not need to be changed.

The communication unit 204 is used for communication between the information processing device 103 and external devices. For example, if the information processing device 103 has a function for wired communication with an external device, the communication unit 204 includes a communication means such as Ethernet or USB. If the information processing device 103 has a function for wireless communication with an external device, the communication unit 204 includes an antenna.

The input/output unit 205 has multiple input units for controlling the virtual camera and multiple display units for displaying the state of the virtual camera and the like.

The movement control unit 206 controls the movement of the virtual camera and determines the camera parameters based on what the operator inputs to the input/output unit 205. In this embodiment the movement control unit 206 is provided in the information processing device 103, but this is not a limitation; it may instead be provided in the image processing device 102. In that case, the content input to the input/output unit 205 is transmitted to the image processing device 102 as is.

FIG. 2(b) is a diagram illustrating the input/output unit 205.

The display unit 211 is composed of, for example, a liquid crystal display or LEDs, and displays a GUI (graphical user interface) with which the user operates the information processing device 103, as well as images including the virtual viewpoint image generated by the image processing device 102.

The input/output unit 205 has joysticks 212a and 212b, seesaw switches 213a and 213b, and a button group 214. The operator operates these as input units to change the camera parameters of the virtual camera. The joysticks 212a and 212b each provide three degrees of freedom of camera parameters. In this embodiment, the joystick 212a operates the X-axis, Y-axis, and Z-axis camera parameters of the virtual camera, and the joystick 212b operates the pan, tilt, and roll camera parameters of the virtual camera. Tilting the seesaw switches 213a and 213b to the plus or minus side changes the focal length of the virtual camera and the distance to the point of interest within a predetermined focal length range. Here, the point of interest is a point on the optical axis at a fixed distance from the virtual camera. Alternatively, the position of an object located on the optical axis of the virtual camera may be set as the point of interest, or the intersection of the optical axis of the virtual camera with a plane at a predetermined height above the ground (for example, 1.5 m) may be set as the point of interest. The gaze area is an area represented by a circle centered on the gaze point with a predetermined distance as its radius. In this embodiment, the radius is set to 2 m around the gaze point. The gaze area is not limited to this, and may be a polygonal or spherical area centered on the gaze point. The input/output unit 205 shown in FIG. 2 is merely an example, and the present disclosure does not restrict the configuration of the input/output unit 205.
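The two geometric definitions of the point of interest given above, together with the circular gaze area, can be sketched as follows. The function names are hypothetical, the world z axis is assumed to be vertical, and the gaze-area circle is assumed to lie in the horizontal plane; none of these details is fixed by the source.

```python
import numpy as np

def point_at_distance(position, direction, distance):
    """Point on the optical axis at a fixed distance from the camera."""
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)  # normalise the optical-axis direction
    return np.asarray(position, dtype=float) + distance * d

def point_on_height_plane(position, direction, height=1.5):
    """Intersection of the optical axis with the plane z = height.

    Returns None when the axis is parallel to the plane or the plane
    lies behind the camera.
    """
    p = np.asarray(position, dtype=float)
    d = np.asarray(direction, dtype=float)
    if abs(d[2]) < 1e-9:
        return None
    t = (height - p[2]) / d[2]
    if t <= 0:
        return None
    return p + t * d

def in_gaze_area(point, gaze_point, radius=2.0):
    """True if point lies inside the circular gaze area (horizontal distance)."""
    delta = np.asarray(point, dtype=float)[:2] - np.asarray(gaze_point, dtype=float)[:2]
    return bool(np.linalg.norm(delta) <= radius)
```

The defaults of 1.5 m and 2 m follow the example values in the text; a polygonal or spherical gaze area would need a different membership test.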

FIG. 3(a) is a diagram showing the world coordinate system (x, y, z) in the imaging space of the virtual camera.

The world coordinate system is used to express the camera parameters of the virtual camera and the subjects. A subject is a tangible object present in the imaging space, such as the field 301, the ball 302, or a player 303. The origin (0, 0, 0) of the world coordinate system is the center of the field 301. The x axis lies along the long side of the field 301, the y axis along the short side of the field 301, and the z axis is vertical with respect to the field 301. The method of setting the world coordinate system is not limited to this.

FIG. 3(b) is a diagram showing the camera parameters of the virtual camera.

The camera parameters of the virtual camera are expressed in a three-axis Cartesian coordinate system (X, Y, Z) consisting of the X axis 312, the Y axis 313, and the Z axis 314. The Z axis 314 is always perpendicular to the field 301, regardless of the attitude of the virtual camera 311. Consequently, the X axis 312 and the Y axis 313 are always parallel to the field 301, regardless of the attitude of the camera, although their directions change with the attitude of the virtual camera 311. The Y axis 313 is the direction obtained by projecting the optical axis of the virtual camera 311 onto a plane parallel to the field 301, and the X axis 312 is the direction orthogonal to the optical axis of the virtual camera 311. Rotational movements about the X axis 312, the Y axis 313, and the Z axis 314 are the tilt 315, the roll 316, and the pan 317, respectively. The method of setting the camera parameters is not limited to this.
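The construction of the camera axes described above can be written out as a short sketch. The function name `camera_axes` is hypothetical, the world z axis is assumed to be the field normal, and the right-handed choice X = Y × Z is an assumption, since the source does not state the handedness.

```python
import numpy as np

def camera_axes(optical_axis):
    """Return (x_axis, y_axis, z_axis) of the camera operation frame.

    Z is always the field normal; Y is the optical axis projected onto
    the horizontal plane; X is horizontal and orthogonal to Y.
    """
    a = np.asarray(optical_axis, dtype=float)
    z_axis = np.array([0.0, 0.0, 1.0])
    # Drop the vertical component to project onto the horizontal plane.
    horizontal = np.array([a[0], a[1], 0.0])
    norm = np.linalg.norm(horizontal)
    if norm < 1e-9:
        raise ValueError("optical axis is vertical; the Y axis is undefined")
    y_axis = horizontal / norm
    # Right-handed frame (assumption): X = Y x Z.
    x_axis = np.cross(y_axis, z_axis)
    return x_axis, y_axis, z_axis
```

Because X is horizontal and orthogonal to the projected axis, it is also orthogonal to the optical axis itself, which matches the description of the X axis 312.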

By combining the camera parameters corresponding to the three axes X 312, Y 313, and Z 314 with the camera parameters corresponding to the three axes tilt 315, roll 316, and pan 317, the virtual camera can move and change its attitude freely in the three-dimensional space of the subject (the field). From among these, combinations that are easy for the user to operate are provided as movement operations of the virtual camera. In this embodiment, an operation on the camera parameters corresponding to the X axis 312, Y axis 313, and Z axis 314 is called a position operation, and an operation on the camera parameters corresponding to the tilt 315, roll 316, and pan 317 is called a rotation operation. Besides the position and rotation operations, the movement operations of the virtual camera include enlargement and reduction, translation together with the point of interest, and horizontal and vertical rotation about the point of interest.

Enlargement and reduction are operations that enlarge or reduce the subject in the imaging area of the virtual camera. They are realized by moving the virtual camera along the optical axis: moving forward enlarges the subject and moving backward reduces it. Enlargement and reduction are not limited to this method, and a change in the focal length of the virtual camera or the like may be used instead.

Translation together with the point of interest is, as shown in FIG. 3(c), an operation that moves the position of the point of interest 321 of the virtual camera 311 without changing the attitude of the virtual camera 311. The point of interest lies on the optical axis of the virtual camera. Therefore, when the position of the virtual camera 311 is moved without changing its attitude, the point of interest moves in parallel with the virtual camera 311. Consequently, the trajectory of the virtual camera 311 and the trajectory of the point of interest 321 are identical when the position of the virtual camera 311 is changed without changing its attitude.

Horizontal rotation about the point of interest is, as shown in FIG. 3(d), an operation that rotates the virtual camera 311 about an axis 324 that passes through the point of interest 321 and is parallel to the Z axis 314. In other words, it is an operation that moves the camera along a circle centered on the point of interest 321. The same rephrasing applies to rotational movement in any direction, so it is omitted below.

Vertical rotation about the point of interest is, as shown in FIG. 3(e), an operation that rotates the virtual camera 311 about an axis 326 that passes through the point of interest 321 and is parallel to the X axis 312. In horizontal and vertical rotation about the point of interest, the attitude of the virtual camera changes so that it always faces the point of interest 321, and the coordinates of the point of interest 321 do not change. The radius of rotation is equal to the distance between the virtual camera 311 and the point of interest 321.

By combining the horizontal rotation 323 and the vertical rotation 325 about the point of interest 321, a movement operation of the virtual camera is realized that allows the subject to be viewed from any angle through 360 degrees without changing the coordinates of the point of interest 321. The movement operations of the virtual camera are not limited to these; any operation that can be realized by combining movement and attitude change of the virtual camera is acceptable.
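The translation together with the point of interest and the rotations about it can be illustrated with a short sketch. The helper names are hypothetical, and the use of Rodrigues' rotation formula is a choice made for this example rather than something stated in the source. Passing the vertical direction as `axis` gives the horizontal rotation, and passing the camera X axis gives the vertical rotation.

```python
import numpy as np

def translate_with_point_of_interest(camera_pos, point_of_interest, delta):
    """Move camera and point of interest by the same offset (attitude unchanged)."""
    d = np.asarray(delta, dtype=float)
    return (np.asarray(camera_pos, dtype=float) + d,
            np.asarray(point_of_interest, dtype=float) + d)

def rotate_about_axis(v, axis, angle_rad):
    """Rotate vector v about the given axis using Rodrigues' formula."""
    k = axis / np.linalg.norm(axis)
    return (v * np.cos(angle_rad)
            + np.cross(k, v) * np.sin(angle_rad)
            + k * np.dot(k, v) * (1.0 - np.cos(angle_rad)))

def orbit(camera_pos, point_of_interest, axis, angle_rad):
    """Rotate the camera about an axis through the point of interest.

    Returns the new camera position and the unit optical-axis direction,
    which always points at the (unchanged) point of interest.
    """
    cam = np.asarray(camera_pos, dtype=float)
    poi = np.asarray(point_of_interest, dtype=float)
    offset = rotate_about_axis(cam - poi, np.asarray(axis, dtype=float), angle_rad)
    new_pos = poi + offset
    look = poi - new_pos
    return new_pos, look / np.linalg.norm(look)
```

Since a rotation preserves the length of the offset vector, the radius of rotation stays equal to the camera-to-point distance, as the text requires.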

FIG. 4 is a block diagram showing an example of the functional configuration of the image processing device in Embodiment 1. The image processing device 102 receives the multiple synchronously captured images from the imaging system 101 and the camera parameters from the information processing device 103, and generates a virtual viewpoint image.

The functions of the image processing device 102 are described in order below.

The virtual camera information acquisition unit 401 receives, from the input/output unit 205 of the information processing device 103, the content that the operator has input to the input units, and acquires the range of the gaze area. It also acquires camera parameters, such as the position and attitude of the virtual camera in each frame, from the movement control unit 206 of the information processing device 103. These camera parameters are transmitted to the gaze target setting unit 403 and the control determination unit 404.

The subject model generation unit 402 generates model information of the subject based on the multiple images received from the imaging system 101 and on parameters such as the position and attitude of each imaging device. The model information includes data representing the three-dimensional shape of the model, position information of the model, and the like. The position of the model is approximated by the centroid of the bounding box enclosing the model, although the method of approximating the model position is not limited to this. The generated model is transmitted to the virtual viewpoint image generation unit 406.
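The bounding-box approximation of the model position can be sketched in a few lines. The sketch assumes the model is available as an N×3 point cloud and that the bounding box is axis-aligned; the source specifies neither, and the function name is hypothetical.

```python
import numpy as np

def bounding_box_center(points):
    """Centroid of the axis-aligned bounding box enclosing the points."""
    pts = np.asarray(points, dtype=float)
    # The box spans the per-axis minimum and maximum; its centroid is the midpoint.
    return (pts.min(axis=0) + pts.max(axis=0)) / 2.0
```

Note that this is generally different from the mean of the points: the box centroid depends only on the extreme coordinates, so it is insensitive to how densely different parts of the model are sampled.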

The gaze target setting unit 403 receives the model information from the subject model generation unit 402, sets as the gaze target (specific object) the subject corresponding to the operator input received from the input/output unit 205, and outputs that subject's model information. For example, suppose a keyboard is additionally provided in the input/output unit 205 and the operator presses the right-direction button on the keyboard. In that case, the subject model that is located to the right of the currently selected subject model as seen from the virtual camera, and whose distance in the X-axis direction is the shortest, is set as the gaze target. The model information of the set subject is transmitted to the gaze target setting unit 403. If there is no input from the input units specifying a gaze target, the subject closest to the point of interest is selected as the gaze target. The gaze target is not necessarily a single subject; multiple subjects may be selected as the gaze target. In that case, the position of the group of gaze target models is approximated by the centroid of the bounding box enclosing the group, although the means of acquiring the position information is not limited to this. The model information selected as the gaze target is transmitted to the control determination unit 404. The gaze target setting unit 403 may be provided in a device other than the image processing device 102. For example, it may be provided in the information processing device 103, in which case the gaze target is set by the user making an input that selects the gaze target.
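The two selection rules described above (the default choice nearest the point of interest, and the "right" key choice) can be sketched as follows. The function names and the dictionary of subject positions are hypothetical. The sketch also assumes that the camera X axis points to the right of the view and that the current target is kept when nothing lies to its right; neither detail is stated in the source.

```python
import numpy as np

def nearest_to_point_of_interest(positions, point_of_interest):
    """Default rule: pick the subject closest to the point of interest.

    positions: dict mapping a subject name to its (x, y, z) position.
    """
    poi = np.asarray(point_of_interest, dtype=float)
    return min(
        positions,
        key=lambda name: np.linalg.norm(np.asarray(positions[name], dtype=float) - poi),
    )

def next_target_to_right(positions, current, camera_x_axis):
    """'Right' key rule: nearest subject along +X of the camera frame."""
    x_axis = np.asarray(camera_x_axis, dtype=float)
    x_axis = x_axis / np.linalg.norm(x_axis)
    cur = np.asarray(positions[current], dtype=float)
    best, best_dx = current, None
    for name, pos in positions.items():
        if name == current:
            continue
        # Signed distance along the camera X axis; positive means "to the right".
        dx = float(np.dot(np.asarray(pos, dtype=float) - cur, x_axis))
        if dx > 0 and (best_dx is None or dx < best_dx):
            best, best_dx = name, dx
    return best  # unchanged when no subject lies to the right (assumption)
```

For a group of targets, the positions passed in could be replaced by the centroid of the bounding box enclosing the group, as the text suggests.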

制御判定部４０４は、注視対象設定部４０３から注視対象のモデル情報を、仮想カメラ情報取得部４０１から仮想カメラのカメラパラメータを受信し、注視対象のモデルを注視するように仮想カメラの位置または姿勢に自動制御が有効か否かを判定する。ここで、注視とは、仮想カメラの撮影画像の中心位置と注視対象を囲うバウンディングボックスの重心位置を一致していること、または、仮想カメラ情報取得部４０１から受信した注視領域内に注視対象のモデルが位置していることとする。自動制御が有効か否かの判定は、操作者による入力部への入力内容や、注視対象を含む被写体群の位置と仮想カメラの位置および姿勢、注目点の関係に基づいて行われる。本実施形態では、注視対象設定部４０３で設定された注視対象を注視できておらず、仮想カメラの画角内に注視対象が存在しない場合に自動制御が有効と判定する。自動制御が有効か否かの判定結果および自動制御の対象となるカメラパラメータを示す情報は、位置姿勢制御部４０５に送信される。なお、自動制御が有効と判定された場合、次のフレームから自動制御を解除するか否かの判定を行う。自動制御を解除するか否かの判定は、操作者による入力部への入力内容に基づいて行われる。具体的には、自動制御の対象となるカメラパラメータに対する入力があれば自動制御を解除する。つまり、操縦者が自動制御を解除したい場合、自動制御されているパラメータを変更する操作を行えばよい。自動制御を解除するか否かの判定結果は、位置姿勢制御部４０５に送信される。なお、自動制御はカメラパラメータを修正する処理であるため、制御判定のことを修正判定や変更判定と呼称してもよい。 The control determination unit 404 receives model information of the gaze target from the gaze target setting unit 403 and camera parameters of the virtual camera from the virtual camera information acquisition unit 401, and determines whether automatic control is enabled for the position or posture of the virtual camera so as to gaze at the model of the gaze target. Here, gaze means that the center position of the image captured by the virtual camera coincides with the center of gravity position of the bounding box surrounding the gaze target, or the model of the gaze target is located within the gaze area received from the virtual camera information acquisition unit 401. The determination of whether automatic control is enabled is performed based on the input contents to the input unit by the operator, the relationship between the position of the group of subjects including the gaze target, the position and posture of the virtual camera, and the attention point. In this embodiment, if the gaze target set by the gaze target setting unit 403 cannot be gazed at and the gaze target does not exist within the angle of view of the virtual camera, the automatic control is determined to be enabled. 
The determination result of whether automatic control is enabled and information indicating the camera parameters to be subject to automatic control are transmitted to the position and orientation control unit 405. If automatic control is determined to be enabled, then starting from the next frame a determination is made as to whether or not to cancel automatic control. The determination as to whether or not to cancel automatic control is made based on the input by the operator to the input unit. Specifically, automatic control is canceled if there is an input for a camera parameter that is the subject of automatic control. In other words, if the operator wishes to cancel automatic control, the operator can perform an operation to change the automatically controlled parameter. The result of the determination as to whether or not to cancel automatic control is sent to the position and orientation control unit 405. Note that, because automatic control is a process of modifying camera parameters, the control determination may also be called a modification determination or a change determination.
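The enabling condition as worded in this paragraph can be sketched as follows. This is only an illustration under stated assumptions: the gaze target is assumed to have already been projected to normalized image coordinates `(u, v)` with the image center at the origin and the image edges at ±1, and the gaze area is assumed to be a centered rectangle given by its half extents. None of these names or conventions come from the disclosure.

```python
def is_gazing(target_uv, gaze_half_extent):
    """True if the projected gaze target lies inside the centered gaze area."""
    u, v = target_uv
    return abs(u) <= gaze_half_extent[0] and abs(v) <= gaze_half_extent[1]


def in_angle_of_view(target_uv):
    """True if the projected gaze target lies inside the image (edges at +/-1)."""
    u, v = target_uv
    return abs(u) <= 1.0 and abs(v) <= 1.0


def should_enable_auto_control(target_uv, gaze_half_extent):
    """Enable when the target is neither gazed at nor within the angle of view."""
    return not is_gazing(target_uv, gaze_half_extent) and not in_angle_of_view(target_uv)
```

For example, `should_enable_auto_control((1.3, 0.0), (0.4, 0.4))` evaluates to `True` because the target has left the image, whereas a target at `(0.7, 0.0)` is outside the gaze area but still within the angle of view and does not trigger automatic control under this wording.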

位置姿勢制御部４０５は、制御判定部４０４により自動制御が有効と判定された場合に、注視対象として選択されているモデルに対して注視するようにカメラパラメータの一部を自動制御し、仮想カメラのカメラパラメータを修正する。このとき、操作者により入力部に入力された移動操作及び注視対象のモデルの移動に応じて、自動制御されるカメラパラメータが決定される。言い換えれば、自動制御が有効になったときの操作者の入力により、自動制御されるカメラパラメータが決定される。修正後のカメラパラメータは、仮想視点画像生成部４０６に送信される。有効と判定された自動制御は、解除の判定がされるまで継続される。自動制御が有効であって、制御判定部４０４より自動制御を解除する情報を受信した場合、自動制御を解除する。 When the control determination unit 404 determines that automatic control is enabled, the position and orientation control unit 405 automatically controls some of the camera parameters to focus on the model selected as the gaze target, and modifies the camera parameters of the virtual camera. At this time, the camera parameters to be automatically controlled are determined according to the movement operation input to the input unit by the operator and the movement of the gaze target model. In other words, the camera parameters to be automatically controlled are determined by the input of the operator when automatic control is enabled. The modified camera parameters are sent to the virtual viewpoint image generation unit 406. Automatic control that is determined to be enabled continues until it is determined to be released. When automatic control is enabled and information to release the automatic control is received from the control determination unit 404, the automatic control is released.

仮想視点画像生成部４０６は、被写体モデル生成部４０２から被写体のモデル情報、位置姿勢制御部４０５から仮想カメラのカメラパラメータを取得し、仮想カメラの位置および姿勢から見た被写体モデルのレンダリングを行い、仮想視点画像を生成する。生成された仮想視点画像は情報処理装置１０３に送信され、入出力部２０５の表示部で表示される。 The virtual viewpoint image generation unit 406 acquires the subject model information from the subject model generation unit 402 and the camera parameters of the virtual camera from the position and orientation control unit 405, renders the subject model as viewed from the position and orientation of the virtual camera, and generates a virtual viewpoint image. The generated virtual viewpoint image is transmitted to the information processing device 103 and displayed on the display unit of the input/output unit 205.

図５（ａ）は、仮想カメラの操作内容と注視時に自動制御されるカメラパラメータの対応例を示す図である。注視対象のモデルに対して注視するように仮想カメラの位置および姿勢を自動制御するとき、入力部に入力された、操作者による仮想カメラの操作内容によって、注視するように自動制御を受けるカメラパラメータが決定される。ただし、自動制御を受けるカメラパラメータは、入力部に入力されている移動操作のカメラパラメータとは異なるカメラパラメータである。また、操作者による仮想カメラの移動操作と注視時に自動制御されるカメラパラメータは、一対一で対応するとは限らず、一つの仮想カメラの移動操作に対し、自動制御されるカメラパラメータの候補が二つ以上存在する場合がある。この場合、注視するように仮想カメラを移動させたときに、カメラパラメータの変化量が小さくなる移動のカメラパラメータに自動制御を行う。Ｘ軸周りの回転また、複数の移動操作が入力部に入力された場合、自動制御されるカメラパラメータの総数が最小となるように自動制御されるカメラパラメータを決定する。なお、カメラパラメータの候補から自動制御されるカメラパラメータを決定する方法はこれに限定しない。 Figure 5(a) is a diagram showing an example of the correspondence between the operation contents of the virtual camera and the camera parameters that are automatically controlled when gazing. When the position and attitude of the virtual camera are automatically controlled to gaze at the model of the gaze target, the camera parameters that are automatically controlled are determined according to the operation contents of the virtual camera that the operator has input to the input unit. However, the camera parameters that are automatically controlled are different from the camera parameters of the movement operation being input to the input unit. In addition, the movement operation of the virtual camera by the operator and the camera parameters that are automatically controlled when gazing are not necessarily in one-to-one correspondence, and there may be two or more candidates for the camera parameters that are automatically controlled for one movement operation of the virtual camera. In this case, automatic control is applied to the candidate whose camera parameter changes by the smaller amount when the virtual camera is moved so as to gaze at the target. Furthermore, when multiple movement operations are input to the input unit, the camera parameters that are automatically controlled are determined so that the total number of camera parameters that are automatically controlled is minimized. 
Note that the method of determining the camera parameters to be automatically controlled from the candidates of the camera parameters is not limited to this.
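The rule of minimizing the total number of automatically controlled camera parameters can be illustrated with a small exhaustive search. The sketch below is an assumption-laden example rather than the method of the disclosure: the candidate table contains only the two correspondences named in the text (X-axis translation to pan, Y-axis translation to tilt or pan), the operation names are hypothetical, and ties between equally small sets are broken by sort order instead of by the amount of parameter change.

```python
from itertools import combinations

# Hypothetical candidate table: operation -> parameters that may be auto-controlled.
CANDIDATES = {
    "translate_x": {"pan"},
    "translate_y": {"tilt", "pan"},
}


def choose_auto_params(operations, candidates=CANDIDATES):
    """Smallest parameter set containing at least one candidate for every operation."""
    sets = [candidates[op] for op in operations]
    pool = sorted(set().union(*sets))
    # Try sets of increasing size so the first hit has the minimum number of parameters.
    for size in range(1, len(pool) + 1):
        for combo in combinations(pool, size):
            if all(set(combo) & s for s in sets):
                return set(combo)
    return set()
```

With simultaneous X-axis and Y-axis translation, `choose_auto_params(["translate_x", "translate_y"])` yields `{"pan"}`, since pan alone serves both operations.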

本実施形態では、操作内容がＸ軸方向の併進移動であった場合、パンが自動制御される。操作内容が併進移動の場合、操作者は仮想カメラの位置を指定する操作を行うため、仮想カメラの位置を把握できる。しかしながら、操作内容が回転移動の場合、仮想カメラの位置に対応するカメラパラメータが自動制御されるため、仮想カメラの位置が実際にどこに移動するのか操作者が把握できない虞がある。そのため、本実施形態では、操作内容が回転移動の場合、注視対象を中心とし、注視対象と仮想カメラとの距離を半径とした円上に仮想カメラの位置を移動させるように自動制御を行う。このようにすることで、操作内容が回転移動であった場合にも、仮想カメラの位置がどこに動くのか容易に想像できる。なお、これに限定されず、操作内容が回転移動の場合、仮想カメラの位置が併進移動するように自動制御してもよい。また、コントローラに別途ボタンを設けておき、ボタンを押下することにより、自動制御の内容を切り替えるようにしてもよい。例えば、自動制御の内容が、注視対象を中心とした回転移動であった場合、ボタンを押下することにより、併進移動に切り替えることができる。 In this embodiment, if the operation is translation in the X-axis direction, the pan is automatically controlled. If the operation is translation, the operator performs an operation to specify the position of the virtual camera, and thus the position of the virtual camera can be grasped. However, if the operation is rotation, the camera parameters corresponding to the position of the virtual camera are automatically controlled, and there is a risk that the operator will not be able to grasp where the position of the virtual camera will actually move. Therefore, in this embodiment, if the operation is rotation, automatic control is performed to move the position of the virtual camera on a circle with the gaze target as the center and the distance between the gaze target and the virtual camera as the radius. In this way, even if the operation is rotation, it is easy to imagine where the position of the virtual camera will move. Note that this is not limited to this, and if the operation is rotation, the position of the virtual camera may be automatically controlled to move in a translational manner. In addition, a separate button may be provided on the controller, and the content of the automatic control may be switched by pressing the button. For example, if the content of the automatic control is rotation around the gaze target, pressing the button can switch to translation.
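The circular movement described above can be sketched in two dimensions. The following example assumes a ground-plane view in which the pan angle is measured from the +X axis and the optical axis points along `(cos(pan), sin(pan))`; these conventions and the function name are illustrative assumptions. Given the pan angle input by the operator, it returns the camera position on the circle centered on the gaze target, so that the optical axis continues to pass through the target.

```python
import math


def position_on_circle(target_xy, camera_xy, pan_rad):
    """Camera position on the circle around the target for the given pan angle.

    The radius is the current camera-to-target distance, so only the bearing changes.
    """
    radius = math.dist(target_xy, camera_xy)
    return (
        target_xy[0] - radius * math.cos(pan_rad),
        target_xy[1] - radius * math.sin(pan_rad),
    )
```

For a target at the origin and a camera at `(-5, 0)`, a pan angle of 0 leaves the camera where it is, and a pan angle of 90 degrees places it at approximately `(0, -5)`, still at distance 5 from the target.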

図５（ｂ）に仮想カメラの併進移動・回転移動の分類を示す。本実施形態において、仮想カメラの併進移動とは、仮想カメラの位置のみを変更し、姿勢を変更しない移動のことである。言い換えれば、Ｘ軸、Ｙ軸、Ｚ軸のパラメータを変更し、パン、チルト、ロールのパラメータを変更しないことである。姿勢を変更しないため、仮想カメラの位置と注目点の位置が並行して移動するため、併進移動と定義する。なお、仮想カメラの回転移動とは、仮想カメラの姿勢のみを変更し、位置を変更しない動作のことである。言い換えれば、パン、チルト、ロールのパラメータを変更し、Ｘ軸、Ｙ軸、Ｚ軸のパラメータを変更しないことである。操作者による仮想カメラの移動操作と注視時に自動制御されるカメラパラメータの対応は図５（ａ）に示す対応に限定されず、一方が併進移動、もう一方が回転移動となるような対応となればよい。つまり、操作者による移動操作と自動制御されるカメラパラメータがともに併進移動（または回転移動）となることはない。なお、拡大・縮小は、光軸方向への併進移動と考え、併進移動に含める。注目点を中心にした横回転は、回転移動に含める。 Figure 5(b) shows the classification of translational and rotational movements of the virtual camera. In this embodiment, translational movement of the virtual camera refers to movement in which only the position of the virtual camera is changed, without changing the attitude. In other words, it is to change the parameters of the X-axis, Y-axis, and Z-axis, and not to change the parameters of the pan, tilt, and roll. Since the attitude is not changed, the position of the virtual camera and the position of the point of interest move in parallel, so it is defined as translational movement. Note that rotational movement of the virtual camera refers to an operation in which only the attitude of the virtual camera is changed, without changing the position. In other words, it is to change the parameters of the pan, tilt, and roll, and not to change the parameters of the X-axis, Y-axis, and Z-axis. The correspondence between the movement operation of the virtual camera by the operator and the camera parameters that are automatically controlled when gazing is not limited to the correspondence shown in Figure 5(a), and it is sufficient that one is translational movement and the other is rotational movement. In other words, the movement operation by the operator and the camera parameters that are automatically controlled are not both translational movement (or rotational movement). 
Note that enlargement and reduction are considered as translational movement in the optical axis direction, and are included in translational movement. Horizontal rotation around a point of interest is included in rotational movement.
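The classification above, together with the rule that the operator's movement and the automatically controlled movement are never of the same kind, can be written as a simple lookup. The operation names in this sketch are hypothetical labels chosen for illustration; only the grouping (zoom counted as translation, horizontal rotation around the point of interest counted as rotation) follows the text.

```python
# Hypothetical operation names mapped to the two kinds of movement in Figure 5(b).
OPERATION_KIND = {
    "translate_x": "translation",
    "translate_y": "translation",
    "translate_z": "translation",
    "zoom": "translation",           # treated as translation along the optical axis
    "pan": "rotation",
    "tilt": "rotation",
    "roll": "rotation",
    "orbit_horizontal": "rotation",  # horizontal rotation around the point of interest
}


def auto_control_kind(operation):
    """Kind of movement left to automatic control: the opposite of the operator's."""
    return "rotation" if OPERATION_KIND[operation] == "translation" else "translation"
```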

一方で、仮想カメラの移動が停止している状態であっても注視対象のモデルが移動した場合、仮想カメラと注視対象の相対的な位置関係は変化する。そのため、仮想カメラが停止している状態で注視対象のモデルが移動した場合の自動制御の対象となるカメラパラメータを予め設定する。具体的には、仮想カメラの位置または姿勢のいずれかに対応するカメラパラメータが設定される。本実施形態では、仮想カメラの位置に対応するカメラパラメータを設定する。どちらの制御を行うかは操縦者が設定でき、コントローラに仮想カメラが停止している状態での自動制御対象を併進移動と回転移動で切り替えるボタンを設ける（不図示）。 On the other hand, if the gaze target model moves while the virtual camera is stopped, the relative positional relationship between the virtual camera and the gaze target changes. For this reason, the camera parameters that are subject to automatic control when the gaze target model moves while the virtual camera is stopped are set in advance. Specifically, camera parameters corresponding to either the position or the attitude of the virtual camera are set. In this embodiment, camera parameters corresponding to the position of the virtual camera are set. The operator can choose which control is performed, and the controller is provided with a button (not shown) that switches the target of automatic control between translational movement and rotational movement while the virtual camera is stopped.

図６は、実施形態１における仮想カメラのカメラパラメータの修正例を示す図である。制御判定部４０４により自動制御を行うと判定された場合、位置姿勢制御部４０５は、操作者による入力部の入力内容や注視対象を含む被写体群の位置、仮想カメラの位置および姿勢、注目点の関係に基づき、カメラパラメータを修正する。自動制御有効中は、操作者により入力部に入力された移動操作及び注視対象のモデルの移動に応じてカメラパラメータの一部が自動制御される。なお、自動制御されているカメラパラメータに対する入力があった場合、自動制御を解除する。 Figure 6 is a diagram showing an example of how camera parameters of a virtual camera are modified in embodiment 1. When the control determination unit 404 determines that automatic control should be performed, the position and orientation control unit 405 modifies the camera parameters based on the input content by the operator via the input unit, the position of the group of subjects including the gaze target, the position and orientation of the virtual camera, and the relationship between the attention points. When automatic control is active, some of the camera parameters are automatically controlled in response to the movement operation input by the operator via the input unit and the movement of the model of the gaze target. Note that when there is an input for the automatically controlled camera parameters, the automatic control is released.

図６（ａ）は、注視対象のモデル６０２と仮想カメラ６０１の注目点６０３の位置関係に基づくカメラパラメータの修正例を示す図である。操作者により注目点６０３を中心にした横回転の移動操作６０４及びそれらを組み合わせた移動操作後に、注視対象の移動により注視対象のモデル６０２の位置が注目点６０３から遠ざかることがある。このとき、注視対象のモデル６０２が仮想カメラ６０１の撮影画像内の注視領域６０４の内から外へ移動した場合に、制御判定部４０４により仮想カメラに自動制御を行うと判定される。自動制御が有効となると、注視対象のモデル情報と、自動制御が有効になったフレームでの仮想カメラのカメラパラメータを参照し、自動制御を行うパラメータを決定する。図６（ａ）では、操縦者による注目点６０３を中心にした横回転が終了し、操縦者が仮想カメラを操縦していないときに、注視対象６０２が注視領域６０４の外に出たとする。このとき、注目点６０３の位置座標が注視対象のモデル６０２の位置座標と一致するように仮想カメラが併進移動６０５する。なお、注目点６０３と注視対象のモデル６０２の位置座標を一致させる方法はこれに限定しない。自動制御有効中は、注目点６０３と注視対象のモデル６０２の位置座標が常に一致するように、フレームごとに注目点６０３の位置座標が更新され、仮想カメラが併進移動６０５する。このとき、自動制御が有効化された直後の注目点と仮想カメラの距離の値を記憶しておき、自動制御有効中は注目点６０３と仮想カメラ６０１の距離が常に記憶した値となるように仮想カメラが併進移動６０５する。ただし、操作者が拡大または縮小の操作を行った場合、変更後の値となるように仮想カメラを併進移動６０５させる。 Figure 6(a) is a diagram showing an example of camera parameter correction based on the positional relationship between the model 602 of the gaze target and the attention point 603 of the virtual camera 601. After the operator performs a horizontal rotation movement operation 604 around the attention point 603 and a movement operation combining these, the position of the model 602 of the gaze target may move away from the attention point 603 due to the movement of the gaze target. In this case, when the model 602 of the gaze target moves from inside to outside the gaze area 604 in the image captured by the virtual camera 601, the control determination unit 404 determines that automatic control is to be performed on the virtual camera. When automatic control is enabled, the model information of the gaze target and the camera parameters of the virtual camera in the frame in which automatic control is enabled are referenced to determine the parameters for automatic control. In Figure 6(a), it is assumed that the gaze target 602 moves outside the gaze area 604 when the operator finishes the horizontal rotation around the attention point 603 and the operator is not operating the virtual camera. 
At this time, the virtual camera performs translational movement 605 so that the position coordinates of the attention point 603 match those of the gaze target model 602. Note that the method of matching the position coordinates of the attention point 603 and the gaze target model 602 is not limited to this. While automatic control is active, the position coordinates of the attention point 603 are updated for each frame and the virtual camera performs translational movement 605 so that the position coordinates of the attention point 603 and the gaze target model 602 always match. At this time, the distance between the attention point and the virtual camera immediately after automatic control is enabled is stored, and while automatic control is active, the virtual camera performs translational movement 605 so that the distance between the attention point 603 and the virtual camera 601 always equals the stored value. However, if the operator performs a zoom in or zoom out operation, the virtual camera performs translational movement 605 so that the distance equals the changed value.
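The per-frame translational follow described above can be sketched as follows, assuming the attention point lies on the optical axis at the stored distance from the camera. The function and argument names are illustrative; the attitude is left unchanged and only the camera position is recomputed so that the attention point coincides with the gaze target.

```python
import math


def follow_by_translation(target, optical_axis, stored_distance):
    """Camera position that puts the attention point on the target without rotating.

    `optical_axis` is the viewing direction (any nonzero length); the attention
    point is assumed to sit `stored_distance` along it from the camera.
    """
    norm = math.sqrt(sum(c * c for c in optical_axis))
    return tuple(t - stored_distance * c / norm for t, c in zip(target, optical_axis))
```

Calling this once per frame with the latest target position keeps the distance at the stored value. When the operator zooms in or out, the stored distance would simply be replaced by the new value before the next call.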

図６（ｂ）は、注視対象を含む被写体群の位置と仮想カメラの位置および姿勢の関係に基づくカメラパラメータの修正例を示す図である。操作者による仮想カメラの移動操作６０６及び注視対象のモデル６０２の位置の変化により、注視対象のモデル６０２が仮想カメラ６０１の撮影画像内の注視領域６０４の内から外へ移動した場合、自動制御が有効化されてカメラパラメータが修正される。自動制御が有効となると、注視対象のモデル６０２を注視するように仮想カメラが回転移動６０８する。ここで、注視対象のモデル６０２が注視対象ではない被写体６０７に遮られたために自動制御が有効になった場合においては、注目点６０３を中心にした横回転により仮想カメラを回転移動６０８させて注視対象を注視する。自動制御有効中は、注視対象設定部４０３から受信した注視対象のモデルの位置情報と、仮想カメラ情報取得部４０１から受信した仮想カメラのカメラパラメータを参照し、フレームごとに注視対象のモデルに対して注視しているかを判定する。注視されていないと判定された場合、注視対象のモデルを注視するように仮想カメラを併進移動または回転移動６０８させる。また、本実施形態では、注視対象が遮られていない位置まで回転移動６０８を行う。さらに、注視対象ではない被写体６０７が仮想カメラの光軸から所定の距離離れるまで回転移動６０８を行ってもよい。 Figure 6(b) is a diagram showing an example of camera parameter correction based on the relationship between the position of the group of subjects including the gaze target and the position and attitude of the virtual camera. When the gaze target model 602 moves from inside to outside the gaze area 604 in the captured image of the virtual camera 601 due to the operator's movement operation 606 of the virtual camera and the change in the position of the gaze target model 602, automatic control is enabled and the camera parameters are corrected. When automatic control is enabled, the virtual camera rotates 608 to gaze at the gaze target model 602. Here, when automatic control is enabled because the gaze target model 602 is blocked by a subject 607 that is not the gaze target, the virtual camera rotates 608 by horizontal rotation around the attention point 603 to gaze at the gaze target. While automatic control is enabled, the position information of the gaze target model received from the gaze target setting unit 403 and the camera parameters of the virtual camera received from the virtual camera information acquisition unit 401 are referenced to determine, for each frame, whether the gaze target model is being gazed at. If it is determined that the gaze target is not being gazed at, the virtual camera is translated or rotated 608 so as to gaze at the gaze target model. 
In this embodiment, the virtual camera is rotated 608 to a position where the object of gaze is not blocked. Furthermore, the virtual camera may be rotated 608 until the object 607 that is not the object of gaze is a predetermined distance away from the optical axis of the virtual camera.
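One way to picture the rotation that continues until the blocking subject is a predetermined distance from the optical axis is the ground-plane sketch below. It is a simplified illustration under several assumptions that are not in the disclosure: the scene is reduced to two dimensions, the attention point coincides with the gaze target, the camera orbits in one direction only in fixed angular steps, and the blocking subject is treated as a point.

```python
import math


def dist_point_to_segment(p, a, b):
    """Shortest distance from point p to the segment a-b (2D)."""
    abx, aby = b[0] - a[0], b[1] - a[1]
    denom = abx * abx + aby * aby
    t = 0.0 if denom == 0 else ((p[0] - a[0]) * abx + (p[1] - a[1]) * aby) / denom
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * abx), p[1] - (a[1] + t * aby))


def orbit_until_clear(camera, target, occluder, margin, step_rad=math.radians(1.0)):
    """Orbit the camera around the target until the occluder is `margin` off the axis."""
    radius = math.dist(camera, target)
    start = math.atan2(camera[1] - target[1], camera[0] - target[0])
    for i in range(int(2 * math.pi / step_rad) + 1):
        phi = start + i * step_rad
        cam = (target[0] + radius * math.cos(phi), target[1] + radius * math.sin(phi))
        # The optical axis is modeled as the segment from the camera to the target.
        if dist_point_to_segment(occluder, cam, target) >= margin:
            return cam
    return camera  # no clear position found within one full turn
```

The orbit keeps the camera-to-target distance constant, so the gaze target stays on the optical axis at every candidate position.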

図６（ｃ）は、操作者による仮想カメラの移動操作に基づくカメラパラメータの修正例を示す図である。制御判定部４０４により自動制御を行うと判定されると、入力部に入力された移動操作及び注視対象のモデルの移動に応じて一部のカメラパラメータが自動制御される。自動制御有効中に、自動制御されているカメラパラメータに対する入力があった場合は、自動制御を解除する。例えば、Ｘ軸及びＹ軸方向の併進移動６０９が入力されているとき、Ｘ軸方向の併進移動に対しては、パンが対応し、Ｙ軸方向の併進移動に対しては、チルトまたはパンが対応する。複数の移動操作が入力部に入力されている場合、自動制御されるカメラパラメータの総数が最小となるように自動制御されるカメラパラメータを決定するため、自動制御されるカメラパラメータはパンとなる。次に、移動操作がＸ軸及びＹ軸方向の併進移動６０９から注目点を中心とした横回転６１０に変更される。このとき、自動制御が有効になったときの操作はＸ軸及びＹ軸方向の併進移動６０９であったため、パンが自動制御されている。操作者は注目点を中心とした横回転６１０を入力したため、入力されるパラメータは仮想カメラの位置を指定するＸおよびＹに加え、仮想カメラの姿勢を指定するパンである。このとき、入力にパンが含まれているため、自動制御されていたパラメータに対する入力が行われたと判定し、自動制御が解除される。 Figure 6(c) is a diagram showing an example of camera parameter correction based on the movement operation of the virtual camera by the operator. When the control determination unit 404 determines that automatic control is to be performed, some camera parameters are automatically controlled according to the movement operation input to the input unit and the movement of the model of the gaze target. When an input is made to the automatically controlled camera parameters while the automatic control is enabled, the automatic control is released. For example, when translational movement 609 in the X-axis and Y-axis directions is input, pan corresponds to the translational movement in the X-axis direction, and tilt or pan corresponds to the translational movement in the Y-axis direction. When multiple movement operations are input to the input unit, the automatically controlled camera parameters are determined so that the total number of automatically controlled camera parameters is minimized, so the automatically controlled camera parameter is pan. Next, the movement operation is changed from translational movement 609 in the X-axis and Y-axis directions to horizontal rotation 610 around the point of interest. 
At this time, since the operation when the automatic control was enabled was translational movement 609 in the X-axis and Y-axis directions, panning is automatically controlled. Because the operator has input horizontal rotation 610 around the point of interest, the input parameters are X and Y, which specify the position of the virtual camera, as well as pan, which specifies the attitude of the virtual camera. At this time, because the input includes pan, it is determined that an input has been made to a parameter that was automatically controlled, and the automatic control is released.
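The release decision in this example reduces to checking whether the parameters touched by the operator's input overlap the parameters under automatic control. The sketch below restates the Figure 6(c) case; the operation names and the dictionary are hypothetical, and only the parameter sets (X and Y for the translation, X, Y, and pan for the horizontal rotation around the point of interest) are taken from the text.

```python
# Hypothetical table: operation -> camera parameters the operator's input changes.
OPERATION_INPUTS = {
    "translate_xy": {"x", "y"},
    "orbit_horizontal": {"x", "y", "pan"},
}


def should_release(auto_params, operation):
    """Release automatic control if the input touches an auto-controlled parameter."""
    return bool(set(auto_params) & OPERATION_INPUTS[operation])
```

With pan under automatic control, `should_release({"pan"}, "translate_xy")` is `False`, while `should_release({"pan"}, "orbit_horizontal")` is `True`, matching the sequence described above.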

図７は、実施形態１における画像処理装置の処理を表すフローチャートである。 Figure 7 is a flowchart showing the processing of the image processing device in embodiment 1.

ステップＳ７０１において、被写体モデル生成部４０２は、撮影システム１０１から異なる方向から撮影された複数枚の画像や、撮影装置の位置および姿勢のカメラパラメータを取得し、被写体モデルを生成する。 In step S701, the subject model generation unit 402 acquires multiple images captured from different directions by the imaging system 101 and camera parameters of the position and orientation of the imaging device, and generates a subject model.

ステップＳ７０２において、仮想カメラ情報取得部４０１は、情報処理装置１０３の入力部で入力された仮想カメラの移動操作から、仮想カメラの位置および姿勢などのカメラパラメータ情報を取得する。このカメラパラメータ情報は、操作者の入力情報をカメラパラメータに変換したものであるため、初期値または前のフレームのカメラパラメータを変更するための変更情報や更新情報とも言える。取得したカメラパラメータ情報は制御判定部４０４へ送信する。 In step S702, the virtual camera information acquisition unit 401 acquires camera parameter information such as the position and attitude of the virtual camera from the virtual camera movement operation input via the input unit of the information processing device 103. This camera parameter information is obtained by converting the operator's input information into camera parameters, and can therefore also be considered change information or update information for changing the initial values or the camera parameters of the previous frame. The acquired camera parameter information is transmitted to the control determination unit 404.

ステップＳ７０３において、制御判定部４０４は、自動制御が有効になっているか否かを判定する。自動制御が有効になっていればステップＳ７０４に進む。自動制御が有効になっていなければ、ステップＳ７０６に進む。 In step S703, the control determination unit 404 determines whether or not automatic control is enabled. If automatic control is enabled, the process proceeds to step S704. If automatic control is not enabled, the process proceeds to step S706.

ステップＳ７０４において、制御判定部４０４は、ステップＳ７０２で取得したカメラパラメータ情報を参照し、自動制御中のカメラパラメータに対する入力が含まれているか判定する。自動制御中のカメラパラメータに対する入力が含まれていればステップＳ７０５に進む。自動制御中のカメラパラメータに対する入力が含まれていなければステップＳ７０６に進む。 In step S704, the control determination unit 404 refers to the camera parameter information acquired in step S702 and determines whether it includes input for the camera parameters under automatic control. If it includes input for the camera parameters under automatic control, the process proceeds to step S705. If it does not include input for the camera parameters under automatic control, the process proceeds to step S706.

ステップＳ７０５において、制御判定部４０４は、有効になっている自動制御を解除する。このとき、自動制御を解除する情報を位置姿勢制御部に送信する。例えば、自動制御の有効か否かを判定するパラメータａｕｔｏを設け、ａｕｔｏ＝１のとき自動制御が有効であり、ａｕｔｏ＝０のとき自動制御が無効であるとして情報を送信する。 In step S705, the control determination unit 404 cancels the enabled automatic control. At this time, information to cancel the automatic control is sent to the position and orientation control unit. For example, a parameter auto is provided to determine whether or not automatic control is enabled, and information is sent indicating that automatic control is enabled when auto=1 and disabled when auto=0.

ステップＳ７０６において、注視対象設定部４０３は、情報処理装置１０３の入力部から操作者による注視対象を選択する入力を取得する。本実施形態では、Ｓ７０１にて生成した被写体モデルに対応する被写体ＩＤを予め生成しておき、操作者が注視対象の被写体ＩＤを入力することで注視対象を設定する。注視対象を選択する入力がない場合、注目点に最も近い被写体を注視対象として設定する。 In step S706, the gaze target setting unit 403 acquires an input from the operator to select a gaze target from the input unit of the information processing device 103. In this embodiment, a subject ID corresponding to the subject model generated in S701 is generated in advance, and the operator sets the gaze target by inputting the subject ID of the gaze target. If there is no input to select a gaze target, the subject closest to the attention point is set as the gaze target.

ステップＳ７０７において、注視対象設定部４０３は、被写体モデル生成部４０２で生成された被写体モデルの位置情報を参照し、ステップＳ７０６で選択された被写体の位置情報と一致する被写体のモデル情報を取得する。 In step S707, the gaze target setting unit 403 refers to the position information of the subject model generated by the subject model generation unit 402, and obtains the model information of the subject that matches the position information of the subject selected in step S706.

ステップＳ７０８において、制御判定部４０４は、ステップＳ７０２で取得した仮想カメラのカメラパラメータと、被写体モデル生成部４０２で生成された注視対象を含む全被写体のモデル情報を参照し、自動制御の有効化条件を満たすか否かを判断する。有効化条件を満たす場合はステップＳ７０９へ、満たさない場合はステップＳ７１４へ進む。なお、自動制御の有効か否かを示すパラメータａｕｔｏがａｕｔｏ＝１となっている場合は、自動制御の有効化条件を判定せずにステップＳ７０９に進む。 In step S708, the control determination unit 404 refers to the camera parameters of the virtual camera acquired in step S702 and the model information of all subjects including the gaze target generated by the subject model generation unit 402, and determines whether the automatic control activation condition is met. If the activation condition is met, the process proceeds to step S709, and if not, the process proceeds to step S714. Note that if the parameter auto indicating whether automatic control is enabled is auto=1, the process proceeds to step S709 without determining the automatic control activation condition.

ステップＳ７０９において、制御判定部４０４は、仮想カメラと注視対象のモデルの間に位置する注視対象ではない被写体により、撮影画像において注視対象が遮られているかを判断する。注視対象が遮られている場合はステップＳ７１３へ、遮られていない場合はＳ７１０へ進む。 In step S709, the control determination unit 404 determines whether the gaze target is obstructed in the captured image by a subject that is not the gaze target and is located between the virtual camera and the gaze target model. If the gaze target is obstructed, the process proceeds to step S713; if not, the process proceeds to S710.

ステップＳ７１０において、仮想カメラ情報取得部４０１は、ステップＳ７０２で取得した仮想カメラの移動操作が併進移動であるか否かを判断する。併進移動の場合はステップＳ７１２、異なる場合はステップＳ７１１に進む。 In step S710, the virtual camera information acquisition unit 401 determines whether the virtual camera movement operation acquired in step S702 is a translational movement. If it is a translational movement, the process proceeds to step S712; if not, the process proceeds to step S711.

ステップＳ７１１において、位置姿勢制御部４０５は、仮想カメラ情報取得部４０１から取得したカメラパラメータに対して、注視対象のモデルを注視するように仮想カメラを併進移動させ、カメラパラメータを修正する。また、ステップＳ７０２で取得する仮想カメラ情報が前のフレームと同一である場合、注視対象のモデルの移動に応じたカメラパラメータを自動制御する。具体的には、注視対象が移動した場合、注視対象のモデルを注視するように仮想カメラを併進移動させ、カメラパラメータを修正する。そして、修正後のカメラパラメータを仮想視点画像生成部４０６へ送信する。 In step S711, the position and orientation control unit 405 translates the virtual camera so as to gaze at the model of the gaze target, and corrects the camera parameters, based on the camera parameters acquired from the virtual camera information acquisition unit 401. Furthermore, if the virtual camera information acquired in step S702 is the same as that of the previous frame, the camera parameters are automatically controlled in accordance with the movement of the model of the gaze target. Specifically, when the gaze target moves, the virtual camera is translated so as to gaze at the model of the gaze target, and the camera parameters are corrected. The corrected camera parameters are then transmitted to the virtual viewpoint image generation unit 406.

ステップＳ７１２において、位置姿勢制御部４０５は、仮想カメラ情報取得部４０１から取得したカメラパラメータに対して、注視対象のモデルを注視するように仮想カメラを回転移動させ、カメラパラメータを修正する。そして、修正後のカメラパラメータを仮想視点画像生成部４０６へ送信する。 In step S712, the position and orientation control unit 405 rotates and moves the virtual camera so that the virtual camera gazes at the model of the gaze target, thereby correcting the camera parameters acquired from the virtual camera information acquisition unit 401. The position and orientation control unit 405 then transmits the corrected camera parameters to the virtual viewpoint image generation unit 406.

ステップＳ７１３において、位置姿勢制御部４０５は、仮想カメラ情報取得部４０１から取得したカメラパラメータに対して、注視対象のモデルを注視するように注目点を中心とした横回転・縦回転により仮想カメラを移動させ、カメラパラメータを修正する。そして、修正後のカメラパラメータを仮想視点画像生成部４０６へ送信する。 In step S713, the position and orientation control unit 405 corrects the camera parameters acquired from the virtual camera information acquisition unit 401 by moving the virtual camera by horizontal and vertical rotation around the attention point so as to gaze at the model of the gaze target. The corrected camera parameters are then transmitted to the virtual viewpoint image generation unit 406.

ステップＳ７１４において、仮想視点画像生成部４０６は、ステップＳ７０１で生成した被写体のモデル情報と、ステップＳ７０２で取得したカメラパラメータ、または、ステップＳ７１１～Ｓ７１３で修正したカメラパラメータを取得する。そして、仮想カメラの位置および姿勢から見た被写体モデルのレンダリングを行い、仮想視点画像を生成する。このとき、ステップＳ７０６において、注視対象を選択する入力を取得していない場合はステップＳ７０２で取得したカメラパラメータを使用し、取得していた場合はＳ７１１～Ｓ７１３で修正したカメラパラメータを使用する。 In step S714, the virtual viewpoint image generation unit 406 acquires the model information of the subject generated in step S701 and the camera parameters acquired in step S702 or the camera parameters modified in steps S711 to S713. Then, rendering of the subject model as viewed from the position and orientation of the virtual camera is performed to generate a virtual viewpoint image. At this time, if no input for selecting the gaze target has been acquired in step S706, the camera parameters acquired in step S702 are used, and if input has been acquired, the camera parameters modified in steps S711 to S713 are used.

ステップＳ７１５において、情報処理装置の作業終了の指示があったかを判断する。作業終了の指示がない場合はステップＳ７０１戻り、ステップＳ７０１からステップＳ７１５までのフローを繰り返し、作業終了の指示があった場合はフローを終了する。 In step S715, it is determined whether an instruction to end the work on the information processing device has been given. If an instruction to end the work has not been given, the process returns to step S701 and the flow from step S701 to step S715 is repeated, and if an instruction to end the work has been given, the flow ends.
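The branching of steps S703 to S714 can be condensed into a single decision function, shown here as a rough sketch. It assumes that processing continues to step S706 after the release in step S705, which the text does not state explicitly, and it represents the `auto` parameter as a Boolean; the function name, arguments, and returned labels are illustrative only.

```python
def plan_frame(auto, input_params, auto_params, activation_met, occluded, is_translation):
    """Return (auto flag after this frame, step that produces the camera parameters)."""
    # S703-S705: release if the operator touched an auto-controlled parameter.
    if auto and (set(input_params) & set(auto_params)):
        auto = False
    # S708: when auto is already set, the activation condition is not re-evaluated.
    if not auto and not activation_met:
        return False, "S714"  # render with the operator's parameters as they are
    # S709: a blocked gaze target is handled by rotation around the attention point.
    if occluded:
        return True, "S713"
    # S710: translation input -> correct by rotation (S712); otherwise by translation (S711).
    return True, ("S712" if is_translation else "S711")
```

In every case the frame ends with rendering in step S714; the returned label only indicates which step, if any, corrects the camera parameters first.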

上記処理により、自動制御が有効になったときの操作者の操作内容に応じて、操作者の操作していないパラメータを自動制御し、自動制御が解除されるまで自動制御を続ける。本処理により、操作者の操作したいカメラパラメータが変更した場合にも、自動制御を解除することができ、操作者の意図する仮想視点画像を生成することができる。 By the above process, parameters not operated by the operator are automatically controlled according to the operation contents of the operator when the automatic control is enabled, and the automatic control continues until the automatic control is released. This process can release the automatic control even if the camera parameters that the operator wants to operate change, and the virtual viewpoint image that the operator intends can be generated.

本実施形態ではフレーム毎に処理を行うことになっているが、注視対象を注視する自動制御が有効な時刻を操作者が指定し、その間だけ処理が行われるようにしてもよい。 In this embodiment, the processing is performed for each frame, but the operator may specify the time during which the automatic control of gazing at the gaze target is effective, and the processing may be performed only during that time.

本実施形態では、選択された注視対象のモデル情報を読み込み、操作者による入力部への入力内容や、注視対象を含む被写体群の位置と仮想カメラの位置および姿勢、注目点の関係に基づいて、注視対象を注視するように仮想カメラの位置および姿勢を制御する。このとき、入力内容に応じて予め設定されたカメラパラメータを自動制御して注視する処理が行われるため、操作者は自動制御されるカメラパラメータを容易に予想でき、自動制御による自動制御と手動操作の両立時における操作ミスを防止することができる。また、注視したい被写体を注視できていない場合のみ、仮想カメラのカメラパラメータに自動制御を行うため、仮想視点画像撮影の利点であるカメラ移動の自由度を過度に低下させることがない。 In this embodiment, model information for the selected gaze target is read, and the position and attitude of the virtual camera are controlled to gaze at the gaze target based on the input contents by the operator to the input unit and on the relationship among the positions of the group of subjects including the gaze target, the position and attitude of the virtual camera, and the point of interest. At this time, the gaze is achieved by automatically controlling camera parameters that are preset according to the input contents, so the operator can easily predict which camera parameters will be automatically controlled, and operational errors can be prevented when automatic control and manual operation are used together. In addition, automatic control of the camera parameters of the virtual camera is performed only when the subject to be gazed at is not being gazed at, so there is no excessive reduction in the degree of freedom of camera movement, which is an advantage of virtual viewpoint image capture.

<Embodiment 2>
The first embodiment described processing that controls the position and attitude of the virtual camera so that it gazes at the gaze target, based on the operator's input to the input unit and on the relationship among the positions of the subject group including the gaze target, the position and attitude of the virtual camera, and the point of interest. In that processing, automatic control was released whenever the operator made an input to the input unit for a camera parameter that was being automatically controlled for the gaze. However, the operator may want to adjust a camera parameter under automatic control without releasing automatic control. For example, in the first embodiment the camera is controlled to gaze at the gaze target during automatic control, but the operator may want to generate a virtual viewpoint image in which the gaze target is at the edge of the gaze area. In the second embodiment, therefore, a camera parameter that is subject to automatic control can be changed without releasing automatic control as long as the input is equal to or less than a predetermined value.

If the amount of input to the input unit for an automatically controlled camera parameter is minute, automatic control is not disabled and the input is reflected in the automatically controlled parameter. For example, suppose that the operator is inputting translational movement, rotational movement is under automatic control, and the operator then inputs a rotation about the X axis (pan). If the input amount is equal to or less than a predetermined value, the parameter is updated without releasing automatic control. The predetermined value is set in advance; in this embodiment, automatic control is not released if the input amount is 0.5 m/s or less. When automatic control is not released, the input operation remains reflected in the automatically controlled parameter until automatic control is released. This allows the position of the gaze target in the virtual viewpoint image to be adjusted during automatic control, so that the gaze target can be placed at the position the operator desires. Whether the input amount is minute is determined by the control determination unit 404: the input is judged minute when the amount input to the input unit falls below a reference value set in advance by the operator. The operator enters the reference value through the input unit, and the control determination unit 404 stores and refers to it. The determination method is not limited to this. For example, whether the input amount is minute may be determined by comparing the input amount for the automatically controlled camera parameter with the input amounts for the other camera parameters. The input to the input unit may also be multiplied by a weighting coefficient before it controls the camera parameters. In that case, the camera parameter may be changed without releasing automatic control if the resulting amount of change in the camera parameter is equal to or less than a predetermined value. This makes it easier for the operator to provide input to the virtual camera.
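The small-input rule described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the control determination unit 404: the names `AutoControlState` and `handle_auto_param_input` are hypothetical, accumulating the input as an offset is one possible way to "reflect" it in the parameter, and the default threshold of 0.5 simply reuses the example value from the text.

```python
from dataclasses import dataclass


@dataclass
class AutoControlState:
    """Hypothetical state for one automatically controlled camera parameter."""
    enabled: bool = True   # True while gaze automatic control is active
    offset: float = 0.0    # manual adjustment layered on top of the automatic value


def handle_auto_param_input(state: AutoControlState,
                            input_amount: float,
                            threshold: float = 0.5) -> AutoControlState:
    """Process one operator input aimed at an automatically controlled parameter."""
    if not state.enabled:
        return state  # automatic control already released; nothing to decide
    if abs(input_amount) <= threshold:
        # Small input: keep automatic control and fold the input into the parameter.
        state.offset += input_amount
    else:
        # Large input: treat it as a request to take over, so release automatic control.
        state.enabled = False
    return state
```

With this sketch, repeated small nudges shift where the gaze target appears in the image, while a single large input hands the parameter back to the operator.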

A computer program that implements some or all of the control in this embodiment, that is, the functions of the embodiments described above, may be supplied to an image processing system or the like via a network or various storage media. A computer (or a CPU, MPU, or the like) of the image processing system may then read and execute the program. In that case, the program and the storage medium storing the program constitute the present disclosure.

The disclosure of this embodiment includes the following configurations, method, and program.

(Configuration 1)
An apparatus comprising:
an acquisition means for acquiring change information for changing a position or an attitude of a virtual camera corresponding to a virtual viewpoint image generated based on a plurality of captured images captured by a plurality of imaging devices;
a determination means for determining whether the change information is information for changing the position of the virtual camera or information for changing the attitude of the virtual camera; and
a change means for, when the change information is information for changing the position of the virtual camera, changing the attitude of the virtual camera so that a specific object is located on an optical axis of the virtual camera while changing the position of the virtual camera based on the change information, and for, when the change information is information for changing the attitude of the virtual camera, changing the position of the virtual camera so that the specific object is located on the optical axis of the virtual camera while changing the attitude of the virtual camera based on the change information.

(Configuration 2)
The apparatus according to configuration 1, wherein, when the change information is information for changing the attitude of the virtual camera, the change means changes the attitude of the virtual camera based on the change information and further changes the position of the virtual camera so as to keep the distance between the specific object and the virtual camera constant.
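The two branches of configurations 1 and 2 can be illustrated with a short sketch. It is a simplified model under assumptions that are not in the source: the attitude is represented by a unit vector along the optical axis rather than by pan, tilt, and roll, and the function names are hypothetical.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("zero-length vector has no direction")
    return tuple(c / n for c in v)


def apply_position_change(position, delta, target):
    """Position input: move the camera, then aim the optical axis at the target."""
    new_pos = tuple(p + d for p, d in zip(position, delta))
    forward = normalize(tuple(t - p for t, p in zip(target, new_pos)))
    return new_pos, forward


def apply_attitude_change(position, new_forward, target):
    """Attitude input: adopt the new axis, then reposition at the same distance
    so that the target stays on the optical axis (configuration 2)."""
    dist = math.dist(position, target)  # distance to preserve
    f = normalize(new_forward)
    new_pos = tuple(t - dist * c for t, c in zip(target, f))
    return new_pos, f
```

In both branches the parameter the operator touched is applied as given, and only the other parameter is solved for, which is what keeps the automatic behavior predictable.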

(Configuration 3)
The apparatus according to configuration 1 or 2, further comprising a change determination means for determining whether or not the change by the change means is to be made.

(Configuration 4)
The apparatus according to configuration 3, wherein the change determination means determines that the change by the change means is to be made when the specific object is not located at a predetermined distance from a gaze point of the virtual camera, and determines that the change by the change means is not to be made when the specific object is located at the predetermined distance from the gaze point of the virtual camera.
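One way to read the determination in configuration 4 is as a simple distance test. The sketch below assumes that "located at a predetermined distance" means "within that distance of the gaze point"; this reading, and the name `needs_auto_change`, are assumptions made for illustration.

```python
import math


def needs_auto_change(gaze_point, object_position, max_distance):
    """Return True when the object is farther than max_distance from the gaze point,
    i.e. when the automatic change of position and attitude should be applied."""
    return math.dist(gaze_point, object_position) > max_distance
```

Under this reading, automatic control only intervenes once the object drifts away from the gaze point, which matches the aim of not restricting camera movement more than necessary.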

(Configuration 5)
The apparatus according to any one of configurations 1 to 4, wherein the acquisition means acquires a three-dimensional model of an object, and the change means changes the position and attitude of the virtual camera so that the center of gravity of the three-dimensional model of the specific object is located on the optical axis of the virtual camera.
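As a rough illustration of the center of gravity used in configuration 5, the sketch below assumes the three-dimensional model is available as a list of vertex coordinates and takes their mean. The source does not specify the model format, so the vertex-list representation and the name `model_centroid` are assumptions.

```python
def model_centroid(vertices):
    """Return the mean of a list of (x, y, z) vertices as an approximate center of gravity."""
    n = len(vertices)
    if n == 0:
        raise ValueError("model has no vertices")
    return tuple(sum(v[i] for v in vertices) / n for i in range(3))
```

For a mesh with unevenly distributed vertices, an area-weighted or volume-weighted centroid may be a better approximation than this plain average.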

(Configuration 6)
The apparatus according to any one of configurations 1 to 5, wherein the position of the virtual camera is expressed by X-axis, Y-axis, and Z-axis parameters, and the attitude of the virtual camera is expressed by pan, tilt, and roll parameters.

(Configuration 7)
The information processing apparatus according to claim 1, further comprising a setting means for setting a specific object.

(Configuration 8)
The apparatus according to configuration 7, wherein the setting means sets the specific object based on an operation by an operator.

(Configuration 9)
The apparatus according to configuration 8, wherein the setting means sets the specific object based on an input designating the specific object from among a plurality of objects.

(Configuration 10)
The apparatus according to configuration 7, wherein the setting means sets the object closest to a point of interest of the virtual camera as the specific object.
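The automatic selection in configuration 10 amounts to a nearest-neighbor search. A minimal sketch, assuming that object positions are held in a dictionary keyed by a hypothetical object ID (the data structure and the name `select_gaze_target` are not from the source):

```python
import math


def select_gaze_target(point_of_interest, object_positions):
    """Return the ID of the object whose position is nearest to the point of interest."""
    if not object_positions:
        raise ValueError("no objects to choose from")
    return min(object_positions,
               key=lambda oid: math.dist(point_of_interest, object_positions[oid]))
```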

(Configuration 11)
The apparatus according to any one of configurations 1 to 10, wherein the determination means determines whether an object different from the specific object is located on the optical axis connecting the specific object and the virtual camera, and the change means changes the position and attitude of the virtual camera so that no object different from the specific object is located on the optical axis connecting the specific object and the virtual camera.

(Configuration 12)
The apparatus according to configuration 11, wherein the change means changes the position and attitude of the virtual camera so that the virtual camera moves on a circle centered on the position of the specific object and having a radius equal to the distance from the position of the specific object to the virtual camera.
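Configurations 11 and 12 together suggest moving the camera along a circle around the specific object until no other object blocks the line of sight. The sketch below is one possible reading under assumptions that are not in the source: other objects are treated as spheres of a common radius, the circle lies in a horizontal plane with the Z axis vertical, and the search proceeds in fixed angular steps. All function names are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment from a to b (3-D tuples)."""
    ab = tuple(bi - ai for ai, bi in zip(a, b))
    ap = tuple(pi - ai for ai, pi in zip(a, p))
    denom = sum(c * c for c in ab)
    # Clamp the projection parameter so the closest point stays on the segment.
    t = 0.0 if denom == 0.0 else max(0.0, min(1.0, sum(x * y for x, y in zip(ap, ab)) / denom))
    closest = tuple(ai + t * c for ai, c in zip(a, ab))
    return math.dist(p, closest)


def is_occluded(camera, target, others, radius):
    """True if any other object lies within radius of the camera-to-target segment."""
    return any(point_segment_distance(o, camera, target) < radius for o in others)


def orbit_until_clear(camera, target, others, radius, step_deg=5.0):
    """Step the camera around a horizontal circle centered on the target and
    return the first position with a clear line of sight, or None if none exists."""
    dx, dy = camera[0] - target[0], camera[1] - target[1]
    r = math.hypot(dx, dy)      # horizontal radius of the circle, kept constant
    base = math.atan2(dy, dx)   # current angle of the camera around the target
    for k in range(int(360 / step_deg)):
        ang = base + math.radians(k * step_deg)
        cand = (target[0] + r * math.cos(ang), target[1] + r * math.sin(ang), camera[2])
        if not is_occluded(cand, target, others, radius):
            return cand
    return None
```

After the position is chosen, the attitude would be updated so that the optical axis again points at the specific object, as in configuration 1.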

(Configuration 13)
The apparatus according to any one of configurations 1 to 12, wherein the change information changes the position of the virtual camera or the attitude of the virtual camera based on a joystick input.

(Configuration 14)
The apparatus according to any one of configurations 1 to 13, further comprising a display means for displaying a virtual viewpoint image generated based on the position and attitude of the virtual camera changed by the change means.

(Configuration 15)
The apparatus according to any one of configurations 1 to 14, wherein the change information includes an amount of change in the position of the virtual camera and an amount of change in the attitude of the virtual camera, and the change means changes the position or the attitude of the virtual camera based on the change information when the amount of change is equal to or less than a predetermined value.

(Configuration 16)
An apparatus comprising:
an acquisition means for acquiring first change information for changing a position or an attitude of a virtual camera corresponding to a virtual viewpoint image generated based on a plurality of captured images captured by a plurality of imaging devices;
a setting means for setting a specific object;
a determination means for determining whether the first change information is information for changing the position of the virtual camera or information for changing the attitude of the virtual camera; and
an output means for outputting, based on the first change information and a determination result of the determination means, second change information for changing the position and attitude of the virtual camera so that the specific object is located on an optical axis of the virtual camera.

(Method)
An information processing method comprising:
an acquisition step of acquiring change information for changing a position or an attitude of a virtual camera corresponding to a virtual viewpoint image generated based on a plurality of captured images captured by a plurality of imaging devices;
a setting step of setting a specific object;
a determination step of determining whether the change information is information for changing the position of the virtual camera or information for changing the attitude of the virtual camera; and
a change step of changing the position and attitude of the virtual camera, based on the change information and a determination result of the determination means, so that the specific object is located on an optical axis of the virtual camera.

(Program)
A program for controlling, by a computer, each means of the apparatus according to any one of configurations 1 to 15.

401 Virtual camera information acquisition unit
403 Gaze target setting unit
404 Control determination unit
405 Position and orientation control unit

Claims

An information processing device comprising:
an acquisition means for acquiring change information for changing a position or an attitude of a virtual camera corresponding to a virtual viewpoint image generated based on a plurality of captured images captured by a plurality of imaging devices;
a determination means for determining whether the change information is information for changing the position of the virtual camera or information for changing the attitude of the virtual camera; and
a change means for, when the change information is information for changing the position of the virtual camera, changing the attitude of the virtual camera so that a specific object is located on an optical axis of the virtual camera while changing the position of the virtual camera based on the change information, and for, when the change information is information for changing the attitude of the virtual camera, changing the position of the virtual camera so that the specific object is located on the optical axis of the virtual camera while changing the attitude of the virtual camera based on the change information.

The information processing device according to claim 1, characterized in that, when the change information is information for changing the attitude of the virtual camera, the change means changes the attitude of the virtual camera based on the change information, and further changes the position of the virtual camera so as to keep a constant distance between the specific object and the virtual camera.

The information processing device according to claim 1, further comprising a change determination means for determining whether or not to make a change by the change means.

The information processing device according to claim 3, characterized in that the change determination means determines that the change is to be made by the change means when the specific object is not located at a predetermined distance from a gaze point of the virtual camera, and determines that the change is not to be made by the change means when the specific object is located at the predetermined distance from the gaze point of the virtual camera.

The information processing device according to claim 1, wherein the acquisition means acquires a three-dimensional model of an object, and the change means changes the position and attitude of the virtual camera so that the center of gravity of the three-dimensional model of the specific object is positioned on the optical axis of the virtual camera.

The information processing device according to claim 1, wherein the position of the virtual camera is expressed by X-axis, Y-axis, and Z-axis parameters, and the attitude of the virtual camera is expressed by pan, tilt, and roll parameters.

The information processing device according to claim 1, further comprising a setting means for setting a specific object.

The information processing device according to claim 7, characterized in that the setting means sets the specific object based on an operation by an operator.

The information processing device according to claim 8, characterized in that the setting means sets the specific object based on an input that specifies the specific object from among a plurality of objects.

The information processing device according to claim 7, characterized in that the setting means sets the object closest to the point of interest of the virtual camera as the specific object.

The information processing device according to claim 1, wherein the determination means determines whether an object other than the specific object is located on the optical axis connecting the specific object and the virtual camera, and the change means changes the position and attitude of the virtual camera so that no object other than the specific object is positioned on the optical axis connecting the specific object and the virtual camera.

The information processing device according to claim 11, characterized in that the change means changes the position and orientation of the virtual camera so that it moves on a circle whose center is the position of the specific object and whose radius is the distance from the position of the specific object to the virtual camera.

The information processing device according to claim 1, characterized in that the change information changes the position or attitude of the virtual camera based on a joystick input.

The information processing device according to claim 1, further comprising a display means for displaying a virtual viewpoint image generated based on the position and orientation of the virtual camera changed by the change means.

The information processing device according to claim 1, wherein the change information includes an amount of change in the position of the virtual camera and an amount of change in the attitude of the virtual camera, and the change means changes the position or the attitude of the virtual camera based on the change information when the amount of change is equal to or less than a predetermined value.

An information processing device comprising:
an acquisition means for acquiring first change information for changing a position or an attitude of a virtual camera corresponding to a virtual viewpoint image generated based on a plurality of captured images captured by a plurality of imaging devices;
a setting means for setting a specific object;
a determination means for determining whether the first change information is information for changing the position of the virtual camera or information for changing the attitude of the virtual camera; and
an output means for outputting, based on the first change information and a determination result of the determination means, second change information for changing the position and attitude of the virtual camera so that the specific object is positioned on an optical axis of the virtual camera.

An information processing method comprising:
an acquisition step of acquiring change information for changing a position or an attitude of a virtual camera corresponding to a virtual viewpoint image generated based on a plurality of captured images captured by a plurality of imaging devices;
a setting step of setting a specific object;
a determination step of determining whether the change information is information for changing the position of the virtual camera or information for changing the attitude of the virtual camera; and
a change step of changing the position and attitude of the virtual camera, based on the change information and a determination result of the determination means, so that the specific object is located on an optical axis of the virtual camera.

A computer program for controlling each means of the image processing device according to any one of claims 1 to 15 by a computer.