JP2023001850A

JP2023001850A - Information processing apparatus, information processing method, and program

Info

Publication number: JP2023001850A
Application number: JP2022009393A
Authority: JP
Inventors: 祐矢太田; Yuya Ota
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2021-06-21
Filing date: 2022-01-25
Publication date: 2023-01-06

Abstract

To suppress the influence of a normal image capturing camera being present within the image capturing range of a virtual viewpoint image generation camera.

SOLUTION: An information processing apparatus obtains first viewpoint information for specifying a virtual viewpoint corresponding to a virtual viewpoint image, and second viewpoint information representing a viewpoint of a second image capturing apparatus present in an image capturing range of a first image capturing apparatus that is used for generating the virtual viewpoint image. The apparatus performs control so that the image captured by the second image capturing apparatus is output in a case where the position of the second image capturing apparatus specified by the second viewpoint information is included in the field of view of the virtual viewpoint specified by the first viewpoint information.

SELECTED DRAWING: Figure 1

Description

The present disclosure relates to processing based on captured images.

In one known method, a plurality of imaging devices (virtual viewpoint image generation cameras) are installed at different positions, and a virtual viewpoint image is generated using captured images obtained by imaging an object from a plurality of viewpoints. Imaging may also be performed by an imaging device separate from the virtual viewpoint image generation cameras (a normal imaging camera). There is also a method of switching, as appropriate, between displaying the virtual viewpoint image and displaying the captured image obtained by the normal imaging camera.

Patent Document 1 describes a method of switching between an image captured by a normal imaging camera and a virtual viewpoint image in a way that reduces the sense of incongruity given to the viewer.

Patent Document 1: Japanese Patent Laid-Open No. 2020-42665 (JP 2020-42665 A)

However, when the normal imaging camera is within the imaging range of the virtual viewpoint image generation cameras, the normal imaging camera may appear in the virtual viewpoint image. When it does, the quality of the virtual viewpoint image may deteriorate.

An object of the technology of the present disclosure is to suppress the influence caused by the presence of the normal imaging camera in the imaging range of the virtual viewpoint image generation cameras.

An information processing apparatus according to the technology of the present disclosure includes: acquisition means for acquiring first viewpoint information for specifying a virtual viewpoint corresponding to a virtual viewpoint image, and second viewpoint information representing a viewpoint of a second imaging device present within an imaging range of a first imaging device used to generate the virtual viewpoint image; output means for outputting the virtual viewpoint image or a captured image obtained by imaging by the second imaging device; and control means for performing control such that, when the field of view of the virtual viewpoint specified by the first viewpoint information includes the position of the second imaging device specified by the second viewpoint information, the output means outputs the captured image obtained by imaging by the second imaging device.

According to the technology of the present disclosure, it is possible to suppress the influence of the presence of the normal imaging camera in the imaging range of the virtual viewpoint image generation cameras.

FIG. 1 is a block diagram showing the functional configuration of the image processing apparatus.
FIG. 2 is a block diagram showing a hardware configuration example of the image processing apparatus.
FIG. 3 is a diagram for explaining the imaging range of the virtual viewpoint image generation camera group.
FIG. 4 is a flowchart showing an example of image output control processing.
FIG. 5 is a diagram for explaining a viewpoint comparison table.
FIG. 6 is a diagram for explaining a display screen for a user who performs an output image switching operation.
FIG. 7 is a flowchart of image output control processing.
FIG. 8 is a diagram for explaining the relative velocity of an object with respect to the virtual camera.
FIG. 9 is a diagram for explaining the proportion of a virtual viewpoint image occupied by an object.
FIG. 10 is a flowchart showing an example of image output control processing.

Hereinafter, details of the technology of the present disclosure will be described based on embodiments with reference to the accompanying drawings.

<First Embodiment>
[Configuration of image processing system]
FIG. 1 is a configuration diagram showing the entire image processing system of this embodiment. The image processing system includes a virtual viewpoint image generation camera group 110, a normal imaging camera 120, an image processing apparatus 100, and an image switching device 130.

The image processing apparatus 100 of this embodiment is an information processing apparatus capable of generating a virtual viewpoint image. A virtual viewpoint image is an image representing the view from a viewpoint (called a virtual viewpoint) that differs from the viewpoint of any actually installed camera, and is also called a free viewpoint image or an arbitrary viewpoint image. A virtual viewpoint image may be a moving image or a still image. In this embodiment, the virtual viewpoint image is assumed to be a moving image.

In this embodiment, the virtual viewpoint may be described in terms of a virtual camera. In that case, the position of the virtual viewpoint corresponds to the position of the virtual camera, and the line-of-sight direction from the virtual viewpoint corresponds to the orientation of the virtual camera. A virtual viewpoint image corresponds to a captured image obtained by virtual imaging with the virtual camera. The position and orientation of the virtual camera can be specified by the operator of the virtual camera, so an image from an arbitrary viewpoint can be generated.

The virtual viewpoint image in this embodiment is also called a free viewpoint image, but it is not limited to an image corresponding to a viewpoint freely (arbitrarily) specified by the user. For example, an image corresponding to a viewpoint that the user selects from a plurality of candidates is also included in the virtual viewpoint image. This embodiment mainly describes the case where the virtual viewpoint is specified by a user operation, but the virtual viewpoint may instead be specified automatically based on, for example, the result of image analysis.

The image processing apparatus 100 includes a virtual viewpoint captured image acquisition unit 101, a virtual viewpoint designation unit 102, a virtual viewpoint image generation unit 103, a physical viewpoint captured image acquisition unit 104, a camera information acquisition unit 106, a similarity calculation unit 107, and an output control unit 108.

The virtual viewpoint captured image acquisition unit 101 acquires captured images, each corresponding to the angle of view of its camera, obtained by time-synchronized imaging by the virtual viewpoint image generation camera group 110, which is a plurality of imaging devices arranged so as to surround an imaging range such as a studio. The number and arrangement of the cameras constituting the virtual viewpoint image generation camera group 110 are not limited.

The virtual viewpoint designation unit 102 generates viewpoint information of the virtual camera that defines at least the position and attitude (orientation) of the virtual camera as instructed by the operator of the virtual camera.

The operator of the virtual camera can specify the desired position, orientation, and so on of the virtual camera through an operation unit (not shown) connected to the image processing apparatus 100. The operation unit is, for example, a device such as a joystick, but is not limited to a joystick. It may instead be a device such as a mouse or a keyboard used for operating a personal computer.

The viewpoint information of the virtual camera includes a three-dimensional position in world coordinates (the position of the virtual camera), an attitude (the orientation of the virtual camera), a focal length, and a principal point (the center of the virtual camera image).
Generating the viewpoint information of the virtual camera defines the position, orientation, and so on of the virtual camera.
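As a non-limiting illustration, the four items of viewpoint information listed above could be held in a structure such as the following. This is a minimal sketch and not part of the disclosure: the name `ViewpointInfo`, the representation of the orientation as a 3 x 3 rotation matrix, the use of pixel units for the focal length, and the `project` method (an ordinary pinhole projection) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class ViewpointInfo:
    """Hypothetical container for the viewpoint information of one camera."""
    position: Vec3                        # three-dimensional position in world coordinates
    rotation: Tuple[Vec3, Vec3, Vec3]     # orientation: rows are the camera x, y, z axes in world coordinates
    focal_length: float                   # focal length, assumed here to be in pixels
    principal_point: Tuple[float, float]  # principal point (image center), in pixels

    def project(self, point: Vec3) -> Optional[Tuple[float, float]]:
        """Project a world point to pixel coordinates, or None if it is behind the camera."""
        offset = [p - c for p, c in zip(point, self.position)]
        # Express the offset in camera coordinates (z is the viewing direction).
        x, y, z = (sum(axis[i] * offset[i] for i in range(3)) for axis in self.rotation)
        if z <= 0:
            return None
        cx, cy = self.principal_point
        return (self.focal_length * x / z + cx, self.focal_length * y / z + cy)


# Example values (made up): a camera at the origin looking along +z.
identity = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
virtual_camera = ViewpointInfo((0.0, 0.0, 0.0), identity, 1000.0, (960.0, 540.0))
```

With these example values, a point on the optical axis projects onto the principal point, and a point behind the camera yields `None`.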

The virtual viewpoint image generation unit 103 uses the plurality of captured images acquired by the virtual viewpoint captured image acquisition unit 101 and the positional relationship of the virtual viewpoint image generation camera group 110 to generate a virtual viewpoint image representing the view from the virtual camera specified by the virtual viewpoint designation unit 102. The virtual viewpoint image generation unit 103 then outputs the image within the angle of view of the virtual camera as the virtual viewpoint image.

Here, as an example of a method of generating a virtual viewpoint image, the following describes a method in which a three-dimensional model representing the three-dimensional shape of an object is generated, and the virtual viewpoint image is generated by a projection operation that yields the two-dimensional image of that three-dimensional model as seen from the virtual camera. A three-dimensional model representing the three-dimensional shape of an object is also called three-dimensional shape data.

First, a three-dimensional model of an object within the imaging range is generated based on the captured images and arrangement information of the virtual viewpoint image generation camera group 110. One method for constructing a three-dimensional model is called the visual volume intersection method, or Visual Hull (hereinafter, Visual Hull). In Visual Hull, the silhouette of the object in the captured image of each virtual viewpoint image generation camera is virtually back-projected from the optical principal point position of that camera toward the object. As a result, a cone region is formed whose apex is the optical principal point position and whose cross section is the silhouette of the object. The overlap (logical AND) of the cone regions formed for the respective virtual viewpoint image generation cameras is then taken as the three-dimensional model of the object.
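The intersection of cone regions described above can be sketched on a coarse grid of candidate points: a point belongs to the model only if its projection falls inside the silhouette of every camera. The following is a toy illustration under stated assumptions, not the implementation of this embodiment. The helper `pinhole_along_axis`, the three axis-aligned cameras, the integer pixel rounding, and the synthetic silhouettes are all hypothetical.

```python
from itertools import product


def pinhole_along_axis(axis, distance, focal):
    """Hypothetical camera placed `distance` behind the origin on `axis`, looking at the origin."""
    u_axis, v_axis = [a for a in range(3) if a != axis]

    def project(point):
        depth = point[axis] + distance
        if depth <= 0:
            return None  # behind the camera
        return (round(focal * point[u_axis] / depth),
                round(focal * point[v_axis] / depth))

    return project


def visual_hull(candidates, views):
    """Keep the candidate points whose projection lies inside every silhouette."""
    hull = []
    for point in candidates:
        inside_all = True
        for project, silhouette in views:
            pixel = project(point)
            if pixel is None or pixel not in silhouette:
                inside_all = False  # outside this camera's cone region
                break
        if inside_all:
            hull.append(point)
    return hull


# Toy scene: the object is a 3 x 3 x 3 block of points around the origin.
object_points = list(product((-1, 0, 1), repeat=3))
cameras = [pinhole_along_axis(axis, distance=10, focal=100) for axis in range(3)]
# The silhouettes are synthesized here by projecting the known object.
views = [(cam, {cam(p) for p in object_points}) for cam in cameras]

grid = list(product(range(-2, 3), repeat=3))
hull = visual_hull(grid, views)
```

Because each cone region contains the object, the result always contains the true object and may be somewhat larger; additional cameras tighten the approximation.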

Next, rendering is performed in which, from among the cameras constituting the virtual viewpoint image generation camera group 110, the camera whose captured image is to be used for coloring the three-dimensional model is determined, and the three-dimensional model is colored appropriately. One way to determine the camera used for coloring is, for example, to generate, for each virtual viewpoint image generation camera, a distance image representing the distance from that camera to each point constituting the three-dimensional model, and to make the determination based on the distance images. Coloring is performed by using the distance images to select which virtual viewpoint image generation camera's captured image supplies the color.
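The description does not specify how the distance images are used, so the following is only one plausible reading, given as a hedged sketch. A model point is treated as visible from a camera when its distance to that camera agrees with the value stored in the camera's distance image at the pixel it projects to, and the nearest visible camera is chosen. The function name `pick_color_camera`, the tolerance, the nearest-camera rule, and the dictionary layout are assumptions.

```python
import math


def pick_color_camera(point, cameras, tolerance=0.05):
    """Return the index of the nearest camera that sees `point`, or None.

    Each camera is a dict with (all hypothetical):
      'position': (x, y, z) of the camera,
      'project':  function mapping a 3D point to a pixel, or None if out of view,
      'depth':    distance image as {pixel: distance to the nearest model surface}.
    """
    best = None
    for index, cam in enumerate(cameras):
        pixel = cam["project"](point)
        if pixel is None or pixel not in cam["depth"]:
            continue
        distance = math.dist(point, cam["position"])
        if distance - cam["depth"][pixel] > tolerance:
            continue  # something nearer occupies this pixel: the point is occluded
        if best is None or distance < best[0]:
            best = (distance, index)
    return None if best is None else best[1]


# Made-up example: camera 0 is closer, but its distance image shows a surface
# 3.0 away at the pixel in question, so the point 5.0 away is hidden from it.
example_cameras = [
    {"position": (0.0, 0.0, -5.0), "project": lambda p: (0, 0), "depth": {(0, 0): 3.0}},
    {"position": (10.0, 0.0, 0.0), "project": lambda p: (0, 0), "depth": {(0, 0): 10.0}},
]
chosen = pick_color_camera((0.0, 0.0, 0.0), example_cameras)
```

Other selection rules, such as preferring the camera whose viewing direction is closest to that of the virtual camera, would fit the same structure.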

The method of generating the virtual viewpoint image is not limited to the method described above.
Instead of a method that generates a three-dimensional model, an image-based image processing method such as morphing or billboarding may be used to generate the virtual viewpoint image.

The virtual viewpoint image generation processing is described here as being performed by aggregating the image data sent from the virtual viewpoint image generation camera group 110 in the image processing apparatus 100, which is a network-connected computer device. The network connection is assumed to be Ethernet (registered trademark), which is the most commonly used in computer networks, but is not limited to Ethernet. The image processing apparatus 100 is implemented by a device such as a personal computer, a workstation, or a server. However, because the computing power required of the computer device depends on the virtual viewpoint image to be generated, the form of the image processing apparatus 100 is not limited to those described above. For example, the image processing apparatus 100 may be composed of a plurality of devices that share the necessary image generation processing. In that case, the devices are connected to one another so that data can be exchanged over the network connection described above.

The physical viewpoint captured image acquisition unit 104 acquires the actual captured image obtained by imaging with the normal imaging camera 120. The normal imaging camera 120 is an actual imaging device placed within the imaging range of the virtual viewpoint image generation camera group 110.

A virtual viewpoint image may not be suitable for showing a person's facial expression in a zoomed-in view. For this reason, the normal imaging camera 120 is used to obtain an image in which the object occupies a larger proportion of the angle of view than in the images from the virtual viewpoint image generation camera group 110. For example, when the object is a person, the normal imaging camera 120 is placed for the purpose of imaging the person's facial expression and the like.

The normal imaging camera 120 is, for example, a camera held by hand by a cameraman, or a camera mounted on a tripod or an imaging crane. The image processing apparatus 100 and the normal imaging camera 120 are connected by, for example, an SDI (Serial Digital Interface) cable. SDI is an interface standard used mainly for professional video equipment. In this embodiment, the normal imaging camera 120 is assumed to be an imaging device held by hand by a cameraman.

The camera information acquisition unit 106 acquires the viewpoint information defining the virtual camera from the virtual viewpoint image generation unit 103. The viewpoint information representing the position and orientation of the virtual camera is output from the virtual viewpoint designation unit 102 to the virtual viewpoint image generation unit 103 and used to generate the virtual viewpoint image. The camera information acquisition unit 106 can therefore acquire the viewpoint information of the virtual camera 304 from the virtual viewpoint image generation unit 103.

The camera information acquisition unit 106 also acquires, from the physical viewpoint captured image acquisition unit 104, viewpoint information representing the position and orientation of the normal imaging camera 120 as information on the normal imaging camera 120.

The way the camera information acquisition unit 106 acquires each piece of viewpoint information described above is only an example. The camera information acquisition unit 106 may acquire the viewpoint information of the virtual camera directly from the virtual viewpoint designation unit 102. It may also acquire the viewpoint information of the normal imaging camera 120 directly from the normal imaging camera 120, in which case the viewpoint information is acquired over the Ethernet connection described above.

The viewpoint information of the normal imaging camera 120 can be acquired by mounting, on the normal imaging camera 120, a sensor device capable of detecting position and orientation. Alternatively, markers may be installed in advance in the space in which the normal imaging camera 120 can move, and the position information of the normal imaging camera 120 may be acquired based on the reflections from the markers obtained by projecting infrared light. The orientation information of the normal imaging camera 120 may also be acquired by additionally using an acceleration sensor, a gyro sensor, or the like.

The position and orientation of the normal imaging camera 120 are adjusted in advance so that they are consistent with the world coordinates used by the virtual viewpoint image generation unit 103. With this adjustment, the viewpoint information of the normal imaging camera 120 is acquired as information representing a position and orientation in the same world coordinate system as the viewpoint information of the virtual camera. As one adjustment method, for example, the virtual camera and the normal imaging camera 120 are set so that they have the same angle of view. The viewpoint information of the normal imaging camera 120 can then be adjusted to the world coordinates based on the relationship, at that time, between the world coordinates and the coordinates of the normal imaging camera 120. Alternatively, the normal imaging camera 120 can be placed in advance at a location whose position in world coordinates is known, and the adjustment can be made from the relationship between the coordinates of the normal imaging camera 120 and the world coordinates.
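The second adjustment method above can be illustrated with a simplified, top-down (two-dimensional) example: one reading taken while the camera stands at a known world pose is enough to fix a rotation and a translation between the sensor's coordinates and the world coordinates. This is a sketch under assumptions. The functions `fit_alignment` and `to_world`, the restriction to a planar pose (x, y, yaw), and the sample values are hypothetical, and a real system would likely need a full three-dimensional transform and several reference poses.

```python
import math


def fit_alignment(sensor_pose, world_pose):
    """Return (rotation, tx, ty) mapping sensor coordinates to world coordinates.

    Both poses are (x, y, yaw_in_radians) of the same physical camera placement:
    one as reported by the tracking sensor, one as known in world coordinates.
    """
    sx, sy, s_yaw = sensor_pose
    wx, wy, w_yaw = world_pose
    rotation = w_yaw - s_yaw
    cos_r, sin_r = math.cos(rotation), math.sin(rotation)
    tx = wx - (cos_r * sx - sin_r * sy)
    ty = wy - (sin_r * sx + cos_r * sy)
    return rotation, tx, ty


def to_world(alignment, sensor_pose):
    """Convert a later sensor reading into a pose in world coordinates."""
    rotation, tx, ty = alignment
    x, y, yaw = sensor_pose
    cos_r, sin_r = math.cos(rotation), math.sin(rotation)
    return (cos_r * x - sin_r * y + tx, sin_r * x + cos_r * y + ty, yaw + rotation)


# Made-up calibration: the sensor reports its origin while the camera stands
# at world position (2, 3), facing 90 degrees.
alignment = fit_alignment((0.0, 0.0, 0.0), (2.0, 3.0, math.pi / 2))
world_pose = to_world(alignment, (1.0, 0.0, 0.0))
```

After this step, the pose of the normal imaging camera 120 and the pose of the virtual camera can be compared directly because both are expressed in the same coordinate system.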

Based on the viewpoint information acquired by the camera information acquisition unit 106, the similarity calculation unit 107 calculates a degree of similarity used to determine whether the viewpoint of the virtual camera and the viewpoint of the normal imaging camera 120 are similar. Details will be described later.

The output control unit 108 instructs the image switching device 130 to perform output control according to the degree of similarity calculated by the similarity calculation unit 107. For example, when the viewpoints are determined to be similar based on the degree of similarity, the output control unit 108 instructs the image switching device 130 to output an image under the output control that applies to similar viewpoints. Details will be described later.

The image switching device 130 acquires the virtual viewpoint image from the image processing apparatus 100. It also acquires, from the normal imaging camera 120, the captured image of the normal imaging camera 120 that corresponds to that virtual viewpoint image. The image switching device 130 acquires the virtual viewpoint image and the captured image via SDI cables. The image switching device 130 is an output device (output unit) that outputs one of the acquired images. The image output from the image switching device 130 is displayed on a display unit viewed by the viewer. Specifically, the image switching device 130 outputs either the acquired virtual viewpoint image or the acquired captured image to broadcasting equipment, a distribution server, or the like (not shown).

In this embodiment, the image switching device 130 performs control so that a captured image obtained by imaging the object with the normal imaging camera 120 at a high zoom magnification and a virtual viewpoint image from a desired virtual viewpoint are switched as appropriate and displayed on the display unit. Displaying the captured image of the normal imaging camera 120 and the virtual viewpoint image while switching between them makes it possible to provide the viewer with images that give a greater sense of presence. That is, images that capture facial expressions can be displayed as appropriate while taking advantage of the high degree of freedom of virtual viewpoint images.

The image switching device 130 is implemented by, for example, a device called a switcher. Normally, a user switches the image to be output by operating a switch provided on the switcher. In this embodiment, in addition to switching instructions from the user, the switching of the output image of the image switching device 130 is controlled based on instructions from the output control unit 108 of the image processing apparatus 100.

Each functional unit of the image processing apparatus 100 shown in FIG. 1 is described as being realized by the CPU 201 of the image processing apparatus (see FIG. 2, described later) executing a predetermined program, but this is not a limitation. For example, hardware such as a GPU (Graphics Processing Unit) or an FPGA (Field Programmable Gate Array) may be used to speed up computation. That is, each functional unit of the image processing apparatus 100 may be realized by cooperation between software and hardware such as a dedicated IC, or some or all of the functions may be realized by hardware alone. A configuration may also be used in which a plurality of image processing apparatuses 100 distribute and execute the processing of the functional units.

[Hardware configuration of image processing apparatus]
FIG. 2 is a block diagram showing a hardware configuration example of the image processing apparatus 100. The image processing apparatus 100 has a CPU 201, a RAM 202, a ROM 203, an external storage device 204, and an I/F 205.

The CPU 201 controls the entire computer using computer programs and data stored in the RAM 202 and the ROM 203.

The RAM 202 has an area for temporarily storing computer programs and data loaded from the external storage device 204, data acquired from outside via the I/F (interface) 205, and the like. The RAM 202 also has a work area used when the CPU 201 executes various processes. That is, the RAM 202 can, for example, be allocated as frame memory or provide various other areas as appropriate. The ROM 203 stores setting data of the computer, a boot program, and the like.

The external storage device 204 is a large-capacity information storage device typified by a hard disk drive. The external storage device 204 stores an OS (operating system) and computer programs for causing the CPU 201 to realize the functions of the image processing apparatus 100 shown in FIG. 1. The external storage device 204 may also store image data to be processed. The computer programs and data stored in the external storage device 204 are loaded into the RAM 202 as appropriate under the control of the CPU 201 and are processed by the CPU 201.

The I/F 205 is an interface for connecting to a network such as a LAN or the Internet, and the image processing apparatus 100 can acquire or transmit various information via the I/F 205. A display unit, an operation unit, and other devices (not shown) can also be connected via the I/F 205. A bus 206 connects the units described above.

The display unit (not shown) is composed of, for example, a liquid crystal display or LEDs, and displays a GUI (Graphical User Interface) and the like for the user to operate the image processing apparatus 100. The operation unit (not shown) is composed of, for example, a keyboard, a mouse, a joystick, or a touch panel, and inputs various instructions to the CPU 201 in response to user operations. The CPU 201 also operates as a display control unit that controls the display unit and as an operation control unit that controls the operation unit.

The hardware configuration of the image switching device 130 is the same as that shown in FIG. 2, so its description is omitted.

[Camera arrangement]
FIG. 3 is a diagram showing the imaging range 300 of the virtual viewpoint image generation camera group 110 as viewed from above. For example, the imaging range 300 is a studio where an object 303 such as a singer or a dancer performs. As shown in FIG. 3, the virtual viewpoint image generation camera group 110 is arranged around the studio and images the studio from various angles in time synchronization. As a result, captured images from a plurality of viewpoints are obtained.

The imaging range 300 also contains the normal imaging camera 120 and a cameraman 308 who is imaging with the normal imaging camera 120. That is, in the studio, imaging by the normal imaging camera 120 and imaging by the virtual viewpoint image generation camera group 110 are performed at the same time.

The normal imaging camera 120 is located within the imaging range of at least one of the cameras that constitute the virtual viewpoint image generation camera group 110. Therefore, depending on its position, the normal imaging camera 120 may appear as a subject within the angle of view of one or more cameras of the virtual viewpoint image generation camera group 110.

The virtual camera 304 in FIG. 3 represents the position and orientation of the virtual camera specified by the operator of the virtual camera. A direction 305 represents the orientation of the virtual camera 304 in two dimensions. In this case, if a virtual viewpoint image is generated from the virtual camera 304, the generated image contains not only the intended object 303 but also the normal imaging camera 120 and the cameraman 308. Because the normal imaging camera 120 and the cameraman 308 may then be a visual distraction, processing may be required to keep them from appearing in the virtual viewpoint image.

For example, the virtual viewpoint image could be generated without using the data of those cameras in the virtual viewpoint image generation camera group 110 whose angle of view includes the normal imaging camera 120 and the cameraman 308. Processing in this way is one conceivable method of generating a virtual viewpoint image that does not include the normal imaging camera 120 and the cameraman 308. FIG. 3 shows that the angle of view 307 of the camera 110b, which is part of the virtual viewpoint image generation camera group 110, includes the normal imaging camera 120 and the cameraman 308. In the example of FIG. 3, it is therefore conceivable to generate the virtual viewpoint image while excluding the data of the camera 110b. However, the angle of view 307 of the camera 110b also includes the object 303. If the virtual viewpoint image is generated without the data of the camera 110b, which was able to capture the object 303, the quality of the virtual viewpoint image may deteriorate.

In addition, the cameraman 308 may capture images while moving frequently. In this case, even if the object 303 is stationary, the set of cameras used to generate the virtual viewpoint image changes every few frames. Consequently, when the data of the camera 110b that was able to capture the object 303 becomes unusable, the virtual viewpoint image changes from frame to frame, and when the virtual viewpoint image is displayed as a moving image it may look unnatural to the viewer.

Therefore, in the present embodiment, whether the two viewpoints are similar is determined based on viewpoint information representing the position and orientation of the virtual camera and viewpoint information representing the position and orientation of the normal imaging camera 120. A method is then described for performing control so that, when the viewpoints are similar, the image captured by the normal imaging camera 120 is output. Such control makes it possible to provide an image from the virtual camera while suppressing the appearance of anything other than the intended object, such as a cameraman, in the output.

To simplify the explanation, FIG. 3 shows a two-dimensional example in which the imaging target is viewed from above, but the same approach applies when the parameters are handled in three dimensions.

[Similarity]
Based on the viewpoint information of the virtual camera 304 and the viewpoint information of the normal imaging camera 120, the similarity calculation unit 107 calculates a similarity, which is a value representing the degree to which the virtual viewpoint (the viewpoint of the virtual camera 304) and the viewpoint of the normal imaging camera 120 resemble each other. An example of the similarity calculation is described with reference to FIG. 3.

In FIG. 3, a direction 309 represents the attitude (orientation) of the normal imaging camera 120 in two dimensions. The direction 305 represents the attitude (orientation) of the virtual camera 304 in two dimensions. An angle θ 310 indicates the difference between the orientation of the virtual camera 304 and the orientation of the normal imaging camera 120 (the angle formed by the direction 305 and the direction 309). A distance 311 indicates the distance from the position of the virtual camera 304 to the position of the normal imaging camera 120. In this embodiment, the distance 311 and the angle 310 are calculated as the viewpoint similarity. The distance 311 and the angle 310 are calculated from the respective viewpoint information.

By comparing the calculated similarity with predetermined thresholds, it can be determined whether the viewpoint of the virtual camera 304 and the viewpoint of the normal imaging camera 120 are similar. In this embodiment, when the distance 311 is smaller than a predetermined first threshold and the angle 310 is smaller than a predetermined second threshold, the viewpoint of the virtual camera 304 and the viewpoint of the normal imaging camera 120 are determined to be similar.
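
The calculation and the threshold test described above can be sketched as follows for the two-dimensional case of FIG. 3. This is a minimal illustration, not the implementation of the embodiment: the function names, the tuple representation of positions and orientation vectors, and the use of radians are assumptions, and both orientation vectors are assumed to be non-zero.

```python
import math

def viewpoint_similarity(virtual_pos, virtual_dir, camera_pos, camera_dir):
    """Return (distance, angle_in_radians) between two 2D viewpoints.

    Positions and orientation vectors are (x, y) tuples; the names and
    the data layout are hypothetical, chosen only for this sketch.
    """
    # Distance 311: Euclidean distance between the two camera positions.
    distance = math.dist(virtual_pos, camera_pos)
    # Angle 310: angle between the two orientation vectors,
    # obtained from the normalized dot product.
    dot = virtual_dir[0] * camera_dir[0] + virtual_dir[1] * camera_dir[1]
    norm = math.hypot(*virtual_dir) * math.hypot(*camera_dir)
    cos_theta = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return distance, math.acos(cos_theta)

def is_similar(distance, angle, first_threshold, second_threshold):
    """Similar only if BOTH values are below their thresholds."""
    return distance < first_threshold and angle < second_threshold
```

The same structure carries over to three dimensions by using three-component positions and orientation vectors.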

[Image output control]
FIG. 4 is a flowchart for explaining the flow of the image output control processing. The series of processes shown in the flowchart of FIG. 4 is performed by the CPU 201 of the image processing apparatus 100 loading the program code stored in the ROM 203 into the RAM 202 and executing it. Some or all of the functions of the steps in FIG. 4 may instead be realized by hardware such as an ASIC or an electronic circuit. The symbol "S" in the description of each process denotes a step in the flowchart, and the same applies to the subsequent flowcharts.

The description below assumes that virtual viewpoint images are generated in parallel with the processing of the flowchart of FIG. 4, but the virtual viewpoint images for all frames may instead be generated in advance, before the flowchart of FIG. 4 starts.

In S401, the camera information acquisition unit 106 acquires viewpoint information (first viewpoint information) representing the position and orientation of the virtual camera from the virtual viewpoint image generation unit 103. In this step, the viewpoint information acquired is that of the virtual camera corresponding to the virtual viewpoint image that would be output if the image switching device 130 switched its output to the virtual camera. Since the image output control processing is performed repeatedly, the next time this step is executed, the viewpoint information of the virtual camera corresponding to the virtual viewpoint image to be output next is acquired.

In S402, the camera information acquisition unit 106 acquires viewpoint information (second viewpoint information) representing the position and orientation of the normal imaging camera 120. The viewpoint information acquired in this step is that of the normal imaging camera 120 at the time it obtained the captured image corresponding to the virtual viewpoint image of S401. For example, the viewpoint information of the normal imaging camera 120 at the moment it captured an image at the same time as the virtual viewpoint image based on the virtual viewpoint acquired in S401 is acquired.

In S403, the similarity calculation unit 107 calculates the similarity between the two viewpoints based on the acquired viewpoint information of the normal imaging camera 120 and of the virtual camera. As described above, in this embodiment the angle indicating the difference between the orientation of the virtual camera and the orientation of the normal imaging camera 120, and the distance between the position of the virtual camera and the position of the normal imaging camera 120, are calculated as the similarity.

In S404, the output control unit 108 determines, based on the similarity derived in S403, whether the viewpoint of the normal imaging camera 120 and the viewpoint of the virtual camera are similar. If there are a plurality of normal imaging cameras 120, it determines whether any of them has a viewpoint similar to that of the virtual camera. The subsequent processing branches according to the result of this determination.

If it is determined that there is a normal imaging camera 120 whose viewpoint is similar to the virtual viewpoint (YES in S404), the process proceeds to S405, and the output control unit 108 instructs the image switching device 130 to turn ON the output control based on the similarity.

The output control based on the similarity is control that automatically switches the output image to the image captured by the normal imaging camera 120. If the image captured by the normal imaging camera 120 is already being output, the control keeps outputting that image and does not accept a user instruction to switch the output to the virtual viewpoint image. As a result, an image that has an angle of view similar to that of the virtual viewpoint image, and that does not include the normal imaging camera 120 and the cameraman 308, can be output and displayed.

On the other hand, if it is determined that there is no normal imaging camera 120 whose viewpoint is similar to the virtual viewpoint (NO in S404), the process proceeds to S406, and the output control unit 108 instructs the image switching device 130 to turn OFF the output control based on the similarity. The image switching device 130 then outputs the image that corresponds to the user's switching instruction.
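
The ON/OFF behavior of S405 and S406 can be illustrated with the following sketch for the case of one virtual camera and one normal imaging camera. The class `ImageSwitcher`, its method names, and the source labels `"virtual"` and `"camera"` are hypothetical stand-ins introduced for illustration; they are not defined in this description.

```python
from dataclasses import dataclass

@dataclass
class ImageSwitcher:
    """Hypothetical stand-in for the image switching device 130."""
    similarity_control_on: bool = False
    output: str = "camera"  # "camera" (normal imaging camera) or "virtual"

    def set_similarity_control(self, on: bool) -> None:
        # Corresponds to the S405 (ON) / S406 (OFF) instruction.
        self.similarity_control_on = on
        if on:
            # ON: automatically switch to the normal camera's captured image.
            self.output = "camera"

    def request(self, source: str) -> str:
        # While the control is ON, a user request to switch to the
        # virtual viewpoint image is not accepted.
        if self.similarity_control_on and source == "virtual":
            return self.output
        # Otherwise the user's switching instruction is followed.
        self.output = source
        return self.output
```

In this sketch, turning the control OFF does not change the current output by itself; the output changes only on the next user instruction.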

The image processing apparatus 100 repeats the processes of S401 to S406 described above at a predetermined cycle (a constant time interval) for a predetermined period. That is, each time a predetermined number of frames has been output, the processes of S401 to S406 are performed again to decide whether the output control based on the similarity is instructed ON or OFF for the next predetermined number of frames. The shorter the interval between executions of S401 to S406, that is, the more frequently the ON/OFF decision is made, the better the control follows the movement of the virtual camera and the normal imaging camera 120, at the cost of a higher computational load.
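
The effect of the decision interval can be sketched as follows. The function `hold_decisions` and its per-frame boolean input are assumptions made for illustration: the input stands for what S404 would decide if it were evaluated at every frame, and the output shows the ON/OFF state that actually applies when the decision is made only once per interval and held in between.

```python
def hold_decisions(similar_per_frame, interval):
    """Re-evaluate the ON/OFF decision every `interval` frames and hold it.

    similar_per_frame: list of booleans, the S404 result each frame would
    give if it were evaluated (hypothetical input for this sketch).
    """
    if interval < 1:
        raise ValueError("interval must be at least 1 frame")
    decisions = []
    current = False
    for index, similar in enumerate(similar_per_frame):
        if index % interval == 0:
            # S401 to S406 run only at the start of each interval.
            current = similar
        decisions.append(current)
    return decisions
```

With an interval of 1 the control follows every change in the viewpoints, while a longer interval evaluates the similarity less often and reacts later.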

A plurality of virtual cameras may be designated, and a plurality of normal imaging cameras may exist. The control of this embodiment can be performed in these cases as well.

FIG. 5 is a diagram showing a viewpoint comparison table 500 that holds the results of comparing the viewpoints when a plurality of virtual cameras and a plurality of normal imaging cameras exist. The table in FIG. 5 covers the case of four virtual cameras (virtual cameras 1 to 4) and two normal imaging cameras (normal imaging cameras 1 and 2). When there are a plurality of virtual cameras, the similarity calculation unit 107 calculates, for each virtual camera, the viewpoint similarity with each normal imaging camera. The output control unit 108 then compares each similarity with the thresholds, determines for each pair whether the viewpoint of the virtual camera and the viewpoint of the normal imaging camera are similar, and stores the results in the viewpoint comparison table.

The output control unit 108 instructs the image switching device 130 to perform output control according to the viewpoint comparison table 500 as the output control based on the similarity. For example, in the flowchart of FIG. 4, after the similarity is calculated in S403, the output control unit 108 generates the viewpoint comparison table 500 in S404 and determines whether there is a normal imaging camera whose viewpoint is similar to that of a virtual camera. If such a camera is found, in S405 the output control unit 108 instructs the image switching device 130 to perform output control according to the viewpoint comparison table 500 as the output control based on the similarity. Alternatively, instead of the processing of S404 to S406, the output control unit 108 may simply generate the viewpoint comparison table 500 and instruct the image switching device 130 to perform output control according to it.

An example of the output control according to the viewpoint comparison table 500, as executed in the image switching device 130, is described with reference to FIG. 5. As shown in FIG. 5, the viewpoint of normal imaging camera 1 has been determined to be similar to the viewpoint of virtual camera 1, and the viewpoint of normal imaging camera 2 has been determined not to be similar. In this situation, when the user gives an instruction to output the virtual viewpoint image corresponding to virtual camera 1, the image switching device 130 performs output control that automatically switches the output to the image captured by normal imaging camera 1, whose viewpoint is similar to that of virtual camera 1. Output control is also performed so that the user can switch from normal imaging camera 1 to normal imaging camera 2 at any time.

When the user gives an instruction to output the virtual viewpoint image corresponding to virtual camera 2, the image switching device 130 performs output control that automatically switches the output to the image captured by normal imaging camera 2, whose viewpoint is similar to that of virtual camera 2. Output control is also performed so that the user can switch the output from the image captured by normal imaging camera 2 to the image captured by normal imaging camera 1 at any time.

The viewpoint of virtual camera 3 has been determined not to be similar to the viewpoint of either normal imaging camera. In this situation, when the user gives an instruction to output the virtual viewpoint image corresponding to virtual camera 3, the image switching device 130 performs output control so that the output switches to the virtual viewpoint image of virtual camera 3 at the timing chosen by the user.

The viewpoint of virtual camera 4 is similar to the viewpoints of both normal imaging cameras. In this situation, when the user gives an instruction to output the virtual viewpoint image corresponding to virtual camera 4, the image switching device 130 performs output control that automatically switches the output to the image captured by one of the normal imaging cameras. To decide between normal imaging camera 1 and normal imaging camera 2, priorities may be assigned to the normal imaging cameras in advance, and the output switched to the image captured by the camera with the higher priority. Alternatively, instead of a yes/no similarity determination, the normal imaging camera whose viewpoint is more similar to that of the virtual camera may be identified from the similarity values, and the output switched to the image captured by that camera. For example, the output is switched to the image captured by the normal imaging camera that is at the smaller distance from the virtual camera.
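
The table-driven control for virtual cameras 1 to 4 can be sketched as below. The dictionary layout, the camera labels, and both function names are assumptions for illustration. The table values follow the description of FIG. 5 given in the text; the entry for virtual camera 2 and normal imaging camera 1 is inferred from that description rather than stated explicitly.

```python
# True means "viewpoints determined to be similar" (per the text on FIG. 5).
FIG5_TABLE = {
    "virtual1": {"normal1": True, "normal2": False},
    "virtual2": {"normal1": False, "normal2": True},   # normal1 entry inferred
    "virtual3": {"normal1": False, "normal2": False},
    "virtual4": {"normal1": True, "normal2": True},
}

def resolve_by_priority(virtual_cam, table, priority):
    """Return the source to output when `virtual_cam` is requested.

    `priority` lists normal imaging cameras, highest priority first.
    """
    for cam in priority:
        if table[virtual_cam].get(cam, False):
            return cam          # auto-switch to the similar normal camera
    return virtual_cam          # no similar camera: follow the user request

def resolve_by_distance(virtual_cam, table, distances):
    """Variant: among similar cameras, pick the one nearest the virtual camera."""
    candidates = [cam for cam, similar in table[virtual_cam].items() if similar]
    if not candidates:
        return virtual_cam
    return min(candidates, key=lambda cam: distances[cam])
```

For virtual camera 4, where both normal imaging cameras are similar, the first function applies the preset priority and the second applies the distance-based alternative mentioned above.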

As described above, according to the present embodiment, when the output is switched between the virtual viewpoint image and the image captured by the normal imaging camera, deterioration in the quality of the displayed image can be suppressed even when a cameraman or the like would appear within the angle of view of the virtual camera.

The description above covered a method in which the similarity used to determine whether the viewpoint of the virtual camera and the viewpoint of the normal imaging camera are similar is calculated from viewpoint information representing the position and orientation of each camera. Alternatively, whether the two viewpoints are similar may be determined based on the image similarity between the virtual viewpoint image and the image captured by the normal imaging camera.

For example, the camera information acquisition unit 106 acquires the virtual viewpoint image from the virtual viewpoint image generation unit 103 and the image captured by the normal imaging camera from the physical viewpoint captured image acquisition unit 104. The camera information acquisition unit 106 outputs the acquired virtual viewpoint image and captured image to the similarity calculation unit 107. The similarity calculation unit 107 calculates the image similarity from the image data of the virtual viewpoint image and of the image captured by the normal imaging camera. The output control unit 108 determines that the viewpoints are similar when the image similarity exceeds a predetermined threshold. The output control performed when the viewpoints are determined to be similar is the same as described above.

One method of calculating the image similarity is, for example, to extract feature points from the images and use the degree to which the feature points match as the similarity. Alternatively, objects may be made identifiable in advance by a machine learning technique; the objects are then identified in both the virtual viewpoint image and the image captured by the normal imaging camera, and the similarity is calculated by comparing the positional relationships of the objects. These calculation methods are examples, and this embodiment does not limit the method used to calculate the image similarity.

The viewpoint similarity based on the viewpoint information of the virtual camera and the normal imaging camera may also be combined with the image similarity between the virtual viewpoint image and the image captured by the normal imaging camera to determine whether the viewpoints are similar. Calculating the image similarity places a higher computational load on the image processing apparatus 100 than calculating the viewpoint similarity. For this reason, the image similarity may be calculated only when, based on the viewpoint information, the distance 311 is determined to be smaller than the predetermined first threshold and the angle 310 smaller than the predetermined second threshold. The viewpoints may then be determined to be similar when the image similarity is higher than a predetermined threshold. Determining similarity in two stages in this way improves the accuracy of the determination while limiting the load.
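
The two-stage determination can be sketched as follows. The function name and parameters are assumptions for illustration. In particular, `compute_image_similarity` is a hypothetical zero-argument callable standing in for whichever image similarity method is used; passing it as a callable keeps the expensive computation from running unless the inexpensive viewpoint test passes first.

```python
def two_stage_similar(distance, angle, first_threshold, second_threshold,
                      compute_image_similarity, image_threshold):
    """Stage 1: viewpoint test. Stage 2: image test, only if stage 1 passes."""
    # Stage 1: low-cost check on distance 311 and angle 310.
    if not (distance < first_threshold and angle < second_threshold):
        return False  # the image similarity is never computed in this case
    # Stage 2: higher-cost image similarity, evaluated on demand.
    return compute_image_similarity() > image_threshold
```

A caller would typically wrap the two images in a closure, for example `lambda: similarity(virtual_image, captured_image)`, where `similarity` is again a placeholder for the chosen method.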

<Second Embodiment>
The first embodiment described, as the output control executed by the output control unit 108, a method of performing control so that the image captured by a normal imaging camera is output when there is a normal imaging camera whose viewpoint is similar to that of the virtual camera. This embodiment describes a method in which, when there is a normal imaging camera whose viewpoint is similar to that of the virtual camera, the user who operates the switch of the image switching device 130 to switch the output image is notified of that fact. This embodiment is described with a focus on the differences from the first embodiment. Parts that are not specifically mentioned have the same configuration and processing as in the first embodiment.

This embodiment describes a case in which two virtual cameras, virtual camera 1 and virtual camera 2, are designated and two normal imaging cameras, normal imaging camera 1 and normal imaging camera 2, exist.

FIG. 6 is a diagram showing a screen 601 displayed on a display unit (not shown) that the user of the image switching device 130 can view. The display unit (not shown) on which the screen 601 is displayed is assumed to be connected to the image switching device 130.

The screen 601 includes screens 602 to 605, on which the virtual viewpoint images and captured images of the same time are displayed. For example, when the object 303 is being captured simultaneously by four cameras, namely two virtual cameras and two normal imaging cameras, the screen 601 is displayed on the display unit (not shown) of the image switching device 130. The user who switches the output image on the image switching device 130 can select an image while viewing the screen 601 and switch the output to that image.

The screen 602 displays the image captured by normal imaging camera 1. The screen 603 displays the image captured by normal imaging camera 2. The screen 604 displays the virtual viewpoint image representing the view from virtual camera 1. The screen 605 displays the virtual viewpoint image representing the view from virtual camera 2.

The screen 604 shows the image currently being output from the image switching device 130. That is, the image displayed on the screen 604 is the image being output in accordance with the user's instruction. The display is controlled so that the frame of the screen 604 is emphasized with a thicker line than the frames of the other screens 602, 603, and 605. Although not depicted in FIG. 6, the screen 604 is also colored differently from the other screens. For example, the screen 604 is surrounded by a thick red frame. This makes it easy for the user to confirm that the image currently being output by the image switching device 130 is the one displayed on the screen 604.

The output control of this embodiment, which is executed by the image switching device 130 based on instructions from the output control unit 108, is described with reference to FIG. 6. As described above, the image whose output is instructed by the user operating the image switching device 130 is assumed to be the virtual viewpoint image of virtual camera 1. Of normal imaging camera 1 and normal imaging camera 2, normal imaging camera 1 is assumed to have been determined to have a viewpoint similar to that of virtual camera 1. In this case, on the screen 604 of the currently selected virtual camera 1, a notification 607 indicating that a normal imaging camera with a similar viewpoint exists is displayed superimposed on the virtual viewpoint image.

The notification 607 to the user operating the image switching device 130 may include information identifying the similar normal imaging camera. That is, the notification 607 may state that the viewpoint of normal imaging camera 1 is similar to that of virtual camera 1.

As a method of notifying the user, instead of or in addition to the notification 607, a notification by sound or light may be given using a mechanism that emits sound or light to draw the user's attention.

This embodiment has been described for the case in which a plurality of cameras exist. Accordingly, as described in the first embodiment, the output control unit 108 may generate the viewpoint comparison table 500 and instruct the image switching device 130 to perform output control based on the viewpoint comparison table 500.

Alternatively, there may be only one virtual camera and one normal imaging camera. In that case, in the flowchart of FIG. 4, when the instruction to turn ON the output control based on the similarity is given in S405, the image switching device 130 notifies the user. When the instruction to turn OFF the output control based on the similarity is given in S406, control is performed so that the user is not notified.

Note that the display unit that displays the screen 601 may be connected to the image processing apparatus 100. In this case, the CPU 201 of the image processing apparatus 100 performs display control so that the screen 601 is displayed, based on instructions from the output control unit 108.

As described above, according to the present embodiment, the output image is switched in response to a user instruction. Therefore, even when a normal imaging camera with a similar viewpoint exists, the user can adjust the switching timing, for example by deliberately not switching the output.

<Third Embodiment>
This embodiment describes a method in which, when a predetermined condition is satisfied, the calculation of the similarity is skipped and the image switching device 130 is instructed to turn OFF the output control based on the similarity. The description focuses on the differences from the first embodiment. Parts not specifically mentioned have the same configuration and processing as in the first embodiment. The functional configuration of the image processing apparatus 100 differs from that of the first embodiment in the processing performed by the similarity calculation unit 107.

FIG. 7 is a flowchart for explaining the image output control processing in this embodiment. The series of processes shown in the flowchart of FIG. 7 is performed by the CPU 201 of the image processing apparatus 100 loading the program code stored in the ROM 203 into the RAM 202 and executing it. S702 to S707 are the same processes as S401 to S406 shown in FIG. 4. In this embodiment, before S702 to S707 are performed, the similarity calculation unit 107 determines in S701 whether a condition for not executing the viewpoint similarity calculation (exclusion condition) is satisfied.

The exclusion condition is satisfied, for example, when the relative speed of the object with respect to the virtual camera exceeds a predetermined threshold. Alternatively, the exclusion condition may also be determined to be satisfied when the proportion of the entire virtual viewpoint image occupied by the region representing the object falls below a predetermined threshold.

Even when the normal imaging camera 120 or the cameraman 308 is included in the angle of view of the virtual camera, the viewer is unlikely to notice the degradation in image quality of the virtual viewpoint image if, for example, the object 303 is moving fast or occupies only a small proportion of the image. In such cases, it suffices to switch and display images as instructed by the user, so the output control based on the similarity is preferably turned OFF. Therefore, in this embodiment, when the exclusion condition is satisfied (YES in S701), the process proceeds to S707. S707 is the same process as S406. As with the flowchart of FIG. 4 in the first embodiment, the processing of the flowchart of FIG. 7 is repeatedly executed at regular time intervals.

FIG. 8 is a diagram showing the imaging range 300 of the virtual viewpoint image generation camera group 110 as viewed from above. With reference to FIG. 8, an example of the processing of S701 will be described in which the exclusion condition is determined to be satisfied when the relative speed of the object with respect to the virtual camera exceeds a predetermined threshold.

FIG. 8 shows a state in which the normal imaging camera 120 and the virtual camera 304 are imaging the object 303 at the same time, as in FIG. 3. As described above, S701 to S707 are repeatedly executed at regular time intervals. Position 801 is the position of the virtual camera 304 obtained in the previous execution of S701, and position 802 is the position of the virtual camera 304 obtained in the current execution of S701. FIG. 8 thus shows that the virtual camera 304 has been specified so as to image the object 303 while moving. FIG. 8 also shows that the object 303 was at position 803 at the previous execution of S701 and at position 804 at the current execution of S701.

In this example, the similarity calculation unit 107 acquires and holds the position of the virtual camera 304 in S701. Therefore, in this embodiment, when it is determined in S701 that the exclusion condition is not satisfied, the position of the virtual camera 304 need not be determined in S704. In S701, information on the orientation of the virtual camera may be acquired in addition to its position. In this case, S703 may be skipped.

Then, in S701, the similarity calculation unit 107 calculates the moving speed of the virtual camera. Because the regular time interval at which S701 to S707 in the flowchart of FIG. 7 are executed is set in advance, the moving speed of the virtual camera can be calculated by obtaining the distance the virtual camera has moved from the change in its position.

Next, in S701, the similarity calculation unit 107 acquires the position of the object 303 and calculates the moving speed of the object 303. As with the virtual camera, the moving speed can be calculated by obtaining the movement distance from the change in the position of the object 303.

In the process in which the virtual viewpoint image generation unit 103 generates a three-dimensional model of the object in order to generate the virtual viewpoint image, the position of the three-dimensional model of the object in world coordinates is obtained. The similarity calculation unit 107 can therefore acquire the approximate position of the object from the virtual viewpoint image generation unit 103.

When a three-dimensional model of the object is not generated, the object is identified from the images of a plurality of cameras in the virtual viewpoint image generation camera group 110. The position of the object can then be calculated from the known positional relationship among the virtual viewpoint image generation cameras. The technique for identifying the object from an image is not limited. For example, a technique for separating a moving object from the background, or a method of learning the object in advance by machine learning and identifying it, may be used.

Then, in S701, the similarity calculation unit 107 calculates the relative speed of the object with respect to the virtual camera based on the moving speed of the virtual camera and the moving speed of the object. The similarity calculation unit 107 then determines whether the relative speed of the object with respect to the virtual camera exceeds a predetermined threshold. If the threshold is exceeded, the exclusion condition is determined to be satisfied.

When the relative speed of the object with respect to the virtual camera exceeds a predetermined threshold, the viewer of the image is unlikely to notice the degradation in image quality, so the output control based on the similarity described in the above embodiments is unnecessary. Therefore, when it is determined in S701 that the exclusion condition is satisfied, the output control unit 108 instructs the image switching device 130 to turn OFF the output control based on the similarity.

When there are two or more objects (for example, two or more people), the relative speeds of all the objects may be calculated, for example, and the exclusion condition may be determined in S701 to be satisfied when all of the relative speeds exceed the threshold. Alternatively, the relative speeds of the objects located within a predetermined range from the position of the virtual camera may be calculated, and the exclusion condition may be determined in S701 to be satisfied when all of the calculated relative speeds exceed the threshold.
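The speed-based exclusion check of S701 described above can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the implementation of the disclosure: positions are assumed to be 3D tuples in world coordinates, the relative speed is assumed to be the magnitude of the difference between the object's and the virtual camera's velocity vectors, and the function names (`velocity`, `relative_speed`, `exclusion_by_speed`) are hypothetical.

```python
import math

def velocity(prev_pos, curr_pos, interval_s):
    """Velocity vector from two positions sampled interval_s seconds apart."""
    return tuple((c - p) / interval_s for p, c in zip(prev_pos, curr_pos))

def relative_speed(cam_prev, cam_curr, obj_prev, obj_curr, interval_s):
    """Speed of the object as seen from the virtual camera.

    Assumption: relative speed = |v_object - v_camera|.
    """
    cam_v = velocity(cam_prev, cam_curr, interval_s)
    obj_v = velocity(obj_prev, obj_curr, interval_s)
    return math.hypot(*(o - c for o, c in zip(obj_v, cam_v)))

def exclusion_by_speed(cam_prev, cam_curr, objects, interval_s, threshold):
    """True only if every object's relative speed exceeds the threshold.

    objects: list of (previous_position, current_position) pairs.
    An empty list is treated as "condition not satisfied" (an assumption).
    """
    if not objects:
        return False
    return all(
        relative_speed(cam_prev, cam_curr, prev, curr, interval_s) > threshold
        for prev, curr in objects
    )
```

For example, with positions such as 801/802 for the virtual camera and 803/804 for the object, `exclusion_by_speed(pos_801, pos_802, [(pos_803, pos_804)], interval_s, threshold)` would return the S701 decision for a single object. Restricting `objects` to those within a predetermined range of the virtual camera gives the second variant mentioned above.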

FIG. 9 is a diagram showing the imaging range 300 of the virtual viewpoint image generation camera group 110 as viewed from above, together with a virtual viewpoint image 901 corresponding to the virtual camera 304. Next, with reference to FIG. 9, an example of the processing of S701 will be described in which the exclusion condition is determined to be satisfied when the proportion of the virtual viewpoint image occupied by the region representing the object is smaller than a threshold.

FIG. 9 shows a state in which the normal imaging camera 120 and the virtual camera 304 are imaging the object 303 at the same time. The virtual camera 304 in FIG. 9 has been specified so as to be positioned outside the imaging range 300 of the virtual viewpoint image generation camera group 110. In this way, the virtual camera 304 can be moved virtually anywhere within the region in which world coordinates are defined, so this is an ordinary position for the virtual camera 304. It is also possible to move the virtual camera 304 to a position that overlaps the object 303.

In FIG. 9, the virtual camera 304 is imaging the object 303, and the virtual viewpoint image 901 is an image showing the angle of view of the virtual camera 304 in FIG. 9. Because the virtual camera 304 in FIG. 9 is imaging the object 303 in a wide (pulled-back) shot, the object 303 occupies only a small proportion of the entire virtual viewpoint image 901. When the object occupies a small proportion of the entire image, as in the virtual viewpoint image 901, the viewer is unlikely to notice the degradation in image quality. Therefore, when it is determined that the proportion of the entire virtual viewpoint image occupied by the region representing the object included in the virtual viewpoint image is smaller than the threshold, the exclusion condition is determined to be satisfied and the process proceeds to S707.

The proportion of the entire virtual viewpoint image (the angle of view of the virtual camera) occupied by the region representing the object can be calculated, for example, in the course of the virtual viewpoint image generation processing. Specifically, after generating the three-dimensional model of the object, the virtual viewpoint image generation unit 103 applies a perspective projection transformation to the vertices of the rectangular cuboid circumscribing the three-dimensional model, projecting them into the virtual camera coordinates of the virtual viewpoint image. The proportion of the entire virtual viewpoint image occupied by the region representing the object can be obtained in this way. The proportion obtained by this method is output from the virtual viewpoint image generation unit 103 and can be acquired by the similarity calculation unit 107.
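The projection-based estimate described above might be sketched as follows. The sketch makes several simplifying assumptions that are not stated in the disclosure: the cuboid vertices are already expressed in virtual camera coordinates (x to the right, y downward, z along the viewing direction), the camera is an ideal pinhole with focal length `focal_px` in pixels and the principal point at the image center, vertices behind the camera are simply ignored, and the object region is approximated by the axis-aligned rectangle enclosing the projected vertices. The function names are hypothetical.

```python
def project_to_image(point_cam, focal_px, width, height):
    """Perspective-project a camera-space point to pixel coordinates.

    Returns None for points at or behind the camera plane (z <= 0).
    """
    x, y, z = point_cam
    if z <= 0:
        return None
    return (focal_px * x / z + width / 2.0, focal_px * y / z + height / 2.0)

def object_area_ratio(box_vertices_cam, focal_px, width, height):
    """Approximate share of the image covered by the object's bounding cuboid."""
    projected = [
        p
        for p in (project_to_image(v, focal_px, width, height) for v in box_vertices_cam)
        if p is not None
    ]
    if not projected:
        return 0.0
    us = [p[0] for p in projected]
    vs = [p[1] for p in projected]
    # Clip the enclosing rectangle to the image bounds.
    left, right = max(min(us), 0.0), min(max(us), float(width))
    top, bottom = max(min(vs), 0.0), min(max(vs), float(height))
    if right <= left or bottom <= top:
        return 0.0
    return (right - left) * (bottom - top) / (width * height)
```

Comparing the returned ratio with a predetermined threshold then gives the area-based exclusion decision. Because the enclosing rectangle is at least as large as the true silhouette, this estimate tends to overstate the object's share of the image.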

Alternatively, the similarity calculation unit 107 may acquire the virtual viewpoint image from the virtual viewpoint image generation unit 103 and determine the proportion of the entire virtual viewpoint image occupied by the region representing the object by performing processing to identify the object in the virtual viewpoint image. In this case, the method of extracting the object is not limited. For example, as described above, a technique for separating a moving object from the background, or a method of learning the object in advance by machine learning and identifying it, may be used.

Alternatively, the similarity calculation unit 107 may determine that the viewpoints are similar when the normal imaging camera 120 is included in the angle of view (field of view) of the virtual camera 304, that is, in the virtual viewpoint image 901.

When such a determination is made, it is assumed that the image processing system includes a mechanism capable of acquiring the position of the normal imaging camera 120 and that the position can be acquired by that mechanism. One example of such a mechanism is to place a plurality of reflective markers in the shooting area in advance and have the camera calculate its own position information by imaging those markers. The acquired position of the normal imaging camera 120 can be associated with coordinates in the virtual space by, for example, calculating the correspondence from the relationship between the origin of the virtual space and the camera coordinates set at that origin. Further, by calculating the position of the normal imaging camera 120 in the virtual space and projecting that position onto the virtual viewpoint image 901, it can be determined whether the camera is included in the angle of view of the virtual viewpoint image 901.
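As a rough illustration of the last step above, the following sketch tests whether a point in the virtual space (for example, the position of the normal imaging camera 120) falls inside the angle of view of a virtual camera. It is a simplified sketch under assumptions not taken from the disclosure: the virtual camera is described by a position, a forward vector, and an up vector that is not parallel to the forward vector, the field of view is a symmetric pyramid given by a horizontal angle and an aspect ratio, and occlusion is ignored. All names are hypothetical.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _normalize(a):
    n = math.sqrt(_dot(a, a))
    return tuple(x / n for x in a)

def world_to_camera(point, cam_pos, forward, up):
    """Express a world-space point as (right, up, depth) in the camera frame."""
    f = _normalize(forward)
    r = _normalize(_cross(f, up))  # right-hand side of the camera
    u = _cross(r, f)               # up vector orthogonal to f and r
    d = _sub(point, cam_pos)
    return (_dot(d, r), _dot(d, u), _dot(d, f))

def is_in_view(point, cam_pos, forward, up, h_fov_deg, aspect):
    """True if the point lies inside the camera's viewing pyramid."""
    x, y, depth = world_to_camera(point, cam_pos, forward, up)
    if depth <= 0:
        return False  # behind the virtual camera
    tan_h = math.tan(math.radians(h_fov_deg) / 2.0)
    tan_v = tan_h / aspect
    return abs(x) <= depth * tan_h and abs(y) <= depth * tan_v
```

Restricting the test to a predetermined range within the angle of view could be approximated by passing a smaller `h_fov_deg` than the camera's actual field of view.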

Alternatively, the normal imaging camera 120, represented as a three-dimensional model as one of the objects in the virtual space, may be projected onto the angle of view of the virtual camera 304, and the determination may be made based on whether part of the normal imaging camera 120 is included in the virtual viewpoint image 901. As another option, object recognition processing may be performed on the virtual viewpoint image 901 itself as image data, and the virtual viewpoint image 901 may be determined to include the normal imaging camera 120 when the normal imaging camera 120 is identified. Alternatively, whether the normal imaging camera 120 is included may be determined, as described above, with respect to a predetermined range within the angle of view rather than the entire angle of view of the virtual viewpoint image 901.

When the normal imaging camera 120 is included in the angle of view of the virtual camera 304 in this way, the output control unit 108 instructs the image switching device 130 to output the video of the normal imaging camera 120 as the output control based on the similarity. Although it has been described that the similarity is determined to be high when the normal imaging camera 120 is included in the angle of view of the virtual camera 304, the present disclosure is not limited to this. That is, independently of the processing based on the similarity determination, processing may be performed to switch the output image to the captured image of the normal imaging camera 120 when the normal imaging camera 120 is included in the angle of view of the virtual camera 304.

The processing for switching the output image to the captured image of the normal imaging camera 120 when the normal imaging camera 120 is included in the angle of view of the virtual camera 304 will be described below with reference to FIG. 10. Processing steps similar to those in FIG. 4 are given the same reference numerals, and their description is omitted. In the following, the determination of whether the normal imaging camera 120 is included in the angle of view of the virtual camera 304 is assumed to be made by the similarity calculation unit 107, but it may instead be made by a processing unit separate from the similarity calculation unit 107.

In S1001, the similarity calculation unit 107 identifies the angle of view (field of view) of the virtual camera 304 based on viewpoint information representing the position and orientation of the virtual camera 304. The similarity calculation unit 107 also calculates the relationship between the position of the normal imaging camera 120, as represented by the viewpoint information of the normal imaging camera 120, and the angle of view of the virtual camera 304.

In S1002, the similarity calculation unit 107 determines, based on the relationship calculated in S1001, whether the position of the normal imaging camera 120 is included in the angle of view of the virtual camera 304. The determination may be made, for example, by determining whether the position coordinates of the normal imaging camera 120 are included in the angle of view of the virtual camera 304, or by determining whether the normal imaging camera 120 appears in the virtual viewpoint image corresponding to the virtual camera 304.

If it is included, the output control unit 108 sets the output control for the image switching device 130 to ON in S1003. If it is not included, the output control unit 108 sets the output control for the image switching device 130 to OFF in S1004. The processes in S1003 and S1004 are the same as those in S405 and S406, respectively.
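The branch between S1003 and S1004 can be pictured with the small sketch below. It is an assumption-laden illustration rather than the disclosed implementation: `ImageSwitcher` is a hypothetical stand-in for the image switching device 130, "output control ON" is assumed to mean that the captured image of the normal imaging camera is output regardless of the user's selection, and "OFF" is assumed to mean that the user's selection is followed.

```python
from dataclasses import dataclass

@dataclass
class ImageSwitcher:
    """Hypothetical stand-in for the image switching device 130."""
    user_selection: str = "virtual"  # image requested by the user: "virtual" or "normal"
    output_control_on: bool = False  # state set in S1003 (ON) or S1004 (OFF)

    def current_output(self) -> str:
        # With output control ON, the normal camera image overrides the user's choice.
        return "normal" if self.output_control_on else self.user_selection

def control_step(switcher: ImageSwitcher, camera_in_view: bool) -> str:
    """Apply the result of the S1002 determination and return the image that is output."""
    switcher.output_control_on = bool(camera_in_view)  # S1003 if True, S1004 if False
    return switcher.current_output()
```

Calling `control_step` once per processing cycle with the result of the in-view determination mirrors the repeated execution of the flowchart.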

Through the processing described above, when the normal imaging camera 120 is included in the angle of view of the virtual camera 304, the output image is switched to the captured image of the normal imaging camera 120, which suppresses the normal imaging camera 120 from appearing in the virtual viewpoint image. The processing described with reference to FIG. 10 may be further combined with the determination based on the similarity.

Furthermore, when switching to output the video of the normal imaging camera 120, the output control unit 108 may instruct the virtual viewpoint image generation unit 103 to move the virtual camera 304 to the position of the normal imaging camera 120. In that case, the output is switched to the video of the normal imaging camera 120 at the point when the virtual camera 304 reaches the position of the normal imaging camera 120.
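One way to picture this transition is the sketch below, which moves the virtual camera toward the normal imaging camera and switches the source on arrival. The disclosure does not specify how the virtual camera is moved; the straight-line interpolation over a fixed number of steps, the handling of position only (orientation is ignored), and the name `approach_and_switch` are all assumptions made for illustration.

```python
def approach_and_switch(start, target, steps):
    """Return (virtual_camera_position, output_source) for each step.

    The virtual viewpoint image is output while the virtual camera is moving;
    the normal camera image is output once the target position is reached.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    frames = []
    for i in range(1, steps + 1):
        if i == steps:
            # Final step: snap exactly onto the target and switch the output.
            frames.append((tuple(float(g) for g in target), "normal"))
        else:
            t = i / steps
            position = tuple(s + (g - s) * t for s, g in zip(start, target))
            frames.append((position, "virtual"))
    return frames
```

Moving the virtual camera before switching is intended to make the change of source less abrupt for the viewer, since the two viewpoints coincide at the moment of the switch.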

As described above, according to the present embodiment, control can be performed to output the image specified by the user while suppressing degradation in the quality of the output virtual viewpoint image.

<Other Embodiments>
In the embodiments described above, the image processing apparatus 100 and the image switching device 130 have been described as separate devices, but the functions of the image switching device 130 may be included in the image processing apparatus 100.

The present disclosure can also be realized by processing in which a program that implements one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

100 Image processing apparatus
106 Camera information acquisition unit
107 Similarity calculation unit
108 Output control unit

Claims

1. An information processing apparatus comprising:
acquisition means for acquiring first viewpoint information for specifying a virtual viewpoint corresponding to a virtual viewpoint image, and second viewpoint information representing a viewpoint of a second imaging device existing within an imaging range of a first imaging device used to generate the virtual viewpoint image;
output means for outputting the virtual viewpoint image or a captured image obtained by imaging by the second imaging device; and
control means for performing control such that, when the position of the second imaging device specified by the second viewpoint information is included in the field of view of the virtual viewpoint specified by the first viewpoint information, the captured image obtained by imaging by the second imaging device is output by the output means.

2. The information processing apparatus according to claim 1, further comprising calculation means for calculating a similarity between the viewpoints based on the first viewpoint information and the second viewpoint information,
wherein, when the similarity between the viewpoints is higher than a threshold, the control means performs control such that the captured image obtained by imaging by the second imaging device is output by the output means.

3. The information processing apparatus according to claim 1, further comprising:
image acquisition means for acquiring the virtual viewpoint image corresponding to the virtual viewpoint and the captured image obtained by imaging by the second imaging device; and
calculation means for calculating a similarity between the virtual viewpoint image and the captured image obtained by imaging by the second imaging device,
wherein, when the similarity between the images is higher than a threshold, the control means performs control such that the captured image obtained by imaging by the second imaging device is output by the output means.

4. The information processing apparatus according to claim 2, wherein
the acquisition means acquires, as the first viewpoint information, a first position that is the position of the virtual viewpoint and a first direction that is the line-of-sight direction from the virtual viewpoint, and acquires, as the second viewpoint information, a second position that is the position of the second imaging device and a second direction that is the direction in which the second imaging device faces, and
the calculation means calculates a distance from the first position to the second position and an angle representing the difference between the first direction and the second direction.

5. The information processing apparatus according to any one of claims 2 to 4, wherein the control means instructs the output means, which is capable of switching between and outputting either the captured image obtained by imaging by the second imaging device or the virtual viewpoint image in accordance with a user instruction, to perform control according to the similarity.

6. The information processing apparatus according to any one of claims 2 to 5, wherein, when there are a plurality of the second imaging devices, the control means determines whether the similarity between the viewpoint of each of the plurality of second imaging devices and the virtual viewpoint is higher than a threshold.

7. The information processing apparatus according to claim 6, wherein, when there are a plurality of the second imaging devices whose similarity is higher than the threshold, the control means performs control such that a captured image obtained by imaging by one of the plurality of second imaging devices is output.

8. The information processing apparatus according to any one of claims 2 to 7, further comprising receiving means for receiving a user instruction to switch the image output by the output means,
wherein, when the similarity between the virtual viewpoint and the viewpoint of the second imaging device is higher than a threshold, the control means instructs the output means to perform control such that, if the virtual viewpoint image is being output, the output is switched to the captured image obtained by imaging by the second imaging device, and such that, even if the user instructs that the output image be switched from the captured image obtained by imaging by the second imaging device to the virtual viewpoint image, the captured image obtained by imaging by the second imaging device is output without following the instruction.

9. The information processing apparatus according to any one of claims 2 to 8, wherein the control means performs control that is not based on the similarity when a predetermined condition is satisfied.

10. The information processing apparatus according to claim 9, wherein the imaging range includes an object, and the predetermined condition is satisfied when the relative speed of the object with respect to the virtual viewpoint is greater than a predetermined value.

11. The information processing apparatus according to claim 9, wherein the imaging range includes an object, and the predetermined condition is satisfied when the proportion of the virtual viewpoint image occupied by the object is smaller than a predetermined value.

12. The information processing apparatus according to any one of claims 1 to 11, wherein
the output means outputs the virtual viewpoint image or the captured image obtained by imaging by the second imaging device based on a user instruction, and
the control means performs control such that a predetermined notification is issued to the user when the position of the second imaging device specified by the second viewpoint information is included in the field of view of the virtual viewpoint specified by the first viewpoint information.

13. The information processing apparatus according to claim 12, wherein the predetermined notification is a notification including information indicating that the position of the second imaging device specified by the second viewpoint information is included in the field of view of the virtual viewpoint specified by the first viewpoint information.

14. The information processing apparatus according to any one of claims 1 to 13, wherein, when the virtual viewpoint image includes an image representing the second imaging device, the control means performs control such that the captured image obtained by imaging by the second imaging device is output by the output means.

15. An information processing method comprising:
an acquisition step of acquiring first viewpoint information for specifying a virtual viewpoint corresponding to a virtual viewpoint image, and second viewpoint information representing a viewpoint of a second imaging device existing within an imaging range of a first imaging device used to generate the virtual viewpoint image;
an output step of outputting the virtual viewpoint image or a captured image obtained by imaging by the second imaging device; and
a control step of performing control such that, when the position of the second imaging device specified by the second viewpoint information is included in the field of view of the virtual viewpoint specified by the first viewpoint information, the captured image obtained by imaging by the second imaging device is output in the output step.

16. A program for causing a computer to function as each means of the information processing apparatus according to any one of claims 1 to 14.