JP7277119B2

JP7277119B2 - Image processing system and image processing method

Info

Publication number: JP7277119B2
Application number: JP2018225717A
Authority: JP
Inventors: アディヤンムジビヤ; 幸太朗丸山; 達広奥野
Original assignee: 株式会社デジタルガレージ
Priority date: 2017-11-30
Filing date: 2018-11-30
Publication date: 2023-05-18
Anticipated expiration: 2038-11-30
Also published as: JP2019102091A

Description

The present invention relates to an image processing system and an image processing method, and more particularly to an image processing system and an image processing method used when distributing display images.

In a three-dimensional image display device called a head-mounted display (hereinafter, HMD), images are displayed independently for the user's left and right eyes, and the user perceives a three-dimensional virtual space image by viewing these independent images with both eyes simultaneously. The HMD is also provided with sensors that detect the movement of the user's head and line of sight. Services that use such HMDs to distribute images of sports and music events as virtual space images are attracting attention. As such virtual space image distribution services spread, embedding advertisement images in virtual space images can be expected to yield a high advertising effect. A technique for extracting the image of a specific virtual object from a virtual space image is also known (for example, Patent Document 1).

Patent Document 1: JP 2010-262601 A

Patent Document 1 discloses a technique that calculates a feature amount from an image, compares the calculated feature amount with a previously stored feature amount of a predetermined object, and determines whether the object drawn in the image is the predetermined object. With such a technique, an advertisement image can be inserted, or an image that effectively stages or emphasizes the scene can be added, according to the extracted virtual object, which increases the added value of the virtual space image. However, as the feature amount of the image grows, the processing load for adding the image increases.
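The feature comparison attributed to Patent Document 1 can be pictured with a minimal sketch. The cosine-similarity metric, the 0.9 threshold, and the function names below are assumptions chosen for illustration; the text only says that a calculated feature amount is compared with a stored one.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector matches nothing
    return dot / (norm_a * norm_b)


def is_predetermined_object(features: Sequence[float],
                            reference: Sequence[float],
                            threshold: float = 0.9) -> bool:
    """Hypothetical decision rule: match when similarity reaches the threshold."""
    return cosine_similarity(features, reference) >= threshold
```

Each comparison touches every element of the feature vector, so the cost grows with the size of the feature amount, which is the processing-load concern raised above.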

In view of the above problem, an object of the present invention is to provide an image processing system and an image processing method capable of inserting an additional image at an appropriate position in a display image without increasing the processing load for adding the image.

To solve the above problem, an image processing system according to one aspect of the present invention includes: a pattern recognition unit that recognizes a predetermined pattern included in a virtual space image and that, when the virtual space image is deformed in response to a change in the user's line-of-sight direction, deforms the predetermined pattern based on the amount of deformation of the virtual space image; an extraction unit that, based on the predetermined pattern deformed by the pattern recognition unit, extracts from the virtual space image a display area that has a shape corresponding to the deformed predetermined pattern and in which an additional image to be added to the virtual space image is displayed; a conversion unit that converts the shape of the additional image, in accordance with the display area, so as to correspond to the shape of the deformed predetermined pattern; and a display setting unit that displays the additional image converted by the conversion unit in the display area extracted by the extraction unit.

An image processing system according to another aspect of the present invention includes: a pattern recognition unit that recognizes a predetermined pattern included in a display image and that, when the display image is deformed in response to a change in the user's line-of-sight direction, deforms the predetermined pattern based on the amount of deformation of the display image; an extraction unit that, based on the predetermined pattern deformed by the pattern recognition unit, extracts from the display image a display area that has a shape corresponding to the deformed predetermined pattern and in which an additional image to be added to the display image is displayed; and a display setting unit that makes a setting to display, in the display area extracted by the extraction unit, the additional image deformed so as to correspond to the shape of the deformed predetermined pattern.

An image processing method according to one aspect of the present invention is an image processing method in an image processing system, in which: a pattern recognition unit recognizes a predetermined pattern included in a virtual space image and, when the virtual space image is deformed in response to a change in the user's line-of-sight direction, deforms the predetermined pattern based on the amount of deformation of the virtual space image; an extraction unit, based on the predetermined pattern deformed by the pattern recognition unit, extracts from the virtual space image a display area that has a shape corresponding to the deformed predetermined pattern and in which an additional image to be added to the virtual space image is displayed; a conversion unit converts the shape of the additional image, in accordance with the display area, so as to correspond to the shape of the deformed predetermined pattern; and a display setting unit displays the additional image converted by the conversion unit in the display area extracted by the extraction unit.

An image processing method according to another aspect of the present invention is an image processing method in an image processing system, in which: a pattern recognition unit recognizes a predetermined pattern included in a display image and, when the display image is deformed in response to a change in the user's line-of-sight direction, deforms the predetermined pattern based on the amount of deformation of the display image; an extraction unit, based on the predetermined pattern deformed by the pattern recognition unit, extracts from the display image a display area that has a shape corresponding to the deformed predetermined pattern and in which an additional image to be added to the display image is displayed; and a display setting unit makes a setting to display, in the display area extracted by the extraction unit, the additional image deformed so as to correspond to the shape of the deformed predetermined pattern.

According to the present invention, a display area for displaying an additional image to be added to a display image is extracted based on a predetermined pattern recognized by pattern recognition, so that the content of the additional image can be displayed effectively with a smaller processing load.
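The flow through the pattern recognition, extraction, conversion, and display setting units can be sketched as follows. This is only an illustration under stated assumptions: the deformation of the image is modeled as a 3x3 projective matrix (homography), the predetermined pattern is an axis-aligned rectangle, and every function name is hypothetical rather than taken from the specification.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Matrix3 = List[List[float]]
Rect = Tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed pattern shape


def apply_homography(h: Matrix3, p: Point) -> Point:
    """Map a 2D point through a 3x3 projective matrix."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def matmul3(a: Matrix3, b: Matrix3) -> Matrix3:
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rect_corners(r: Rect) -> List[Point]:
    x0, y0, x1, y1 = r
    return [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]


def deform_pattern(pattern: Rect, deformation: Matrix3) -> List[Point]:
    """Pattern recognition unit: deform the pattern by the image deformation."""
    return [apply_homography(deformation, c) for c in rect_corners(pattern)]


def extract_display_area(deformed_pattern: List[Point]) -> List[Point]:
    """Extraction unit: here the display area is simply the deformed quadrilateral."""
    return list(deformed_pattern)


def conversion_matrix(image_size: Tuple[int, int], pattern: Rect,
                      deformation: Matrix3) -> Matrix3:
    """Conversion unit: scale the additional image onto the pattern, then deform it."""
    w, h = image_size
    x0, y0, x1, y1 = pattern
    fit = [[(x1 - x0) / w, 0.0, x0],
           [0.0, (y1 - y0) / h, y0],
           [0.0, 0.0, 1.0]]
    return matmul3(deformation, fit)


def place_additional_image(image_size: Tuple[int, int], pattern: Rect,
                           deformation: Matrix3) -> Tuple[List[Point], List[Point]]:
    """Display setting unit: return the display area and the converted image corners."""
    area = extract_display_area(deform_pattern(pattern, deformation))
    m = conversion_matrix(image_size, pattern, deformation)
    w, h = image_size
    corners = [apply_homography(m, c)
               for c in rect_corners((0.0, 0.0, float(w), float(h)))]
    return area, corners
```

Because the same deformation matrix is reused for the pattern and the additional image, the corners of the converted image coincide with the corners of the display area, without recomputing image features after every change of view.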

FIG. 1 is a diagram showing an overview of the image processing system 1 of the first embodiment.
FIG. 2 is a block diagram showing an outline of the virtual space image distribution server 10 or the additional image distribution server 11 according to the first embodiment.
FIG. 3 is a block diagram showing an outline of the user terminal device 12 of the first embodiment.
FIG. 4 is a block diagram showing an outline of the three-dimensional image display device 13 of the first embodiment.
FIG. 5 is an explanatory diagram showing an overview of the operation of the image processing system 1 of the first embodiment.
FIG. 6 is an explanatory diagram showing an outline of the processing for inserting additional content in the image processing system 1 of the first embodiment.
FIG. 7 is an explanatory diagram of area setting for additional content in the first embodiment.
FIG. 8 is an explanatory diagram of area setting for additional content in the first embodiment.
FIG. 9 is an explanatory diagram of area setting for additional content in the first embodiment.
FIG. 10 is an explanatory diagram of projective transformation.
FIG. 11 is an explanatory diagram of area setting for additional content in the first embodiment.
FIG. 12 is an explanatory diagram showing the images of additional content added to each viewing room in the first embodiment.
FIG. 13 is a functional block diagram for explaining the processing of additional images in the image processing system 1 of the first embodiment.
FIG. 14 is a flowchart for explaining the processing for adding additional content in the image processing system 1 of the first embodiment.
FIG. 15 is a flowchart for explaining the processing for adding additional content in the image processing system 1 of the second embodiment.

Embodiments are described below with reference to the drawings.
<First Embodiment>
FIG. 1 is a diagram showing an overview of the image processing system 1 of the first embodiment. As shown in FIG. 1, the image processing system 1 according to the first embodiment includes a virtual space image distribution server 10, an additional image distribution server 11, a user terminal device 12, and a three-dimensional image display device 13. The virtual space image distribution server 10, the additional image distribution server 11, and the user terminal device 12 are connected via a network 15.

The virtual space image distribution server 10 is a server that distributes the content of virtual space images, which are one kind of display image. The virtual space image distribution server 10 stores 360-degree omnidirectional three-dimensional images as virtual space images, and transmits a virtual space image to the user terminal device 12 in response to a request from the user terminal device 12. A display image is a moving image (also called video) formed by displaying a plurality of still images in succession. Although the virtual space image distribution server 10 distributes virtual space image content here, the distribution target is not limited to virtual space image content and may be any displayable display image content, for example two-dimensional image content.

The additional image distribution server 11 is a server that distributes images of additional content (also called additional images), such as advertisement images, to be added to the main content. The additional image distribution server 11 stores images of additional content such as advertisement images.

In this example, only one virtual space image distribution server 10 and one additional image distribution server 11 are shown on the network 15, but there need not be only one of each; each may be distributed across multiple servers. The virtual space image distribution server 10 and the additional image distribution server 11 can also be built on the same physical server.

The user terminal device 12 is a terminal held by the user 14. A PC (Personal Computer), a tablet terminal, a smartphone, a game terminal, or the like can be used as the user terminal device 12, or a dedicated terminal may be prepared. The three-dimensional image display device 13 is connected to the user terminal device 12. An application program for reproducing a virtual space image from virtual space image data is installed on the user terminal device 12. Here, "held by the user 14" means that the terminal is in a state in which the user 14 can use it.

The three-dimensional image display device 13 displays display images independently for the left and right eyes of the user 14; by viewing these independent display images with both eyes simultaneously, the user can perceive a three-dimensional virtual space image. A so-called HMD worn on the head of the user 14 is used as the three-dimensional image display device 13. The three-dimensional image display device 13 is provided with a display unit for each of the left and right eyes. It is also provided with a sensor that detects the movement of the head of the user 14 and an imaging unit, such as a camera, that detects the line of sight of the user 14.

The three-dimensional image display device 13 is not limited to an HMD. Other types of three-dimensional display device, such as a glasses-type display device or an autostereoscopic display, may be used. The user terminal device 12 and the three-dimensional image display device 13 may be connected by wire or wirelessly, for example by Bluetooth (registered trademark), and the two may also be built as a single unit. Although only one user terminal device 12 and one three-dimensional image display device 13 are shown in this example, there may be a plurality of each.

The network 15 is an IP (Internet Protocol) network such as the Internet or an intranet. Possible physical-layer and data-link-layer configurations include a LAN (Local Area Network), a wireless LAN, a CATV (Community Antenna TeleVision) communication network, a virtual private network (VPN), a telephone line network, a mobile communication network, and a satellite communication network.

FIG. 2 is a block diagram showing an outline of the virtual space image distribution server 10 and the additional image distribution server 11 of the first embodiment. As shown in FIG. 2, each of the virtual space image distribution server 10 and the additional image distribution server 11 includes a processor 101, a memory 102, a database 103, and a communication interface 104.

The processor 101 is a CPU (Central Processing Unit) and executes various processes based on programs. The memory 102 consists of a ROM (Read Only Memory), a RAM (Random Access Memory), a storage device, and the like. The database 103 is built on a large-capacity storage device such as an HDD (Hard Disk Drive). The communication interface 104 is an interface for sending and receiving data via the network 15.

The virtual space image distribution server 10 and the additional image distribution server 11 have the same basic configuration. They differ in that the database 103 of the virtual space image distribution server 10 stores the virtual space image data serving as the main content, whereas the database 103 of the additional image distribution server 11 stores images of additional content such as advertisement images.

FIG. 3 is a block diagram showing an outline of the user terminal device 12 of the first embodiment. As shown in FIG. 3, the user terminal device 12 includes a processor 201, a memory 202, a storage device 203, a communication interface 204, and an input/output interface 205.

The processor 201 is a CPU and executes various processes based on programs. The memory 202 consists of a ROM, a RAM, and the like, and is used in executing various processes according to programs. The storage device 203 consists of an HDD, a flash memory drive, an optical disk drive, or the like, and stores various programs and data. The communication interface 204 is an interface for sending and receiving data via the network 15. A wired LAN (Ethernet (registered trademark)), a wireless LAN (IEEE802.11), or the like is used as the communication interface 204. The communication interface 204 may also conform to a mobile communication network standard such as WCDMA (registered trademark) (Wideband Code Division Multiple Access) or LTE (Long Term Evolution). The input/output interface 205 is an interface for connecting various devices to the user terminal device 12. For example, USB (Universal Serial Bus), IEEE1394, or HDMI (registered trademark) (High-Definition Multimedia Interface) is used as the input/output interface 205. A wireless interface such as Bluetooth (registered trademark) or IrDA (Infrared Data Association) may also be used. The three-dimensional image display device 13 is connected to the input/output interface 205.

FIG. 4 is a block diagram showing an outline of the three-dimensional image display device 13 of the first embodiment. As shown in FIG. 4, the three-dimensional image display device 13 includes an input/output interface 301, an image processing unit 302, displays 303 (303a and 303b) serving as display units, a processor 305, a memory 306, an operation input unit 307, line-of-sight detection cameras 308 (308a and 308b) serving as imaging units, and a posture sensor 309 serving as a sensor.

The input/output interface 301 is the interface connected to the user terminal device 12. For example, USB, IEEE1394, or HDMI is used as the input/output interface 301. A wireless interface such as Bluetooth or IrDA may also be used.

Under the control of the processor 305, the image processing unit 302 generates, from the virtual space image data input through the input/output interface 301, a video signal of a left-eye display image and a video signal of a right-eye display image. The image processing unit 302 also performs synchronization control to synchronize the timing at which the left-eye and right-eye display images are output to the displays 303.

The displays 303a and 303b show the left-eye display image and the right-eye display image, respectively, based on the corresponding video signals from the image processing unit 302. An LCD (Liquid Crystal Display) or an organic EL (Electro Luminescence) display is used for the displays 303a and 303b. The displays 303a and 303b also have shutters that alternately block the left and right fields of view based on a synchronization control signal with which the image processing unit 302 synchronizes the output timing of the left-eye and right-eye images.

The processor 305 controls the entire device based on a program. Various processing data are read from and written to the memory 306 based on the program. The operation input unit 307 consists of various buttons and is used to make various operation settings. A remote controller may be used as the operation input unit 307.

The line-of-sight detection cameras 308a and 308b capture eye images, that is, images of the left and right eyes of the user 14, in order to detect the viewpoint of the user 14, or both the viewpoint and the line of sight. The first embodiment uses viewpoint detection cameras that can detect the viewpoint of the user 14 by eye tracking with a non-contact measurement method. In this non-contact method, the viewpoint is detected by imaging the user's eyes with the line-of-sight detection cameras 308 while weak infrared light is directed at the eyes of the user 14. The first embodiment thus illustrates the case in which the line-of-sight detection cameras 308 detect the viewpoint by infrared imaging with a non-contact measurement method. However, the cameras are not limited to this: they only need to be able to detect at least the viewpoint of the user 14, and the imaging method does not matter. The line-of-sight detection cameras 308 may use any measurement method, including a contact type, and need not use infrared light.

The posture sensor 309 detects the movement of the head of the user 14 in order to detect the posture of the user 14. For example, a three-axis gyro sensor or a three-axis acceleration sensor is used as the posture sensor 309. In this embodiment, the movement of the head of the user 14 is detected because an HMD is used as the three-dimensional image display device 13. However, the device form is not limited to this: any three-dimensional image display device 13 will do as long as it can detect eye information (viewpoint information and line-of-sight information) and tilt information of the user 14.

Next, an overview of the operation of the image processing system 1 according to the first embodiment is described. FIG. 5 is an explanatory diagram showing an overview of the operation of the image processing system 1 of the first embodiment. The following description covers the case in which live video of a concert is distributed as the main content, in the form of a virtual space image serving as the display image. Although the main content here is live video of a concert, it may be live video of a sporting event or of other entertainment. The main content may also be a recording of live video. Recorded video can be edited after recording. Compared with recorded video, live video is raw, real-time footage, so unexpected incidents are more likely. However, live video keeps promotions fresh, so a strong advertising effect can be expected. The first embodiment is suited to live video.

When live video of a concert is distributed as a virtual space image as the main content, the live video 401 of the concert is shot with, for example, a 360-degree omnidirectional camera 402, and a content image of the virtual space image of the live venue is generated from this footage. This content image is stored in the virtual space image distribution server 10.

The user 14 wears the three-dimensional image display device 13 on the head and accesses the virtual space image distribution server 10 from the user terminal device 12. Based on a user operation, a request for the live video of the concert is transmitted from the user terminal device 12 to the virtual space image distribution server 10. On receiving the request, the virtual space image distribution server 10 reads out the virtual space image of the 360-degree omnidirectional live video of the concert stored in the virtual space image distribution server 10.

The user terminal device 12 reproduces the received three-dimensional image 403 of the main content and sends it to the three-dimensional image display device 13. The three-dimensional image display device 13 displays the three-dimensional image 403 of the live video of the concert according to the field of view (viewing area) of the user 14. When the user 14 changes the viewing angle by moving the head or the line of sight, the three-dimensional image 403 displayed on the three-dimensional image display device 13 changes according to the field of view. The user 14 can thus enjoy the live video as if watching it at the concert venue. The viewing angle here means the viewing area made up of the central vision area and the peripheral vision area (the same applies hereinafter).
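How the displayed portion of a 360-degree image might follow the head direction can be sketched as below. The sketch assumes that the omnidirectional frame is stored in an equirectangular layout and that the posture sensor yields yaw and pitch angles in degrees; neither assumption comes from the specification, and the function names are hypothetical.

```python
from typing import Tuple


def view_center_pixel(yaw_deg: float, pitch_deg: float,
                      width: int, height: int) -> Tuple[float, float]:
    """Pixel at the center of the view in an equirectangular frame.

    Yaw 0 / pitch 0 is assumed to map to the center of the frame.
    """
    pitch = max(-90.0, min(90.0, pitch_deg))         # clamp to the valid range
    u = ((yaw_deg / 360.0 + 0.5) * width) % width    # wrap around horizontally
    v = min((0.5 - pitch / 180.0) * height, height - 1)
    return u, v


def horizontal_span(yaw_deg: float, hfov_deg: float,
                    width: int) -> Tuple[float, float]:
    """Left and right column of the visible region for a horizontal field of view.

    The left value is larger than the right value when the region crosses the seam.
    """
    left = (((yaw_deg - hfov_deg / 2.0) / 360.0 + 0.5) * width) % width
    right = (((yaw_deg + hfov_deg / 2.0) / 360.0 + 0.5) * width) % width
    return left, right
```

Turning the head changes only the yaw and pitch inputs, so the same stored frame yields a different visible region without any change to the distributed content.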

The user 14 enjoys the live video by viewing the main content image (three-dimensional image 403) on the three-dimensional image display device 13. Watching a live performance through the image processing system 1 has the following advantages over attending in person.

For example, as shown in FIG. 5, changing the video angle lets the user watch the performance from a plurality of viewing positions, so the user can easily watch from a preferred shooting angle. In particular, it enables viewing that is impossible on site even for those who attend in person, such as viewing from the air, where there are no audience seats or audience areas. In the example shown in FIG. 5, images 404a, 404b, 404c, and 404d, shot from four different viewing positions, are provided for one main content. In this way, a plurality of different images are provided for one main content based on preset viewing conditions.

第１の実施の形態では、主コンテンツに対して複数の異なる画像のことを、それぞれ視聴ルームと定義し、図５では、４つの異なる撮影アングルの主コンテンツの画像４０４ａ、４０４ｂ、４０４ｃ、４０４ｄを、視聴ルーム４０４ａ、４０４ｂ、４０４ｃ、４０４ｄとする。なお、ここでは、視聴条件として、異なる撮影アングルを条件としているが、これに限定されるものではなく、ルーム属性情報（下記参照）やユーザ属性情報（下記参照）等に基づく付加価値を視聴条件としてもよい。 In the first embodiment, each of the plurality of different images provided for the main content is defined as a viewing room; in FIG. 5, the main content images 404a, 404b, 404c, and 404d taken from four different shooting angles are referred to as viewing rooms 404a, 404b, 404c, and 404d. Here, different shooting angles are used as the viewing condition, but the viewing condition is not limited to this, and added value based on room attribute information (see below), user attribute information (see below), or the like may be used as the viewing condition.

例えば、コンサートのライブ映像では、撮影アングルの異なるライブ映像の視聴ルーム４０４ａ、４０４ｂ、４０４ｃ、４０４ｄに限定されずに、スタンダードルーム、スポンサードルーム、デコレートルーム、ファンルーム等、同一対象のライブ映像（主コンテンツの三次元画像４０３）で視点や装飾を変えた幾つかの視聴ルーム４０４ａ、４０４ｂ、４０４ｃ、４０４ｄが用意（設定）されてもよい。この場合、ユーザ１４は、視聴時の選択、ルーム属性情報、各自の嗜好に応じたユーザ属性情報等に応じて、視聴ルーム４０４ａ、４０４ｂ、４０４ｃ、４０４ｄを切り替えることができる。また、ユーザ１４が制作したユーザ１４の視聴ルームで視聴することもできる。ユーザ属性情報は、ユーザ１４が予め設定登録したユーザ１４の嗜好情報のことをいう。ユーザ属性情報は、予め設定した複数のユーザ１４に関する属性の属性項目の情報である。ユーザ属性情報は、属性項目に対してユーザ１４が自ら個別入力した情報であってもよい。ユーザ属性情報は、ユーザ自身がユーザ端末装置２を用いて入力し、入力したユーザ１４の行動（入力情報）を、仮想空間画像配信サーバ１０、付加画像配信サーバ１１、ユーザ端末装置２、三次元画像表示デバイス１３の少なくとも１つに記録し、記録したユーザ１４の入力情報に基づいて機械学習を行い、その結果、最適とされるユーザ行動情報であってもよい。例えば、ユーザ属性情報として、年齢、性別、出身地、現在住所などの個人情報がある。また、ユーザ属性情報として、好きなアーティスト、好きな音楽のジャンル、ライブへの視聴回数、実際のライブへの参加回数、ファン歴、会員登録しているアーティスト名などの音楽に関する音楽情報がある。また、ユーザ属性情報として、アウトドアの趣味、インドアの趣味、グルメの趣味、ファッションの趣味など、個人嗜好に関する嗜好情報がある。 For example, for live video of a concert, the viewing rooms are not limited to viewing rooms 404a, 404b, 404c, and 404d with different shooting angles; several viewing rooms 404a, 404b, 404c, and 404d, such as a standard room, a sponsored room, a decorated room, and a fan room, may be prepared (set) with different viewpoints and decorations for live video of the same subject (the 3D image 403 of the main content). In this case, the user 14 can switch between the viewing rooms 404a, 404b, 404c, and 404d according to a selection made at the time of viewing, room attribute information, user attribute information reflecting the user's own preferences, and the like. The user 14 can also view the content in a viewing room that the user 14 has created. The user attribute information refers to preference information of the user 14 that the user 14 has set and registered in advance. The user attribute information is information on a plurality of preset attribute items relating to the user 14. The user attribute information may be information that the user 14 has individually entered for the attribute items. The user attribute information may also be user behavior information judged optimal as a result of machine learning: the user enters information using the user terminal device 2, the behavior (input information) of the user 14 is recorded in at least one of the virtual space image distribution server 10, the additional image distribution server 11, the user terminal device 2, and the three-dimensional image display device 13, and machine learning is performed on the recorded input information of the user 14. For example, the user attribute information includes personal information such as age, gender, hometown, and current address. The user attribute information also includes music-related information such as favorite artists, favorite music genres, the number of live performances viewed, the number of live performances actually attended, length of time as a fan, and the names of artists with whom the user is registered as a member. The user attribute information further includes preference information on personal tastes, such as outdoor hobbies, indoor hobbies, gourmet interests, and fashion interests.

また、主コンテンツの画像５０１（図６参照）には、広告画像のような付加コンテンツの画像を挿入できる。図６は、第１の実施の形態の画像処理システム１における付加コンテンツを挿入する処理の概要を示す説明図である。前述したように、付加画像配信サーバ１１には、広告画像等の付加コンテンツの画像５１１（５１１ａ、５１１ｂ）が蓄積されている。付加コンテンツの表示領域枠５０２（５０２ａ、５０２ｂ）は、主コンテンツの画像５０１の特徴の抽出により設定される。そして、付加画像配信サーバ１１は、付加コンテンツの画像５１１をユーザ端末装置１２に配信する。これにより、図６に示すように、主コンテンツの画像５０１に、広告画像等の付加コンテンツの画像５１１ａ及び５１１ｂが付加される。 An additional content image such as an advertisement image can also be inserted into the main content image 501 (see FIG. 6). FIG. 6 is an explanatory diagram showing an outline of the processing for inserting additional content in the image processing system 1 according to the first embodiment. As described above, the additional image distribution server 11 stores additional content images 511 (511a and 511b) such as advertisement images. The additional content display area frames 502 (502a, 502b) are set by extracting features of the main content image 501. The additional image distribution server 11 then distributes the additional content image 511 to the user terminal device 12. As a result, as shown in FIG. 6, the additional content images 511a and 511b, such as advertisement images, are added to the main content image 501.

このように、第１の実施の形態に係る画像処理システム１では、広告画像や、仮想空間画像のコンテンツを強調させる画像等を付加コンテンツの画像５１１ａ及び５１１ｂとして主コンテンツの画像５０１上に付加して表示できる。主コンテンツの画像５０１に対してこのような付加コンテンツの画像５１１ａ及び５１１ｂを付加することで、ユーザ１４に多様な情報を提供することができ、また、収益の増大を図ることができる。 As described above, in the image processing system 1 according to the first embodiment, an advertisement image, an image that emphasizes the content of the virtual space image, or the like can be added to and displayed on the main content image 501 as the additional content images 511a and 511b. Adding such additional content images 511a and 511b to the main content image 501 makes it possible to provide a variety of information to the user 14 and to increase revenue.

次に、第１の実施の形態に係る画像処理システム１における付加コンテンツの挿入について詳述する。付加コンテンツの画像５１１は、主コンテンツに対して自然に合致するようにするとともに、ユーザ１４の注目度が高く、広告効果が高い場所に付加することが望まれる。また、仮想空間画像では、ユーザ１４の視野角は、動的に変化する。このため、ユーザ１４の視野角によって、最適な付加コンテンツの位置や形状は変化する。そこで、第１の実施の形態では、付加コンテンツの領域を以下のような事項に基づいて決定している。なお、以下の事項では数字（１）～（４）を付けて説明しているが、付加コンテンツの領域の決定の説明をし易くするために列挙しているだけである。 Next, insertion of additional content in the image processing system 1 according to the first embodiment will be described in detail. The image 511 of the additional content should naturally match the main content, and should be added to a place where the degree of attention of the user 14 is high and the advertising effect is high. Also, in the virtual space image, the viewing angle of the user 14 changes dynamically. Therefore, the optimum position and shape of additional content change depending on the viewing angle of the user 14 . Therefore, in the first embodiment, the additional content area is determined based on the following matters. Note that although numbers (1) to (4) are used in the following description, they are listed only for the purpose of facilitating the description of determination of additional content areas.

（１）表示中の表示画像に含まれる３６０度全天球空間内にある所望のパターンをパターン認識し、このパターン認識に基づいて、付加コンテンツの表示領域枠５０２を設定する。 (1) A desired pattern within the 360-degree omnidirectional space included in the display image being displayed is recognized, and the additional content display area frame 502 is set based on this pattern recognition.
（２）認識したパターンに対して、視線方向に応じて射影変換マトリクスを計算し、付加コンテンツの表示領域枠５０２の形状を変換する。 (2) For the recognized pattern, a projective transformation matrix is calculated according to the line-of-sight direction, and the shape of the additional content display area frame 502 is transformed.
（３）変換した表示領域枠がユーザ１４にとって見にくい表示となる場合には、付加コンテンツを表示しない。 (3) If the transformed display area frame would be difficult for the user 14 to view, the additional content is not displayed.
（４）ユーザ１４の視線滞留時間を判定し、ユーザ１４の視線滞留時間が一定時間以上の場合のみ、付加コンテンツを表示する。 (4) The line-of-sight dwell time of the user 14 is determined, and the additional content is displayed only when that dwell time is equal to or longer than a certain period of time.

以下、（１）～（４）に示した事項について説明する。まず、事項（１）のパターン認識による付加コンテンツの領域設定について説明する。図７～図１１は、第１の実施の形態における付加コンテンツの領域設定の説明図である。図７～図１１において、破線Ｘは、主コンテンツの映像の３６０度全天球を示している。 Items (1) to (4) are described below. First, the setting of the additional content area by pattern recognition in item (1) will be described. FIGS. 7 to 11 are explanatory diagrams of additional content area setting according to the first embodiment. In FIGS. 7 to 11, the broken line X indicates the 360-degree omnidirectional sphere of the main content video.

図７は、ユーザ１４が正面を向いているときの画像を示している。ユーザ１４が正面を向いているときには、図７に示すように、主コンテンツの画像５０１が映し出される。この場合、画像処理システム１によれば、主コンテンツの画像５０１の中で例えば長方形のパターンをパターン認識し、認識されたパターンの部分に付加コンテンツの表示領域枠５０２ａ及び５０２ｂを設定すると、自然な感覚で主コンテンツに付加コンテンツの画像５１１ａ及び５１１ｂを挿入できる。 FIG. 7 shows the image when the user 14 is facing forward. When the user 14 faces forward, the main content image 501 is displayed as shown in FIG. 7. In this case, the image processing system 1 recognizes, for example, rectangular patterns in the main content image 501 and sets the additional content display area frames 502a and 502b at the recognized pattern portions, so that the additional content images 511a and 511b can be inserted into the main content in a way that feels natural.

つまり、広告画像のような付加コンテンツの形状は、長方形であることが多い。このことから、画像処理システム１によれば、主コンテンツの画像５０１の中で長方形の形状をパターン認識し、この長方形の形状の部分に長方形の形状の付加コンテンツの画像５１１を挿入すれば、自然な感覚で付加コンテンツを挿入できる。同様に、付加コンテンツの形状が円形なら、主コンテンツの画像５０１の中で円形の形状をパターン認識し、この円形の形状の部分に円形の形状の付加コンテンツの画像５１１を挿入すれば、自然な感覚で付加コンテンツを挿入できる。また、付加コンテンツの全体の色が赤色なら、主コンテンツの画像５０１の中で、赤色の部分をパターン認識し、この赤色の部分に赤色の付加コンテンツの画像５１１を挿入すれば、自然な感覚で付加コンテンツを挿入できる。更に、特定の目印（マーク）をパターン認識しても良い。また、予め学習したパターンでも良い。このように、主コンテンツの画像５０１中の所望のパターンをパターン認識し、このパターン認識に基づいて、付加コンテンツの表示領域枠５０２を設定すれば、主コンテンツの画像５０１に溶け込んだ自然な感覚で付加コンテンツを挿入できる。 In other words, additional content such as an advertisement image is often rectangular. Therefore, if the image processing system 1 recognizes a rectangular shape in the main content image 501 and inserts the rectangular additional content image 511 into that rectangular portion, the additional content can be inserted in a way that feels natural. Similarly, if the additional content is circular, recognizing a circular shape in the main content image 501 and inserting the circular additional content image 511 into that circular portion allows the additional content to be inserted naturally. If the overall color of the additional content is red, recognizing a red portion in the main content image 501 and inserting the red additional content image 511 into that red portion likewise allows natural insertion. Furthermore, a specific mark may be recognized as the pattern, or a pattern learned in advance may be used. In this way, by recognizing a desired pattern in the main content image 501 and setting the additional content display area frame 502 based on this pattern recognition, the additional content can be inserted so that it blends naturally into the main content image 501.

なお、従来技術のような物体認識システムのみによって物体を特定する場合、特定しようとする物体の種類が多いと、全ての物体ごとの特徴量を予め取得しておく必要があり、物体認識の処理に負荷がかかる。また撮像対象の特徴量を全ての種類の物体の特徴量と比較する処理負荷がかかる。この例では、例えば長方形のように、特定の特徴量のみでパターン認識が行える。このため、より少ない処理負荷で、パターン認識が行える。 When an object is specified only by an object recognition system like the conventional technology, if there are many types of objects to be specified, it is necessary to acquire the feature amount for each object in advance, and the object recognition process is loaded. In addition, a processing load is required to compare the feature amount of the object to be imaged with the feature amounts of all types of objects. In this example, pattern recognition can be performed using only specific feature amounts, such as rectangles. Therefore, pattern recognition can be performed with less processing load.
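To illustrate why testing a single geometric feature is inexpensive, the following minimal sketch checks whether four ordered corner points form an approximate rectangle by looking only at the interior angles. The function name `is_rectangle_like` and the tolerance `angle_tol_deg` are hypothetical; the description does not specify a particular recognition algorithm, so this is an assumption-laden example rather than the disclosed method.

```python
import math


def is_rectangle_like(corners, angle_tol_deg=10.0):
    """Return True if the 4 ordered corners have interior angles near 90 degrees.

    `corners` is a list of (x, y) tuples; `angle_tol_deg` is an assumed tolerance.
    """
    if len(corners) != 4:
        return False
    for i in range(4):
        px, py = corners[i - 1]           # previous corner
        cx, cy = corners[i]               # current corner
        nx, ny = corners[(i + 1) % 4]     # next corner
        v1 = (px - cx, py - cy)
        v2 = (nx - cx, ny - cy)
        n1 = math.hypot(*v1)
        n2 = math.hypot(*v2)
        if n1 == 0 or n2 == 0:
            return False                  # repeated point: not a quadrilateral
        cos_a = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
        cos_a = max(-1.0, min(1.0, cos_a))  # guard against rounding
        angle = math.degrees(math.acos(cos_a))
        if abs(angle - 90.0) > angle_tol_deg:
            return False
    return True
```

Only four angle computations are needed per candidate, in contrast to comparing against stored feature sets for every object type.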

次に、事項（２）について説明する。前述の図７では、主コンテンツの画像５０１は、ユーザ１４の視線が正面にあるときの仮想空間画像である。つまり、主コンテンツの画像５０１は、ユーザ１４が、仮想で主コンテンツの画像５０１に対して向かって正面に居る状態で、主コンテンツの画像５０１を視たときの仮想空間画像である。このため、主コンテンツ５０１の画像においてパターン認識した長方形の形状は、そのまま長方形の形状となる。つまり、主コンテンツ５０１の画像は視覚的な効果が生じずにパターン認識した長方形の形状のままである。このことから、画像処理システム１は、図７に示したように、主コンテンツ５０１以外の長方形の形状のパターンを認識し、認識されたパターンの部分に、付加コンテンツの表示領域枠５０２ａ及び５０２ｂを設定している。 Next, item (2) will be described. In FIG. 7 described above, the main content image 501 is the virtual space image seen when the line of sight of the user 14 is directed straight ahead. That is, it is the virtual space image seen when the user 14 views the main content image 501 while virtually positioned directly in front of it. Therefore, a rectangular shape recognized in the image of the main content 501 remains rectangular as it is. In other words, no perspective effect arises and the image of the main content 501 keeps the recognized rectangular shape. Accordingly, as shown in FIG. 7, the image processing system 1 recognizes rectangular patterns other than the main content 501 and sets the additional content display area frames 502a and 502b at the recognized pattern portions.

これに対して、図８の例では、主コンテンツの画像５０１は、ユーザが、仮想で主コンテンツの画像５０１に対して向かって正面右側に居る状態で、主コンテンツの画像５０１を視たときの仮想空間画像である。ユーザ１４は、仮想で主コンテンツの画像５０１に対して向かって正面右側に居る状態であるので、主コンテンツの画像５０１は、ユーザ１４が、視線を主コンテンツの画像５０１に向くように左側に動かした状態の画像となる。図８に示すように、遠近法に基づき、ユーザ１４の視点を軸にして対象物までの距離が近いところが大きく、対象物までの距離が多いところが小さくなるように視野が成形される。図８に示す主コンテンツの画像５０１では、主コンテンツの画像５０１の図面右側端縁が最大に視え、図面左側端縁が最小に視える台形の形状となる。つまり、仮想でユーザ１４の視点が、図７に示す視点から、図８に示す視点に変わることで、主コンテンツの画像５０１は、長方形の形状から台形の形状に変わる。このようにユーザ１４の視点が変化し、ユーザ１４の視線が動くと、図８に示すように、主コンテンツの画像５０１は、視覚的な効果から、長方形の形状は台形の形状に変化する。このことから、画像処理システム１は、主コンテンツの画像５０１を画像のパターン認識の基準とし、図８に示す表示画像では、主コンテンツの画像５０１の画像の形状に基づいて、パターン認識する形状を台形とする。そこで、付加コンテンツの表示領域枠５０２ａ及び５０２ｂを視線方向に応じて長方形の形状から台形の形状に変換する必要がある。付加コンテンツの表示領域枠５０２ａ及び５０２ｂの形状変換は、射影変換マトリクス（ホモグラフィ）を計算することで達成できる。図９は射影変換の説明図である。射影変換では、図９に示すように、長方形の形状を他の方向から見たときの形状に変換できる。このように、第１の実施の形態では、認識したパターンに対して、視線方向に応じて射影変換マトリクスを計算し、付加コンテンツの形状を設定している。 In contrast, in the example of FIG. 8, the main content image 501 is the virtual space image seen when the user views the main content image 501 while virtually positioned to the front right of it. Because the user 14 is virtually positioned to the front right of the main content image 501, the image is what the user 14 sees after turning the line of sight to the left so as to face the main content image 501. As shown in FIG. 8, in accordance with perspective, the field of view is formed around the viewpoint of the user 14 so that portions closer to the viewer appear larger and portions farther away appear smaller. The main content image 501 shown in FIG. 8 therefore takes a trapezoidal shape in which its right edge in the drawing appears largest and its left edge appears smallest. That is, when the virtual viewpoint of the user 14 changes from the viewpoint shown in FIG. 7 to the viewpoint shown in FIG. 8, the main content image 501 changes from a rectangular shape to a trapezoidal shape. When the viewpoint of the user 14 changes in this way and the line of sight moves, the perspective effect changes the rectangular shape of the main content image 501 into a trapezoidal shape, as shown in FIG. 8. Accordingly, the image processing system 1 uses the main content image 501 as the reference for pattern recognition and, for the display image shown in FIG. 8, sets the shape to be recognized to a trapezoid based on the shape of the main content image 501. It is therefore necessary to transform the additional content display area frames 502a and 502b from a rectangular shape to a trapezoidal shape according to the line-of-sight direction. This shape transformation of the display area frames 502a and 502b can be achieved by calculating a projective transformation matrix (homography). FIG. 9 is an explanatory diagram of projective transformation. As shown in FIG. 9, projective transformation can convert a rectangular shape into the shape it has when viewed from another direction. In this way, in the first embodiment, a projective transformation matrix is calculated for the recognized pattern according to the line-of-sight direction, and the shape of the additional content is set.
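The projective transformation mentioned above can be illustrated with a small sketch. Assuming the four corners of the rectangular frame and their four on-screen positions after the viewpoint change are known, the eight unknowns of a 3x3 homography (with the last entry fixed to 1) follow from an 8x8 linear system. The helper names `solve_linear`, `homography`, and `apply_h` are hypothetical, and the description does not state how the matrix is actually computed; this is one standard way to do it, written with the standard library only.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Return the 3x3 matrix H (with H[2][2] = 1) mapping 4 src points to 4 dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), and likewise for v
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def apply_h(h, pt):
    """Map a point through H, including the perspective divide."""
    x, y = pt
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Illustrative values: a rectangle whose left edge appears shorter after the
# viewpoint moves to the right, as in the trapezoid described for FIG. 8.
rect = [(0, 0), (2, 0), (2, 1), (0, 1)]
trapezoid = [(0, 0.2), (2, 0), (2, 1), (0, 0.8)]
H = homography(rect, trapezoid)
```

Applying `apply_h(H, p)` to every pixel position `p` of a rectangular advertisement image would warp it into the trapezoidal frame; the coordinates above are made-up illustration values.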

なお、第１の実施の形態では付加コンテンツを、表示領域枠５０２ｂの形状に変換する場合、付加コンテンツの内容に応じた変換をおこなってもよい。具体的には、付加コンテンツの輪郭の形状を変換する場合に、付加コンテンツにおいて視認され易い箇所（例えば、長方形）の縦横比率を所定閾値以上に変更しないようにしてもよい。視認され易い箇所の縦横比率が大きく変化すると、視認するユーザ１４に違和感を想起させてしまう可能性があるためである。 In addition, in the first embodiment, when the additional content is converted into the shape of the display area frame 502b, the conversion may be performed according to the contents of the additional content. Specifically, when converting the shape of the outline of the additional content, the aspect ratio of the easily visible portion (for example, rectangle) in the additional content may not be changed to a predetermined threshold value or more. This is because, if the aspect ratio of a portion that is easily visible changes significantly, the user 14 who sees the image may feel uncomfortable.
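The aspect-ratio guard described here can be sketched as follows. The function name `aspect_change_within_limit` and the default factor `max_change=1.5` are assumptions for illustration; the description only states that the ratio should not change by a predetermined threshold or more.

```python
def aspect_change_within_limit(orig_w, orig_h, new_w, new_h, max_change=1.5):
    """Return True if the aspect ratio changes by less than a factor of max_change.

    The change is measured symmetrically, so stretching and squeezing by the
    same factor are treated alike. max_change is an assumed threshold.
    """
    r_orig = orig_w / orig_h
    r_new = new_w / new_h
    change = max(r_new / r_orig, r_orig / r_new)
    return change < max_change
```

A renderer could fall back to letterboxing or skip the transformation for that portion when this check fails.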

次に、事項（３）について説明する。図７の例では、主コンテンツの映像の３６０度全天球の仮想空間画像から、長方形のパターンを認識して付加コンテンツの表示領域枠５０２ａ及び５０２ｂを設定している。しかしながら、ユーザ１４の後ろ側の画像のように、ユーザ１４にとって死角となる位置では、付加コンテンツを表示しても、ユーザ１４は視認できない。また、図８に示したように、認識したパターンに対して、視線方向に応じて射影変換マトリクスを計算し、付加コンテンツの形状を変換すると、付加コンテンツの領域がユーザ１４にとって見にくい表示となる場合がある。このことから、第１の実施の形態では、ユーザ１４にとって見にくい表示となる場合には、図９に示すように、付加コンテンツを表示しないようにする。図９に示す形態の場合、付加コンテンツの表示領域枠５０２aが非表示もしくは別の予め設定した予備コンテンツが表示される。予備コンテンツの画像は、ユーザ１４が、主コンテンツの画像５０１や表示領域枠５０２ｂに表示された付加コンテンツの画像５１１ｂに没頭するような背景的な画像である。付加コンテンツの表示の有無のより具体的な条件は、図１０を参照して、付加コンテンツの表示領域枠５０２を射影マトリクスにより変換し、この変換した付加コンテンツの表示領域枠５０２の面積が所定の閾値以上なら付加コンテンツの画像５１１の表示を行い、付加コンテンツの表示領域枠５０２の面積が所定の閾値未満なら付加コンテンツの画像５１１の表示は行わない。更に、変換した付加コンテンツの表示領域枠５０２の形状の縦横比率、表示領域枠５０２が多角形の場合に、その内角の大きさ等を条件として、ユーザ１４にとって見にくい表示となるか否かを判定しても良い。図８，図９では、付加コンテンツの表示領域枠５０２ａ及び５０２ｂの形状を射影変換マトリクスを計算して変換した結果、付加コンテンツの表示領域枠５０２ａの面積が小さくなり、ユーザ１４にとって見にくい表示となっている。このため、図９に示す形態では、付加コンテンツの表示領域枠５０２ａには付加コンテンツを表示しないようにしている。 Next, item (3) will be described. In the example of FIG. 7, rectangular patterns are recognized in the 360-degree omnidirectional virtual space image of the main content video, and the additional content display area frames 502a and 502b are set. However, at a position in the blind spot of the user 14, such as the image behind the user 14, the user 14 cannot see the additional content even if it is displayed. In addition, as shown in FIG. 8, when a projective transformation matrix is calculated for the recognized pattern according to the line-of-sight direction and the shape of the additional content is transformed, the additional content area may become difficult for the user 14 to see. For this reason, in the first embodiment, when the display would be difficult for the user 14 to see, the additional content is not displayed, as shown in FIG. 9. In the form shown in FIG. 9, the additional content display area frame 502a is either hidden or shows separate, preset auxiliary content. The auxiliary content image is a background-like image that lets the user 14 stay immersed in the main content image 501 and in the additional content image 511b displayed in the display area frame 502b. As a more specific condition for whether to display the additional content, with reference to FIG. 10, the additional content display area frame 502 is transformed by the projective matrix; if the area of the transformed display area frame 502 is equal to or greater than a predetermined threshold, the additional content image 511 is displayed, and if the area is less than the predetermined threshold, the additional content image 511 is not displayed. Whether the display would be difficult for the user 14 to see may additionally be determined from conditions such as the aspect ratio of the transformed display area frame 502 and, when the display area frame 502 is a polygon, the sizes of its interior angles. In FIGS. 8 and 9, as a result of transforming the shapes of the additional content display area frames 502a and 502b with the calculated projective transformation matrix, the area of the display area frame 502a has become small, making it difficult for the user 14 to see. Therefore, in the form shown in FIG. 9, no additional content is displayed in the additional content display area frame 502a.
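The area-based display condition can be sketched with the shoelace formula applied to the corners of the transformed frame. The names `polygon_area` and `should_display` and the threshold value used in the usage lines are assumptions for illustration; the description only says that the area is compared with a predetermined threshold.

```python
def polygon_area(pts):
    """Area of a simple polygon given ordered (x, y) vertices (shoelace formula)."""
    s = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0


def should_display(frame_pts, min_area):
    """Display the additional content only if the transformed frame is large enough."""
    return polygon_area(frame_pts) >= min_area


# Hypothetical transformed frames in screen units.
wide_frame = [(0, 0.2), (2, 0), (2, 1), (0, 0.8)]        # area 1.6
narrow_frame = [(0, 0.45), (0.2, 0.4), (0.2, 0.6), (0, 0.55)]
show_wide = should_display(wide_frame, min_area=0.5)
show_narrow = should_display(narrow_frame, min_area=0.5)
```

The same corner list could also feed the aspect-ratio and interior-angle checks that the text mentions as optional additional conditions.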

次に、事項（４）について説明する。図７に示すように、主コンテンツの画像５０１が映し出されているとき、ユーザ１４の視線が一定であるとは限らない。ここで、ユーザ１４が頻繁に目線を向けている部分の表示は広告効果は高いが、ユーザ１４が殆ど目線を移動させない部分の表示では、広告効果は期待できない。そこで、第１の実施の形態では、ユーザ１４の視線滞留時間を判定し、ユーザの視線滞留時間が一定時間以上の場合のみ、付加コンテンツを表示している。例えば、ユーザ１４の視線滞留時間は付加コンテンツの表示領域枠５０２ａに対しては一定時間以上長くなっているが、付加コンテンツの表示領域枠５０２ｂに対しては一定時間未満であると判定した際、図１１に示すように、付加コンテンツの表示領域枠５０２ｂには、付加コンテンツを表示しないようにしている。 Next, matter (4) will be described. As shown in FIG. 7, when the image 501 of the main content is displayed, the line of sight of the user 14 is not always constant. Here, the display of the portion where the user 14 frequently looks at has a high advertising effect, but the display of the portion where the user 14 hardly moves the line of sight cannot be expected to have an advertising effect. Therefore, in the first embodiment, the line-of-sight retention time of the user 14 is determined, and additional content is displayed only when the user's line-of-sight retention time is longer than a certain period of time. For example, when it is determined that the line-of-sight dwell time of the user 14 is longer than a predetermined time with respect to the additional content display area frame 502a, but is less than the predetermined time with respect to the additional content display area frame 502b, As shown in FIG. 11, the additional content is not displayed in the additional content display area frame 502b.
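A minimal sketch of the dwell-time gate follows. It assumes that dwell time is accumulated per display area frame across gaze samples; whether the time should instead be continuous (reset when the gaze leaves) is not stated in the description, so the cumulative behavior, the class name `DwellTracker`, and the threshold value are all assumptions.

```python
from collections import defaultdict


class DwellTracker:
    """Accumulates gaze dwell time per display area frame (cumulative, by assumption)."""

    def __init__(self, threshold_s):
        self.threshold_s = threshold_s
        self.dwell_s = defaultdict(float)

    def update(self, gazed_frame_id, dt_s):
        """Record that the gaze rested on gazed_frame_id for dt_s seconds.

        Pass None when the gaze is not on any frame.
        """
        if gazed_frame_id is not None:
            self.dwell_s[gazed_frame_id] += dt_s

    def should_display(self, frame_id):
        """True once the accumulated dwell time reaches the threshold."""
        return self.dwell_s.get(frame_id, 0.0) >= self.threshold_s


# Hypothetical gaze samples: the user looks at frame 502a far longer than 502b.
tracker = DwellTracker(threshold_s=2.0)
for _ in range(3):
    tracker.update("502a", 1.0)
tracker.update("502b", 0.5)
tracker.update(None, 1.0)
```

With these samples, frame 502a passes the gate and frame 502b does not, matching the situation described for FIG. 11.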

また、第１の実施の形態では、各視聴ルームに応じて、付加コンテンツの内容が異なるようにしている。図１２は、各視聴ルームＡ，Ｂ，Ｃに付加される付加コンテンツの画像５１１を示している。図１２では、視聴ルームＡ、視聴ルームＢ、視聴ルームＣの３つの視聴ルームの画像が示されている。視聴ルームＡの付加コンテンツの画像５２１ａ及び５２１ｂ、視聴ルームＢの付加コンテンツの画像５２２ａ及び５２２ｂ、視聴ルームＣの付加コンテンツの画像５２３ａ及び５２３ｂとしては、各視聴ルームＡ，Ｂ，Ｃの属性情報（以下、ルーム属性情報とする）に基づいて互いに内容の異なる付加コンテンツの画像（５２１ａ及び５２１ｂ、５２２ａ及び５２２ｂ、５２３ａ及び５２３ｂ）が付加されている。なお、各視聴ルームＡ，Ｂ，Ｃに付加した付加コンテンツの画像（５２１ａ及び５２１ｂ、５２２ａ及び５２２ｂ、５２３ａ及び５２３ｂ）は、コンテンツが重複してもよい。ユーザ１４は、複数の視聴ルームＡ，Ｂ，Ｃの中から、所望の視聴ルームＡ，Ｂ，Ｃの画像を見ることができる。 In the first embodiment, the content of the additional content also differs from one viewing room to another. FIG. 12 shows the additional content images 511 added to each of viewing rooms A, B, and C. FIG. 12 shows images of three viewing rooms: viewing room A, viewing room B, and viewing room C. As the additional content images 521a and 521b of viewing room A, the additional content images 522a and 522b of viewing room B, and the additional content images 523a and 523b of viewing room C, additional content images with mutually different content (521a and 521b, 522a and 522b, 523a and 523b) are added based on the attribute information of each of viewing rooms A, B, and C (hereinafter referred to as room attribute information). Note that the additional content images (521a and 521b, 522a and 522b, 523a and 523b) added to viewing rooms A, B, and C may overlap in content. The user 14 can view the image of a desired viewing room from among the plurality of viewing rooms A, B, and C.

視聴ルームＡ，Ｂ，Ｃのルーム属性情報は、予め設定した複数の視聴に関する属性（属性項目）の情報である。ルーム属性情報は、各視聴ルームＡ，Ｂ，Ｃの趣向に合わせて予め設定した設定情報であってもよい。または、ルーム属性情報は、属性項目に対してユーザ１４が自ら個別入力した情報であってもよい。ルーム属性情報がユーザ１４が自ら個別入力した情報である場合、ユーザ１４自身がユーザ端末装置２を用いて入力し、入力した入力情報を、仮想空間画像配信サーバ１０、付加画像配信サーバ１１、ユーザ端末装置２、三次元画像表示デバイス１３の少なくとも１つに記録し、記録したユーザ１４の入力情報に基づいて機械学習を行った結果、最適とされるユーザ行動情報であってもよい。 The room attribute information of viewing rooms A, B, and C is information on a plurality of preset viewing-related attributes (attribute items). The room attribute information may be setting information preset to suit the theme of each of viewing rooms A, B, and C. Alternatively, the room attribute information may be information that the user 14 has individually entered for the attribute items. When the room attribute information is information individually entered by the user 14, it may be user behavior information judged optimal as a result of machine learning: the user 14 enters the information using the user terminal device 2, the input information is recorded in at least one of the virtual space image distribution server 10, the additional image distribution server 11, the user terminal device 2, and the three-dimensional image display device 13, and machine learning is performed on the recorded input information of the user 14.

例えば、視聴ルームＡがスポンサードルーム、視聴ルームＢがデコレートルーム、視聴ルームＣがファンルームなら、視聴ルームＡの付加コンテンツの画像５２１ａ及び５２１ｂはそのスポンサーの製品の広告画像とし、視聴ルームＢの付加コンテンツの画像５２２ａ及び５２２ｂはゴージャスな製品の広告画像とし、視聴ルームＣの付加コンテンツの画像５２３ａ及び５２３ｂはアーティストの関連製品の広告画像とする。これにより、視聴ルームを見ているであろうユーザ１４の嗜好に合わせた製品の広告画像を表示でき、広告効果の向上が期待できる。 For example, if the viewing room A is a sponsored room, the viewing room B is a decorated room, and the viewing room C is a fan room, the additional content images 521a and 521b in the viewing room A are advertising images of the sponsor's products, and the viewing room B Additional content images 522a and 522b are advertising images of gorgeous products, and additional content images 523a and 523b of viewing room C are advertising images of artists' related products. As a result, it is possible to display an advertisement image of a product that matches the taste of the user 14 who is probably watching the viewing room, and an improvement in the advertisement effect can be expected.

さらに詳説すると、視聴ルームＡの場合、ルーム属性情報は、スポンサードに関するスポンサー属性情報である。視聴ルームＡでは、スポンサー属性情報について、ライブの開催者やスポンサーの意向に基づき、スポンサー属性情報の各情報に対して重み付けを行い、表示する付加コンテンツの画像５２１ａ及び５２１ｂの表示順、表示時間などが設定される。例えば、スポンサー属性情報には、広告対象となる製品やサービスの情報、広告ジャンルの情報、広告内容の情報などの広告に関する広告情報がある。スポンサー属性情報には、広告対象となる製品やサービスの表示順に関する情報、広告対象となる製品やサービスの表示時間に関する情報、スポンサー費用の情報、季節や曜日など表示する時期の情報などの表示出力に関する出力情報がある。スポンサー属性情報には、マーケティングで用いられるターゲット層の情報などの広告マッチングに用いるマッチング情報がある。 More specifically, in the case of viewing room A, the room attribute information is sponsor attribute information relating to sponsorship. In viewing room A, each item of the sponsor attribute information is weighted based on the intentions of the live event organizer and the sponsor, and the display order, display time, and the like of the additional content images 521a and 521b to be displayed are set accordingly. For example, the sponsor attribute information includes advertising information about the advertisement, such as information on the products and services to be advertised, the advertising genre, and the advertising content. The sponsor attribute information also includes output information about display output, such as the display order of the advertised products and services, their display time, sponsorship cost, and the timing of display such as season and day of the week. The sponsor attribute information further includes matching information used for advertisement matching, such as information on the target demographic used in marketing.

視聴ルームＢの場合、ルーム属性情報は、デコレートに関するデコレート属性情報である。視聴ルームＢでは、デコレート属性情報について、ユーザ属性情報に基づきデコレート属性情報の各情報に対して重み付けを行い、表示する付加コンテンツの画像５２２ａ及び５２２ｂの表示順、表示時間などが設定される。もしくは、デコレート属性情報について、ユーザ１４によって事前にデコレート属性情報の各情報に対して任意の取捨選択による重み付けを行い、表示する付加コンテンツの画像５２２ａ及び５２２ｂの表示順、表示時間などが設定される。例えば、デコレート属性情報には、広告対象となる製品やサービスの情報、広告ジャンルの情報、広告内容の情報などの広告に関する広告情報がある。デコレート属性情報には、マーケティングで用いられるターゲット層の情報などの広告マッチングに用いるマッチング情報がある。 In the case of viewing room B, the room attribute information is decorating attribute information relating to decorating. In the viewing room B, each piece of decorating attribute information is weighted based on the user attribute information, and the display order and display time of the additional content images 522a and 522b to be displayed are set. Alternatively, for the decorate attribute information, the user 14 weights each piece of the decorate attribute information in advance by arbitrary selection, and sets the display order and display time of the additional content images 522a and 522b to be displayed. . For example, the decorating attribute information includes advertisement information related to advertisements such as information on products and services to be advertised, information on advertisement genres, and information on advertisement contents. Decorate attribute information includes matching information used for advertisement matching, such as information on target demographics used in marketing.

視聴ルームＣの場合、ルーム属性情報は、ファンに関するファン属性情報である。視聴ルームＣでは、ファン属性情報について、ユーザ属性情報に基づきファン属性情報の各情報に対して重み付けを行い、表示する付加コンテンツの画像５２３ａ及び５２３ｂの表示順、表示時間などが設定される。もしくは、ファン属性情報について、ユーザ１４によって事前にファン属性情報の各情報に対して任意の取捨選択による重み付けを行い、表示する付加コンテンツの画像５２３ａ及び５２３ｂの表示順、表示時間などが設定される。例えば、ファン属性情報には、アーティストの関連製品や関連サービスの情報、アーティストの情報などのアーティストに関するアーティスト情報がある。ファン属性情報には、ライブへの視聴回数、実際のライブへの参加回数、ファン歴、アーティストのファンクラブの有無などのファンに関するファン情報がある。 In the case of the viewing room C, the room attribute information is fan attribute information regarding fans. In the viewing room C, each item of the fan attribute information is weighted based on the user attribute information, and the display order and display time of the additional content images 523a and 523b to be displayed are set. Alternatively, for the fan attribute information, the user 14 weights each piece of the fan attribute information in advance by arbitrarily selecting and selecting, and sets the display order and display time of the additional content images 523a and 523b to be displayed. . For example, the fan attribute information includes artist information related to the artist, such as information on the artist's related products and related services, and artist information. Fan attribute information includes fan information about fans, such as the number of live viewings, number of actual live participations, fan history, and presence or absence of an artist's fan club.
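The weighting of attribute information described for viewing rooms A to C can be pictured with a simple weighted-sum ranking. This is only a sketch under stated assumptions: the description does not say how weights are combined, so the linear score, the function name `rank_ads`, and the attribute keys `sponsor_fee` and `season_match` are all hypothetical.

```python
def rank_ads(ads, weights):
    """Order candidate additional content by a weighted sum of attribute values.

    `ads` is a list of dicts with an "id" and an "attributes" mapping of
    attribute name to a value in [0, 1]; `weights` maps attribute name to weight.
    Attributes without a weight contribute nothing.
    """
    def score(ad):
        return sum(weights.get(k, 0.0) * v for k, v in ad["attributes"].items())

    return sorted(ads, key=score, reverse=True)


# Hypothetical candidates and two weightings (e.g. sponsor-driven vs. seasonal).
candidates = [
    {"id": "ad_a", "attributes": {"sponsor_fee": 1.0, "season_match": 0.0}},
    {"id": "ad_b", "attributes": {"sponsor_fee": 0.5, "season_match": 1.0}},
]
by_fee = rank_ads(candidates, {"sponsor_fee": 0.7, "season_match": 0.3})
by_season = rank_ads(candidates, {"sponsor_fee": 0.3, "season_match": 0.7})
```

The resulting order could serve as the display order, and display time could be allotted in proportion to each score; both mappings are design choices outside what the text specifies.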

次に、第１の実施の形態に係る画像処理システム１における付加コンテンツの付加について具体的に説明する。 Next, addition of additional content in the image processing system 1 according to the first embodiment will be specifically described.

図１３は、第１の実施の形態の画像処理システム１において付加画像の処理を説明するための機能ブロック図である。図１３に示すように、第１の実施の形態に係る画像処理システム１において付加コンテンツの画像５１１の処理を行うための機能は、コンテンツ制御部１５１とコンテンツデータベース部１５２とで実現できる。コンテンツ制御部１５１の機能は、仮想空間画像配信サーバ１０のサイト、付加画像配信サーバ１１のサイト、或いは専用のサーバのサイトのプログラムやスクリプトで実現できる。また、コンテンツ制御部１５１の機能は、ユーザ端末装置１２のプログラムやスクリプトで実現しても良い。更に、仮想空間画像配信サーバ１０のサイト、付加画像配信サーバ１１のサイト、或いは専用のサーバのサイトのプログラムやスクリプトと、ユーザ端末装置１２のプログラムやスクリプトとを協働させて、コンテンツ制御部１５１の機能を実現しても良い。コンテンツデータベース部１５２は、仮想空間画像配信サーバ１０や付加画像配信サーバ１１により構築されるデータベースである。 FIG. 13 is a functional block diagram for explaining additional image processing in the image processing system 1 of the first embodiment. As shown in FIG. 13, the function for processing the additional content image 511 in the image processing system 1 according to the first embodiment can be realized by the content control unit 151 and the content database unit 152 . The function of the content control unit 151 can be implemented by a program or script on the site of the virtual space image distribution server 10, the site of the additional image distribution server 11, or the site of a dedicated server. Also, the function of the content control unit 151 may be implemented by a program or script of the user terminal device 12 . Furthermore, the program or script of the site of the virtual space image distribution server 10, the site of the additional image distribution server 11, or the site of the dedicated server cooperates with the program or script of the user terminal device 12 so that the content control unit 151 function may be realized. The content database unit 152 is a database constructed by the virtual space image distribution server 10 and the additional image distribution server 11 .

コンテンツ制御部１５１は、パターン認識部１６１と、抽出部１６２と、変換部１６３と、表示設定部１６４とからなる。 The content control section 151 includes a pattern recognition section 161 , an extraction section 162 , a conversion section 163 and a display setting section 164 .

パターン認識部１６１は、ディスプレイ３０３に表示中の表示画像（主コンテンツの画像５０１）に含まれる所定のパターンを認識する。ここで、所定のパターンは、予め設定した幾何学に関するパターンであり、例えば、四角や円形などの外周端の形状である。 The pattern recognition unit 161 recognizes a predetermined pattern included in the display image (main content image 501 ) being displayed on the display 303 . Here, the predetermined pattern is a preset geometrical pattern, and is, for example, a shape of an outer peripheral edge such as a square or a circle.

また、パターン認識部１６１は、ユーザ１４の視線方向に基づき主コンテンツの画像５０１が変形した場合、主コンテンツの画像５０１の変形量に基づき所定のパターンを変形する。例えば、所定のパターンが長方形である場合、主コンテンツの画像５０１の変形量に基づき長方形が台形などに変形する。この変形に基づき、変換部１６３が付加コンテンツの画像５１１の形状を変換する。 When the main content image 501 is deformed according to the line-of-sight direction of the user 14, the pattern recognition unit 161 deforms the predetermined pattern based on the amount of deformation of the main content image 501. For example, if the predetermined pattern is a rectangle, the rectangle is deformed into a trapezoid or the like based on the amount of deformation of the main content image 501. Based on this deformation, the conversion unit 163 transforms the shape of the additional content image 511.

The pattern recognition unit 161 recognizes, across a preset number of consecutive still images, a predetermined pattern that is displayed at the same position in the main content image 501, or a predetermined pattern whose display position in the main content image 501 is displaced. The preset number referred to here is, for example, 10 when the frame rate is 60 fps. Although the number is set to 10 in this embodiment, this is a preferred example and the number is not limited to it. Any predetermined number relative to the frame rate may be used as long as recognition is possible within 1 second at most; 0.1 sec or less is preferable for outputting advertisements on live content.
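The frame-count condition can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the names `recognition_window_seconds` and `is_persistent`, the bounding-box representation, and the `max_shift` tolerance are hypothetical and do not appear in the specification, which does not say how persistence across frames is tested.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical representation: (x, y, width, height) of a detected pattern.
Box = Tuple[float, float, float, float]


def recognition_window_seconds(num_frames: int, fps: float) -> float:
    """Time spanned by `num_frames` consecutive frames at `fps` frames per second."""
    return num_frames / fps


def is_persistent(detections: Sequence[Optional[Box]], num_frames: int,
                  max_shift: float) -> bool:
    """True if the last `num_frames` frames all contain the pattern and its
    position moves by at most `max_shift` pixels between neighbouring frames.

    `None` marks a frame in which the pattern was not detected.
    """
    if len(detections) < num_frames:
        return False
    recent = detections[-num_frames:]
    if any(box is None for box in recent):
        return False
    for prev, cur in zip(recent, recent[1:]):
        if abs(cur[0] - prev[0]) > max_shift or abs(cur[1] - prev[1]) > max_shift:
            return False
    return True


# Ten identical detections satisfy the check; a missed frame does not.
steady = [(100.0, 50.0, 40.0, 20.0)] * 10
print(is_persistent(steady, num_frames=10, max_shift=2.0))
print(recognition_window_seconds(10, 60.0))
```

At 60 fps, 10 frames span about 0.17 seconds, which is within the 1-second upper bound; a window of 6 frames would correspond to 0.1 seconds.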

The pattern recognition unit 161 recognizes geometric shapes included in the main content image 501 by, for example, so-called image recognition. In this case, the pattern recognition unit 161 recognizes the predetermined pattern by comparing shapes stored in advance with shapes included in the virtual space image. Specifically, the pattern recognition unit 161 refers to a storage unit (not shown) in which geometric shapes are stored in advance and, when it determines that a shape stored in the storage unit matches a shape included in the image, recognizes that something corresponding to that shape is included in the image. Even when the shape stored in the storage unit and the shape included in the image do not match completely, the pattern recognition unit 161 may determine that they match when at least a predetermined proportion of the features of the stored shape match features of the shape included in the image.
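The partial-match rule can be sketched as a ratio test. The function name `shapes_match`, the representation of features as a set of labels, and the default ratio of 0.8 are assumptions made for illustration; the specification leaves both the feature representation and the proportion open.

```python
from typing import Set


def shapes_match(stored_features: Set[str], image_features: Set[str],
                 min_ratio: float = 0.8) -> bool:
    """Treat the shapes as matching when at least `min_ratio` of the stored
    shape's features are also found in the shape detected in the image."""
    if not stored_features:
        return False
    matched = len(stored_features & image_features)
    return matched / len(stored_features) >= min_ratio


# Hypothetical feature labels for a stored rectangle.
rectangle = {"four_corners", "closed_contour", "convex",
             "parallel_sides", "right_angles"}
# A rectangle seen at an angle may lose the right-angle feature.
seen = {"four_corners", "closed_contour", "convex", "parallel_sides"}
print(shapes_match(rectangle, seen))
```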

抽出部１６２は、図６に示すように、パターン認識部１６１により認識された所定のパターンに基づいて、仮想空間画像に付加する付加コンテンツの画像５１１を表示させる表示領域枠５０２を抽出する。 As shown in FIG. 6, the extraction unit 162 extracts a display area frame 502 for displaying an additional content image 511 to be added to the virtual space image, based on the predetermined pattern recognized by the pattern recognition unit 161 .

変換部１６３は、射影変換に基づいて、ユーザ１４の視野角に応じて付加コンテンツの画像５１１の形状を変換する。ここでいう射影変換に関して、射影評価（射影変換の定量的評価）のことをいう。 The conversion unit 163 converts the shape of the additional content image 511 according to the viewing angle of the user 14 based on projective transformation. The projective transformation here means projective evaluation (quantitative evaluation of projective transformation).
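As a rough illustration of a projective transformation, the following Python sketch computes the 3x3 matrix that maps the four corners of a rectangle onto the four corners of a deformed frame. The function names (`solve_linear`, `homography`, `apply_homography`) and the sample coordinates are hypothetical; the specification does not state how the matrix is computed, and the four-point method shown here is a standard technique assumed for the example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def homography(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """3x3 projective transformation matrix mapping four src points onto dst."""
    a: List[List[float]] = []
    b: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h: List[List[float]], p: Point) -> Point:
    """Map a point through the matrix, dividing by the homogeneous coordinate."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# A rectangle seen from the side appears as a trapezoid.
rect = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
trapezoid = [(0.0, 0.0), (4.0, 0.5), (4.0, 1.5), (0.0, 2.0)]
matrix = homography(rect, trapezoid)
print(apply_homography(matrix, (2.0, 1.0)))
```

Every pixel of the additional content image would be mapped through the same matrix; only the corner mapping is shown here.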

As shown in FIG. 6, the display setting unit 164 makes a setting to display the additional content image 511 converted by the conversion unit 163 in the display area frame 502 extracted by the extraction unit 162. The display setting unit 164 determines whether or not the shape of the additional content image 511 converted by the conversion unit 163 satisfies a predetermined condition, and makes a setting to display the converted additional content image 511 only when the predetermined condition is satisfied. In addition, the display setting unit 164 makes a setting not to display the additional content image 511 when the display would be difficult for the user 14 to see. That is, the display setting unit 164 makes a setting to display the converted additional content image 511 only when the area of the display area frame 502 of the additional content converted by the conversion unit 163 is equal to or larger than a predetermined threshold. In other words, the display setting unit 164 controls the display of the additional content image 511 based on the projective evaluation. In this configuration, what is controlled is whether or not the additional content image 511 converted by the conversion unit 163 is displayed, but the configuration is not limited to this; the display setting unit 164 may instead change how the additional content image 511 is displayed, for example by shading or blurring, based on the projective evaluation.
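The area condition can be sketched as follows. The shoelace formula used to measure the transformed frame, the function names `polygon_area` and `should_display`, and the threshold values are assumptions for illustration; the specification only states that the area is compared with a predetermined threshold.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def polygon_area(points: Sequence[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, list(points[1:]) + [points[0]]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def should_display(frame_corners: Sequence[Point], min_area: float) -> bool:
    """Display the additional content only if the transformed frame is large enough."""
    return polygon_area(frame_corners) >= min_area


# Corners of a display area frame after projective transformation (a trapezoid).
frame = [(0.0, 0.0), (4.0, 0.5), (4.0, 1.5), (0.0, 2.0)]
print(polygon_area(frame))         # 6.0
print(should_display(frame, 5.0))  # True
```

A frame viewed at a steep angle collapses toward a line, so its area falls below the threshold and the additional content is suppressed.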

Further, the display setting unit 164 acquires the time for which the line of sight of the user 14 dwells and, based on the acquired time, when a display area frame lies in the line-of-sight direction in which the line of sight has dwelt for a predetermined threshold or longer, makes a setting to display the additional content image 511 in that display area frame 502. In other words, the display setting unit 164 controls the display of the additional content image 511 based on the dwell time of the line of sight of the user 14. In this configuration, what is controlled is whether or not the additional content image 511 converted by the conversion unit 163 is displayed, but the configuration is not limited to this; the display setting unit 164 may instead change how the additional content image 511 is displayed, for example by shading or blurring, based on the dwell time of the line of sight of the user 14.
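One possible way to measure the dwell time is sketched below. The sample format (timestamp plus gaze coordinates), the rectangular frame, the requirement that the dwell be continuous, and the names `contains` and `dwell_reached` are all assumptions for illustration rather than details taken from the specification.

```python
from typing import Iterable, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (left, top, right, bottom)


def contains(rect: Rect, x: float, y: float) -> bool:
    """True if the gaze point (x, y) lies inside the rectangle."""
    left, top, right, bottom = rect
    return left <= x <= right and top <= y <= bottom


def dwell_reached(samples: Iterable[Tuple[float, float, float]], frame: Rect,
                  min_dwell: float) -> bool:
    """`samples` are (timestamp_seconds, gaze_x, gaze_y) in time order.

    Returns True once the gaze has stayed inside `frame` continuously
    for at least `min_dwell` seconds.
    """
    entered_at: Optional[float] = None
    for t, x, y in samples:
        if contains(frame, x, y):
            if entered_at is None:
                entered_at = t
            if t - entered_at >= min_dwell:
                return True
        else:
            entered_at = None  # gaze left the frame; restart the timer
    return False


frame = (0.0, 0.0, 10.0, 10.0)
gaze = [(0.0, 5.0, 5.0), (0.5, 5.0, 5.0), (1.0, 5.0, 5.0)]
print(dwell_reached(gaze, frame, min_dwell=1.0))  # True
```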

また、表示設定部１６４は、同一の対象で視点を変えた複数の視聴ルームの仮想空間画像に対して、それぞれ独立して付加コンテンツの画像５１１を選択可能とする。表示設定部１６４により、１つの主コンテンツを予め設定した視聴条件に基づき、主コンテンツに対して複数の異なる画像を提供することができる。 In addition, the display setting unit 164 makes it possible to independently select the image 511 of the additional content for each of the virtual space images of a plurality of viewing rooms with different viewpoints for the same object. The display setting unit 164 can provide a plurality of different images for the main content based on viewing conditions set in advance for one main content.

図１４は、第１の実施の形態に係る画像処理システム１における付加コンテンツの付加処理を説明するためのフローチャートである。 FIG. 14 is a flowchart for explaining additional content addition processing in the image processing system 1 according to the first embodiment.

（ステップＳ１０１）ユーザ１４が仮想空間画像の視聴を開始すると、ユーザ端末装置１２は、ユーザ１４の頭部の動きを検出するセンサやユーザの視線を検出するカメラの撮影画像から、三次元画像表示デバイス１３を装着しているユーザ１４の姿勢情報を取得する。ここでいう姿勢情報とは、姿勢センサ３０９による傾き情報や、視線検出カメラ３０８による視点情報および視線情報などを含む情報群のことをいう。 (Step S101) When the user 14 starts viewing the virtual space image, the user terminal device 12 displays a three-dimensional image from images taken by a sensor that detects the movement of the head of the user 14 and a camera that detects the line of sight of the user. Posture information of the user 14 wearing the device 13 is acquired. The posture information here refers to a group of information including tilt information from the posture sensor 309, viewpoint information and line-of-sight information from the line-of-sight detection camera 308, and the like.

（ステップＳ１０２）ユーザ端末装置１２は、表示する主コンテンツの画像を仮想空間画像配信サーバ１０から取得する。そして、ユーザ端末装置１２は、コンテンツ制御部１５１に、表示する主コンテンツの識別と姿勢情報を送信する。 (Step S102 ) The user terminal device 12 acquires the image of the main content to be displayed from the virtual space image distribution server 10 . The user terminal device 12 then transmits the identification of the main content to be displayed and the posture information to the content control unit 151 .

（ステップＳ１０３）コンテンツ制御部１５１は、ユーザ１４の姿勢情報から視野角を判定し、表示する主コンテンツの識別と、この視野角の情報をコンテンツデータベース部１５２に送る。 (Step S103 ) The content control unit 151 determines the viewing angle from the posture information of the user 14 , and sends the identification of the main content to be displayed and the viewing angle information to the content database unit 152 .

（ステップＳ１０４）コンテンツデータベース部１５２は、ユーザ端末装置１２に表示している主コンテンツの画像５０１のうち、ユーザ１４の視野部分の画像を仮想空間画像配信サーバ１０から取得し、コンテンツ制御部１５１に送る。ここでいう視野部分とは、ステップＳ１０３で判定した視野角の情報に基づく視野領域のことをいう（以下、同様）。 (Step S104) The content database unit 152 acquires from the virtual space image distribution server 10 the image of the visual field portion of the user 14 in the image 501 of the main content displayed on the user terminal device 12, and sends it to the content control unit 151. send. The field of view portion referred to here refers to the field of view area based on the information on the viewing angle determined in step S103 (the same applies hereinafter).

（ステップＳ１０５）コンテンツ制御部１５１は、ユーザ１４の視野部分の主コンテンツの画像５０１からパターン認識を行い、付加コンテンツの表示領域枠５０２を設定できるか否かを判定する。付加コンテンツの表示領域枠５０２が設定できれば（ステップＳ１０５：Ｙｅｓ）、処理をステップＳ１０６に進め、表示領域枠５０２が設定できなければ（ステップＳ１０５：Ｎｏ）、処理をステップＳ１０９に進める。 (Step S105) The content control unit 151 performs pattern recognition from the main content image 501 in the field of view of the user 14, and determines whether or not the additional content display area frame 502 can be set. If the additional content display area frame 502 can be set (step S105: Yes), the process proceeds to step S106, and if the display area frame 502 cannot be set (step S105: No), the process proceeds to step S109.

（ステップＳ１０６）コンテンツ制御部１５１は、付加コンテンツの表示領域枠５０２に対して射影変換マトリクスを計算し、表示領域枠５０２が見やすい角度にあるか否かを判定する。更に、コンテンツ制御部１５１は、視線滞留時間が十分であるか否かを判定する。表示領域枠５０２が見やすい角度にあり、且つ、視線滞留時間が十分である場合には（ステップＳ１０６：Ｙｅｓ）、処理をステップＳ１０７に進める。表示領域枠５０２が見やすい角度にない又は視線滞留時間が十分でない場合には（ステップＳ１０６：Ｎｏ）、処理をステップＳ１０９に進める。 (Step S106) The content control unit 151 calculates a projective transformation matrix for the display area frame 502 of the additional content, and determines whether or not the display area frame 502 is at an easy-to-see angle. Furthermore, the content control unit 151 determines whether or not the line-of-sight retention time is sufficient. If the display area frame 502 is at an easy-to-see angle and the line-of-sight retention time is sufficient (step S106: Yes), the process proceeds to step S107. If the display area frame 502 is not at an angle that is easy to see or if the line-of-sight retention time is not sufficient (step S106: No), the process proceeds to step S109.

（ステップＳ１０７）コンテンツデータベース部１５２は、主コンテンツに付加すべき付加コンテンツの画像５１１を取得して、この付加コンテンツの画像５１１をコンテンツ制御部１５１に送る。 (Step S107 ) The content database unit 152 acquires the additional content image 511 to be added to the main content and sends the additional content image 511 to the content control unit 151 .

（ステップＳ１０８）コンテンツ制御部１５１は、付加コンテンツの画像５１１に対して射影変換マトリクスを計算して、付加コンテンツの画像５１１の視野角の補正を行う。 (Step S108 ) The content control unit 151 calculates a projective transformation matrix for the additional content image 511 to correct the viewing angle of the additional content image 511 .

（ステップＳ１０９）コンテンツ制御部１５１は、表示画像の調整を行い、視野角補正を行った付加コンテンツの画像５１１をユーザ端末装置１２に送る。 (Step S109 ) The content control unit 151 adjusts the display image and sends the additional content image 511 with the viewing angle corrected to the user terminal device 12 .

(Step S110) The user terminal device 12 inserts the additional content image 511 onto the main content image 501 and displays the result on the three-dimensional image display device 13. In this embodiment, the process of inserting the additional content image 511 onto the main content image 501 is performed in the user terminal device 12, but this is only one preferred example. For example, the insertion process may be performed in at least one of the virtual space image distribution server 10 and the additional image distribution server 11 (that is, on the network 15). Alternatively, in a configuration in which the three-dimensional image display device 13 incorporates the components of the user terminal device 12, the insertion process may be performed in the three-dimensional image display device 13.

以上の処理により、前述の事項（１）から事項（４）に基づいて、付加コンテンツの表示領域枠５０２を設定し、主コンテンツの画像５０１上に付加コンテンツの画像５１１を挿入することができる。 By the above processing, it is possible to set the display area frame 502 of the additional content and insert the image 511 of the additional content on the image 501 of the main content based on the items (1) to (4) described above.
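The branching in steps S105 and S106 of the flowchart of FIG. 14 can be summarized in a small Python sketch. The class `FrameCheck` and the function `steps_taken` are hypothetical names; the sketch only traces which steps are visited and does not implement the processing performed in each step.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameCheck:
    frame_found: bool        # S105: could a display area frame be set?
    easy_to_see: bool        # S106: is the frame at an easy-to-see angle?
    dwell_sufficient: bool   # S106: has the gaze stayed long enough?


def steps_taken(check: FrameCheck) -> List[str]:
    """Trace the steps visited for one pass through the flowchart."""
    steps = ["S101", "S102", "S103", "S104", "S105"]
    if check.frame_found:
        steps.append("S106")
        if check.easy_to_see and check.dwell_sufficient:
            steps += ["S107", "S108"]
    # Both "No" branches (S105 and S106) jump directly to S109.
    steps += ["S109", "S110"]
    return steps


print(steps_taken(FrameCheck(True, True, True)))
print(steps_taken(FrameCheck(False, False, False)))
```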

According to the first embodiment, it is possible to extract a display area frame 502 for an additional image that can blend into the display image. Further, according to the first embodiment, a predetermined pattern can be recognized even in video (a moving image) that is not stored in advance, such as a live broadcast. As a result, when the main content image 501 is live video, empty space in the live video can be used effectively. Further, according to the first embodiment, even when the line-of-sight direction of the user 14 changes and the display image is deformed, the predetermined pattern is also deformed based on the amount of deformation of the display image, so that a predetermined pattern that blends into the display image can be recognized.

<Second Embodiment>
Next, a second embodiment will be described. In the first embodiment described above, as shown in the flowchart of FIG. 14, the image of the visual field portion of the user 14 is acquired from the main content image 501, and the display area frame 502 of the additional content is set by recognizing, from this visual-field image, a predetermined pattern included in the display image being displayed. In contrast, in the second embodiment, the main content image 501 is processed in advance to set the display area frame 502 of the additional content.

図１５は、第２の実施の形態に係る画像処理システム１における付加コンテンツの付加処理を説明するためのフローチャートである。 FIG. 15 is a flowchart for explaining additional content addition processing in the image processing system 1 according to the second embodiment.

（ステップＳ２０１）コンテンツ制御部１５１は、主コンテンツのプリアノテーションを開始すると、主コンテンツの全天球画像を取得し、主コンテンツの画像５０１をコンテンツデータベース部１５２に送る。 (Step S201 ) When starting the pre-annotation of the main content, the content control unit 151 acquires the omnidirectional image of the main content and sends the main content image 501 to the content database unit 152 .

（ステップＳ２０２）コンテンツデータベース部１５２は、エクイレクタングラー（equirectangular）形式の主コンテンツの画像５０１のデータを抽出し、主コンテンツの全天球画像を仮想空間画像配信サーバ１０に送信する。 (Step S202 ) The content database unit 152 extracts the data of the image 501 of the main content in the equirectangular format, and transmits the omnidirectional image of the main content to the virtual space image distribution server 10 .

(Step S203) The content control unit 151 performs cube projection from the equirectangular-format image, which is an omnidirectional image.
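The relationship between a cube face and the equirectangular image can be sketched as follows. The axis convention (+x right, +y up, +z forward), the face names, and the function names `cube_to_direction` and `direction_to_equirect` are assumptions chosen for the example; the specification does not define them. A cube face image would be filled by looking up, for each face pixel, the equirectangular position returned by this mapping.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def cube_to_direction(face: str, u: float, v: float) -> Vec3:
    """Viewing direction for a point on a cube face; u and v are in [-1, 1].

    Assumed convention: +x is right, +y is up, +z is forward.
    """
    table = {
        "front": (u, v, 1.0),
        "back": (-u, v, -1.0),
        "right": (1.0, v, -u),
        "left": (-1.0, v, u),
        "top": (u, 1.0, -v),
        "bottom": (u, -1.0, v),
    }
    return table[face]


def direction_to_equirect(d: Vec3, width: int, height: int) -> Tuple[float, float]:
    """Column and row in an equirectangular image for direction `d`."""
    x, y, z = d
    lon = math.atan2(x, z)                 # -pi..pi, 0 at the front
    lat = math.atan2(y, math.hypot(x, z))  # -pi/2..pi/2, positive is up
    col = (lon / (2.0 * math.pi) + 0.5) * width
    row = (0.5 - lat / math.pi) * height
    return col, row


# The centre of the front face maps to the centre of the equirectangular image.
print(direction_to_equirect(cube_to_direction("front", 0.0, 0.0), 4096, 2048))
```

Straight edges such as the sides of a rectangular signboard stay straight on a cube face, whereas they are curved in the equirectangular image, which is one plausible reason the cube projection helps shape recognition.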

（ステップＳ２０４）コンテンツ制御部１５１は、キューブ投影を行った主コンテンツの画像５０１からパターン認識を行い、付加コンテンツの表示領域枠５０２を設定できるか否かを判定する。付加コンテンツの表示領域枠５０２が設定できれば（ステップＳ２０４：Ｙｅｓ）、処理をステップＳ２０５に進め、表示領域枠５０２が設定できなければ（ステップＳ２０４：Ｎｏ）、プリアノテーションを終了する。 (Step S204) The content control unit 151 performs pattern recognition from the cube-projected main content image 501, and determines whether or not the additional content display area frame 502 can be set. If the additional content display area frame 502 can be set (step S204: Yes), the process proceeds to step S205, and if the display area frame 502 cannot be set (step S204: No), the pre-annotation ends.

（ステップＳ２０５）コンテンツデータベース部１５２は、主コンテンツに付加すべき付加コンテンツの画像５１１を取得する。 (Step S205) The content database unit 152 acquires the additional content image 511 to be added to the main content.

（ステップＳ２０６）コンテンツデータベース部１５２は、該当する画像座標領域にメタデータとして付加コンテンツを格納する。 (Step S206) The content database unit 152 stores additional content as metadata in the corresponding image coordinate area.

(Step S207) When the user 14 starts viewing the virtual space image, the user terminal device 12 acquires posture information of the user 14 wearing the three-dimensional image display device 13 from the posture sensor 309, which detects the movement of the head of the user 14, and from images captured by the line-of-sight detection camera 308, which detects the line of sight of the user 14.

（ステップＳ２０８）ユーザ端末装置１２は、仮想空間画像配信サーバ１０から、表示する主コンテンツの画像５０１を取得する。そして、ユーザ端末装置１２は、コンテンツ制御部１５１に姿勢情報を送信する。 (Step S208 ) The user terminal device 12 acquires the main content image 501 to be displayed from the virtual space image distribution server 10 . The user terminal device 12 then transmits the posture information to the content control section 151 .

(Step S209) The content control unit 151 determines the viewing angle from the posture information of the user 14.

(Step S210) The content control unit 151 determines from the metadata whether or not the additional content area (the additional content display area frame 502) is within the field of view (visual field portion) of the user 14. If the additional content area is within the field of view of the user 14 (step S210: Yes), the process proceeds to step S211. If the additional content area is not within the field of view of the user 14 (step S210: No), the process proceeds to step S213.
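A field-of-view test of this kind might look like the following sketch. It assumes that the metadata stores the centre of each additional content area as yaw and pitch angles in degrees and that only the centre is tested; these assumptions, and the names `angle_diff` and `region_in_view`, are hypothetical and are not taken from the specification.

```python
def angle_diff(a: float, b: float) -> float:
    """Smallest signed difference a - b in degrees, in [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def region_in_view(region_yaw: float, region_pitch: float,
                   view_yaw: float, view_pitch: float,
                   h_fov: float, v_fov: float) -> bool:
    """True if the region centre lies inside the user's field of view.

    All angles are in degrees; `h_fov` and `v_fov` are the full horizontal
    and vertical viewing angles.
    """
    return (abs(angle_diff(region_yaw, view_yaw)) <= h_fov / 2.0
            and abs(region_pitch - view_pitch) <= v_fov / 2.0)


# A region at yaw 350 degrees is visible when looking toward yaw 10 degrees,
# because the yaw comparison wraps around 360 degrees.
print(region_in_view(350.0, 0.0, 10.0, 0.0, h_fov=90.0, v_fov=60.0))  # True
```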

（ステップＳ２１１）コンテンツ制御部１５１は、付加コンテンツの表示領域枠５０２に対して射影変換マトリクスを計算し、表示領域枠５０２が見やすい角度にあるか否かを判定する。更に、コンテンツ制御部１５１は、視線滞留時間が十分であるか否かを判定する。表示領域枠５０２が見やすい角度にあり、且つ、視線滞留時間が十分である場合には（ステップＳ２１１：Ｙｅｓ）、処理をステップＳ２１２に進める。表示領域枠５０２が見やすい角度にない又は視線滞留時間が十分でない場合には（ステップＳ２１１：Ｎｏ）、処理をステップＳ２１３に進める。 (Step S211) The content control unit 151 calculates a projective transformation matrix for the display area frame 502 of the additional content, and determines whether the display area frame 502 is at an easy-to-see angle. Furthermore, the content control unit 151 determines whether or not the line-of-sight retention time is sufficient. If the display area frame 502 is at an easy-to-see angle and the line-of-sight retention time is sufficient (step S211: Yes), the process proceeds to step S212. If the display area frame 502 is not at an angle that is easy to see or if the line-of-sight retention time is not sufficient (step S211: No), the process proceeds to step S213.

（ステップＳ２１３）コンテンツ制御部１５１は、表示画像の調整を行い、視野角補正を行った付加コンテンツの画像５１１をユーザ端末装置１２に送る。 (Step S213 ) The content control unit 151 adjusts the display image and sends the additional content image 511 with the viewing angle corrected to the user terminal device 12 .

（ステップＳ２１４）ユーザ端末装置１２は、主コンテンツの画像５０１上に、付加コンテンツの画像５１１を挿入して、三次元画像表示デバイス１３に表示する。 (Step S214 ) The user terminal device 12 inserts the additional content image 511 into the main content image 501 and displays it on the three-dimensional image display device 13 .

第２の実施の形態では、事前に、エクイレクタングラー形式の画像からキューブ投影を行って、パターン認識しているため、パターン認識率が向上し、高精度にパターン認識を行うことができる。 In the second embodiment, since pattern recognition is performed by performing cube projection from an equirectangular image in advance, the pattern recognition rate is improved and pattern recognition can be performed with high accuracy.

In the first and second embodiments described above, the case where a portion of the main content image 501 other than the portion in which the main content (for example, live video of a concert) is drawn is extracted as the display area frame 502 of the additional content has been described as an example, but the embodiments are not limited to this. In the image processing system 1 of the first and second embodiments, a specific pattern may instead be recognized within the main content area of the main content image 501. Also, in the first and second embodiments described above, a virtual space image is used as the display image, but the display image is not limited to this; besides a virtual image, it may be an image that the image display device displays through an imaging unit (not shown) at the current position, or an image received via another communication interface. Moreover, the display image is not limited to a three-dimensional image and may be a two-dimensional image.

All or part of the image processing system 1 in the first and second embodiments described above may be realized by a computer. In that case, a program for realizing this function may be recorded on a computer-readable recording medium, and the program recorded on the recording medium may be read into and executed by a computer system. The "computer system" referred to here includes an OS and hardware such as peripheral devices. The "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into computer systems. Furthermore, the "computer-readable recording medium" may also include a medium that dynamically holds the program for a short period of time, such as a communication line used when the program is transmitted via a network such as the Internet or a communication line such as a telephone line, and a medium that holds the program for a certain period of time, such as volatile memory inside a computer system serving as the server or client in that case. The program may be one for realizing part of the functions described above, may be one that realizes the functions described above in combination with a program already recorded in the computer system, or may be realized using a programmable logic device such as an FPGA.

上記の実施形態の一部又は全部は、以下の付記のようにも記載されうるが、以下には限られない。 Some or all of the above-described embodiments can also be described in the following supplementary remarks, but are not limited to the following.

(Supplementary note 1) An image processing system comprising: a pattern recognition unit that recognizes a predetermined pattern included in a virtual space image; an extraction unit that extracts, from the virtual space image, a display area in which an additional image to be added to the virtual space image is displayed, based on the predetermined pattern recognized by the pattern recognition unit; a conversion unit that converts the additional image in accordance with the display area; and a display setting unit that displays the additional image converted by the conversion unit in the display area extracted by the extraction unit.

（付記２）前記パターン認識部は、予め記憶させた形状と、仮想空間画像に含まれる形状とを比較することにより前記所定のパターンを認識する付記１に記載の画像処理システム。 (Supplementary note 2) The image processing system according to Supplementary note 1, wherein the pattern recognition unit recognizes the predetermined pattern by comparing a shape stored in advance with a shape included in the virtual space image.

（付記３）前記パターン認識部は、前記仮想空間画像に含まれる人物の顔、又は前記仮想空間画像に含まれる物体の形状のうち少なくともいずれかを認識する付記１又は付記２に記載の画像処理システム。 (Supplementary note 3) The image processing according to Supplementary note 1 or Supplementary note 2, wherein the pattern recognition unit recognizes at least one of a person's face included in the virtual space image and a shape of an object included in the virtual space image. system.

（付記４）前記変換部は、射影変換マトリクスに基づいて、前記付加画像を変換する付記１に記載の画像処理システム。 (Appendix 4) The image processing system according to Appendix 1, wherein the conversion unit converts the additional image based on a projective transformation matrix.

(Supplementary note 5) The image processing system according to Supplementary note 1, wherein the pattern recognition unit recognizes a predetermined pattern included in the portion of the virtual space image that lies within a predetermined viewing angle range in the user's line-of-sight direction from the user's viewpoint position.

（付記６）前記パターン認識部は、全天球画像の仮想空間画像をキューブ画像に変換した画像に含まれる所定のパターンを認識する付記１に記載の画像処理システム。 (Supplementary note 6) The image processing system according to Supplementary note 1, wherein the pattern recognition unit recognizes a predetermined pattern included in an image obtained by converting a virtual space image of an omnidirectional image into a cube image.

(Supplementary note 7) The image processing system according to any one of Supplementary notes 1 to 4, wherein the display setting unit determines whether or not the shape of the additional image converted by the conversion unit satisfies a predetermined condition and, when the predetermined condition is satisfied, displays the additional image converted by the conversion unit in the display area.

（付記８）前記所定の条件は、変換された前記付加画像の面積が所定の閾値以上である付記７に記載の画像処理システム。 (Supplementary Note 8) The image processing system according to Supplementary Note 7, wherein the predetermined condition is that the area of the converted additional image is equal to or greater than a predetermined threshold.

(Supplementary note 9) The image processing system according to Supplementary note 1, wherein the display setting unit acquires the dwell time of the user's line of sight on the display area extracted by the extraction unit, determines whether or not the dwell time of the user's line of sight is equal to or greater than a predetermined threshold, and, when the dwell time is equal to or greater than the predetermined threshold, displays the additional image converted by the conversion unit in the display area.

(Supplementary note 10) The image processing system according to any one of Supplementary notes 1 to 9, wherein the display setting unit allows an additional image to be selected independently for each of a plurality of virtual space images of the same object with different viewpoints and decorations.

The embodiments of the present invention have been described in detail above with reference to the drawings, but the specific configuration is not limited to these embodiments, and designs and the like that do not depart from the gist of the present invention are also included.

1: image processing system, 10: virtual space image distribution server, 11: additional image distribution server, 12: user terminal device, 13: three-dimensional image display device, 161: pattern recognition unit, 162: extraction unit, 163: conversion unit, 164: display setting unit.

Claims

1. An image processing system comprising:
a pattern recognition unit that recognizes a predetermined pattern included in a virtual space image and, when the virtual space image is deformed in response to a change in the line-of-sight direction of a user, deforms the predetermined pattern based on the amount of deformation of the virtual space image;
an extraction unit that extracts, from the virtual space image and based on the predetermined pattern deformed by the pattern recognition unit, a display area that has a shape corresponding to the deformed predetermined pattern and in which an additional image to be added to the virtual space image is displayed;
a conversion unit that converts the shape of the additional image, in accordance with the display area, so as to correspond to the shape of the deformed predetermined pattern; and
a display setting unit that makes a setting to display the additional image converted by the conversion unit in the display area extracted by the extraction unit.

2. An image processing system comprising:
a pattern recognition unit that recognizes a predetermined pattern included in a display image and, when the display image is deformed in response to a change in the line-of-sight direction of a user, deforms the predetermined pattern based on the amount of deformation of the display image;
an extraction unit that extracts, from the display image and based on the predetermined pattern deformed by the pattern recognition unit, a display area that has a shape corresponding to the deformed predetermined pattern and in which an additional image to be added to the display image is displayed; and
a display setting unit that makes a setting to display, in the display area extracted by the extraction unit, the additional image deformed so as to correspond to the shape of the deformed predetermined pattern.

3. The image processing system according to claim 2, wherein the pattern recognition unit recognizes the predetermined pattern included in the display image being displayed.

4. The image processing system according to claim 2, wherein
the display image is a moving image formed by continuously displaying a plurality of still images, and
the pattern recognition unit recognizes, across a preset number of consecutive still images among the plurality of still images, the predetermined pattern displayed at the same position in the display image or the predetermined pattern whose display position in the display image is displaced.

5. The image processing system according to any one of claims 2 to 4, further comprising a conversion unit that transforms the shape of the additional image based on the predetermined pattern recognized by the pattern recognition unit.

6. The image processing system according to claim 5, wherein the display setting unit determines whether or not the shape of the additional image satisfies a predetermined condition and, when the predetermined condition is satisfied, makes a setting to display the additional image in the display area.

7. The image processing system according to claim 6, wherein
the predetermined condition is that the area of the additional image is equal to or greater than a predetermined threshold, and
the display setting unit determines that the predetermined condition is satisfied when the area of the additional image is equal to or greater than the predetermined threshold, and displays the additional image in the display area.
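
The area condition of claim 7 can be illustrated with a short sketch. It assumes the deformed additional image is described by the vertices of a simple polygon in pixel coordinates and computes its area with the shoelace formula; the function names and the threshold value are hypothetical and are not taken from the specification.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(vertices: List[Point]) -> float:
    """Area of a simple polygon by the shoelace formula (vertex order may be either way)."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def satisfies_area_condition(vertices: List[Point], min_area: float) -> bool:
    """True when the additional image's area is equal to or greater than the threshold."""
    return polygon_area(vertices) >= min_area


# Example: a 50 x 50 pixel region checked against an assumed threshold of 2000 px^2.
deformed_additional_image = [(10.0, 0.0), (60.0, 0.0), (60.0, 50.0), (10.0, 50.0)]
show = satisfies_area_condition(deformed_additional_image, 2000.0)
```

A check of this kind keeps an additional image from being displayed when the view angle has squeezed it into a region too small to be legible.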

8. The image processing system according to any one of claims 2 to 7, wherein the display setting unit obtains a dwell time of the user's line of sight in the display area extracted by the extraction unit, determines whether the dwell time is equal to or greater than a predetermined threshold, and performs setting to display the additional image in the display area when the dwell time is equal to or greater than the predetermined threshold.
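
The dwell-time condition of claim 8 might be evaluated as in the sketch below. The claim does not say how the dwell time is measured, so the sketch makes its own assumptions: gaze samples arrive as (timestamp in seconds, gaze point) pairs, the display area is a simple polygon, and the dwell time is the accumulated duration of intervals whose two end samples both lie inside the area. All names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
GazeSample = Tuple[float, Point]  # (timestamp in seconds, gaze position)


def point_in_polygon(p: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: count how many polygon edges a ray to the right of p crosses."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def dwell_time(samples: List[GazeSample], area: List[Point]) -> float:
    """Accumulate the time between consecutive samples that both fall inside the area."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        if point_in_polygon(p0, area) and point_in_polygon(p1, area):
            total += t1 - t0
    return total


def satisfies_dwell_condition(samples: List[GazeSample], area: List[Point],
                              min_seconds: float) -> bool:
    """True when the dwell time is equal to or greater than the threshold."""
    return dwell_time(samples, area) >= min_seconds


# Example: the gaze leaves the area once and then returns.
display_area = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
gaze = [(0.0, (5.0, 5.0)), (0.5, (6.0, 5.0)), (1.0, (20.0, 5.0)),
        (1.5, (5.0, 5.0)), (2.5, (5.0, 6.0))]
total_dwell = dwell_time(gaze, display_area)
```

Whether the dwell time should be accumulated across separate visits or reset when the gaze leaves the area is a design choice the claim leaves open; the accumulated form is used here only for brevity.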

9. The image processing system according to any one of claims 2 to 8, wherein the display setting unit can select an additional image independently for each of a plurality of virtual space images of the same object that differ in viewpoint and decoration.

10. An image processing method in an image processing system, wherein:
a pattern recognition unit recognizes a predetermined pattern included in a virtual space image and, when the virtual space image is deformed in response to a change in the line-of-sight direction of a user, deforms the predetermined pattern based on the amount of deformation of the virtual space image;
an extraction unit extracts, from the virtual space image, a display area that has a shape corresponding to the predetermined pattern deformed by the pattern recognition unit and in which an additional image to be added to the virtual space image is displayed;
a conversion unit converts the shape of the additional image, in accordance with the display area, so as to correspond to the shape of the deformed predetermined pattern; and
a display setting unit causes the additional image converted by the conversion unit to be displayed in the display area extracted by the extraction unit.

11. An image processing method in an image processing system, wherein:
a pattern recognition unit recognizes a predetermined pattern included in a display image and, when the display image is deformed in response to a change in the line-of-sight direction of a user, deforms the predetermined pattern based on the amount of deformation of the display image;
an extraction unit extracts from the display image, based on the predetermined pattern deformed by the pattern recognition unit, a display area that has a shape corresponding to the deformed predetermined pattern and in which an additional image to be added to the display image is displayed; and
a display setting unit performs setting to display, in the display area extracted by the extraction unit, the additional image deformed so as to correspond to the shape of the deformed predetermined pattern.