JP2015056041A

JP2015056041A - Image generation system, image generation method, image generation program, line of sight prediction system, line of sight prediction method, and line of sight prediction program

Info

Publication number: JP2015056041A
Application number: JP2013189294A
Authority: JP
Inventors: 崇竹之内; Takashi Takenouchi; 文男唐澤; Fumio Karasawa; 智思芥川; Tomokoto Akutagawa
Original assignee: 3M Innovative Properties Co
Current assignee: 3M Innovative Properties Co
Priority date: 2013-09-12
Filing date: 2013-09-12
Publication date: 2015-03-23

Abstract

PROBLEM TO BE SOLVED: To easily obtain images used for predicting regions in a scene that tend to attract visual attention. SOLUTION: An image generation system in one embodiment of the present invention includes one or more processors. At least one of the processors (a) accepts an input image, (b) generates an evaluation target image by referring to a storage unit that stores a replacement target pattern and a replacement pattern in association with each other and replacing the replacement target pattern identified in the input image with the replacement pattern, and (c) outputs the evaluation target image so that a line-of-sight prediction system can execute a process that predicts regions in the evaluation target image that tend to attract visual attention.

Description

One aspect of the present invention relates to an image generation system, an image generation method, an image generation program, a line-of-sight prediction system, a line-of-sight prediction method, and a line-of-sight prediction program.

Techniques for predicting regions in a scene that tend to attract visual attention are already known. For example, Patent Document 1 below describes a computer system that includes a visual attention module and a robustness (saliency) evaluation module. The visual attention module receives an input scene, applies a visual attention model to it, and predicts the regions in the scene that tend to attract visual attention. The robustness (saliency) evaluation module interacts with the visual attention module to determine the degree to which at least one of the identified regions, or the scene itself, is robust (salient). Patent Documents 2 and 3 below describe systems with functions similar to those of that computer system.

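Patent Document 1: Japanese Translation of PCT Application Publication No. 2012-504827
Patent Document 2: Japanese Translation of PCT Application Publication No. 2012-504828
Patent Document 3: Japanese Translation of PCT Application Publication No. 2012-504830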

To evaluate visual attention, that is, to perform line-of-sight prediction, with the computer systems described in Patent Documents 1 to 3, an image of the scene to be evaluated must be prepared. Even when only one scene is evaluated, the line-of-sight prediction process is often run repeatedly while the pattern in a specific region of the scene is swapped. For example, to predict which signboard attracts a user's attention most effectively, many patterns are typically evaluated while the design (shape, pattern, color, and so on) is varied slightly.

Line-of-sight prediction therefore generally requires a large number of images of a scene, even when only that one scene is evaluated. It is not easy, however, for a user to prepare those images one by one. A mechanism is thus desired that makes it easy to obtain the images used for predicting regions in a scene that tend to attract visual attention.

An image generation system according to one aspect of the present invention includes one or more processors. At least one processor accepts an input image; generates an evaluation target image by referring to a storage unit that stores a replacement target pattern and a replacement pattern in association with each other and replacing the replacement target pattern identified in the input image with the replacement pattern; and outputs the evaluation target image so that a line-of-sight prediction system can execute a process that predicts regions in the evaluation target image that tend to attract visual attention.

In this aspect, a replacement pattern corresponding to the replacement target pattern is stored in advance, and that data is used to replace the replacement target pattern contained in the input image with the replacement pattern. The evaluation target image generated by this replacement is then output for the prediction of regions that tend to attract visual attention. Because this series of processes converts part of the input image automatically, a user who prepares just one input image of a scene can obtain from it evaluation target images that differ in part of their area. In other words, the images used for predicting regions in a scene that tend to attract visual attention can be obtained easily.

According to one aspect of the present invention, images used for predicting regions in a scene that tend to attract visual attention can be obtained easily.

FIG. 1 is a diagram showing an overview of the line-of-sight prediction system.
FIG. 2 is a flowchart showing the operation of the analysis system.
FIG. 3 is a diagram showing display examples of the evaluation results of the analysis system.
FIG. 4 is a diagram showing an example of pattern information.
FIG. 5 is a diagram showing the hardware configuration of the image generation system according to the embodiment.
FIG. 6 is a block diagram showing the functional configuration of the image generation system according to the embodiment.
FIG. 7 is a diagram showing examples of input images.
FIG. 8 is a diagram showing an example of an evaluation target image.
FIG. 9 is a flowchart showing the operation of the image generation system according to the embodiment.
FIG. 10 is a diagram showing the configuration of the image generation program according to the embodiment.

Embodiments of the present invention are described in detail below with reference to the accompanying drawings. In the description of the drawings, identical or equivalent elements are given the same reference numerals, and duplicate descriptions are omitted.

The functions and configuration of the image generation system according to the embodiment are described with reference to FIGS. 1 to 8. In this embodiment, the image generation system is part of the line-of-sight prediction system 1. The line-of-sight prediction system 1 is a computer system that predicts regions in a scene that tend to attract visual attention. As shown in FIG. 1, a user can access the line-of-sight prediction system 1 from a user terminal T via a network N and obtain an evaluation result for a desired scene.

The type of the user terminal T is not limited. It may be, for example, a stationary or portable personal computer, or a mobile terminal such as a high-function mobile phone (smartphone), a mobile phone, or a personal digital assistant (PDA). The specific configuration of the network N is also not limited; it may consist, for example, of at least one of the Internet, a dedicated line, and a LAN (Local Area Network). FIG. 1 shows three user terminals T, but the number of user terminals T in the line-of-sight prediction system 1 is not limited.

As shown in FIG. 1, the line-of-sight prediction system 1 includes an image generation system 10 and an analysis system 20. The image generation system 10 efficiently generates images depicting the scene to be evaluated (referred to in this specification as "evaluation target images"), and the analysis system 20 predicts where in an evaluation target image human gaze tends to gather. The distinguishing feature of this embodiment lies in the image generation system 10, but to make that system easier to understand, the operation of the analysis system 20 is described first.

As shown in FIG. 2, the analysis system 20 accepts an evaluation target image (step S11), performs line-of-sight prediction on that image (step S12), and finally outputs the evaluation result (step S13). The series of processes in steps S11 to S13 corresponds to an analysis step that predicts regions in the evaluation target image that tend to attract visual attention.

The evaluation target image input to the analysis system 20 in step S11 may be a still image or a moving image. To simplify the description, this embodiment assumes that the evaluation target image is a still image. The subject of the evaluation target image, that is, the scene, is not limited. For example, the evaluation target image may show an urban area or a store interior containing advertisements, signboards, and the like; it may show an advertisement, a signboard, a product display shelf, an in-store POP display, or a printed label or card itself; it may show a product or product package itself; or it may be a captured image of a web page. These "images" may be photographs or computer graphics. This embodiment is described using an example in which the evaluation target image is an image of an urban area containing advertisements, signboards, and the like.

In step S12, the analysis system 20 simulates the movement of a person's gaze over the evaluation target image. Various conventional techniques can be used for this process, so this specification omits a description of the specific method; for example, line-of-sight prediction may be performed using the methods described in Patent Documents 1 to 3 above.

In step S13, the analysis system 20 displays the evaluation result; the way the result is presented is not limited. FIG. 3 shows display examples. For example, the analysis system 20 may output, for the evaluation target image 50, a heat map 51 that uses color coding to show the distribution of how readily gaze gathers. Alternatively, the analysis system 20 may output a region map 52 that shows the distribution divided into several levels (for example, three). The region map 52 marks the regions where gaze gathers with frames and gives the degree of gaze concentration in each region as a numerical value, although FIG. 3 omits the values. Alternatively, the analysis system 20 may output a gaze tracking chart 53 that shows gaze movement as lines. The display of evaluation results is also described in Patent Documents 1 to 3. Whichever presentation is used, the user can learn from the evaluation result which design readily attracts gaze (has high saliency).
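The division of a gaze distribution into a few levels, as in the region map, can be sketched as follows. This is only an illustration: the function name `quantize_saliency`, the assumption that the distribution is a grid of values normalized to the range 0 to 1, and the use of equal-width bins are all hypothetical and are not taken from the specification.

```python
def quantize_saliency(saliency, levels=3):
    """Bin each value of a 2D grid of floats in [0, 1] into `levels` stages."""
    result = []
    for row in saliency:
        out_row = []
        for value in row:
            # Clamp to [0, 1], then map to 0..levels-1 (higher = more attention).
            clamped = min(max(value, 0.0), 1.0)
            out_row.append(min(int(clamped * levels), levels - 1))
        result.append(out_row)
    return result
```

With three levels, a value of 0.4 falls in the middle stage and a value of 1.0 in the highest stage.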

The output destination of the evaluation result is also not limited. For example, the analysis system 20 may transmit an evaluation result such as that shown in FIG. 3 to the user terminal T for display on that terminal, and instead of or in addition to this transmission it may store the result in any storage unit.

Using the analysis system 20, a user can predict, for example, which signboard or advertisement will attract attention. To do so, however, the user must prepare several design proposals for a signboard or advertisement in advance and have the analysis system 20 run a prediction for each proposal. For example, to find out what kind of signboard stands out in a particular shopping street, the user must input into the analysis system 20 multiple images of the shopping street that differ only in the signboard portion. This work is not easy for users to do themselves, but because the line-of-sight prediction system 1 includes the image generation system 10, the user can easily obtain evaluation target images for multiple patterns.

The image generation system 10 refers to a pattern database 30 in order to generate evaluation target images automatically or semi-automatically. Within the line-of-sight prediction system 1, the image generation system 10 and the pattern database 30 may be separate, in which case the image generation system 10 accesses the pattern database 30 via a network. Alternatively, a single computer may implement both the image generation system 10 and the pattern database 30.

The pattern database 30 is a device (storage unit) that stores multiple patterns prepared for each model. In this specification, a "model" is an object that is subject to replacement. Examples of such objects include signboards or advertisements installed in urban areas and banner advertisements or logos on web pages, but the objects that can be set as models are not limited to these.

FIG. 4 shows an example of the pattern information stored in the pattern database 30. A pattern information record is prepared for each model, and each model is associated with one marker, of the kind used in augmented reality (AR), and with multiple patterns. A pattern is an image, and the method of creating it is not particularly limited. For example, a pattern may be created with computer graphics, created from an image of a real scene, or created by scanning a hand-drawn illustration.

The example in FIG. 4 shows three patterns for model A and two patterns for model B. The three patterns for model A all place a star at the center but differ in their combination of foreground and background colors. The patterns for model B differ in the arrangement of the characters and in the background design. The number of patterns prepared for each model is not limited, and the number of patterns need not be the same across models.

As shown in FIG. 4, each pattern consists of a combination of an original pattern image (left) and a mask image (right) drawn with a black outer edge. The pattern image and the mask image are the same size. The mask image is used to make the unneeded portion of the pattern image transparent when the pattern image is superimposed on the input image. For example, a pattern of model B appears as an octagonal mark rather than a rectangle once it is superimposed on the input image. A marker, in contrast, is not split into two images as a pattern is; it is prepared as a single image with an outer frame.
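The role of the mask image can be sketched with nested lists standing in for images. The function name `composite`, the convention that a mask value of 0 means "transparent", and the `top`/`left` placement arguments are assumptions made for illustration only.

```python
def composite(background, pattern, mask, top, left):
    """Overlay `pattern` on a copy of `background`, skipping pixels whose mask value is 0."""
    out = [row[:] for row in background]
    for y, (p_row, m_row) in enumerate(zip(pattern, mask)):
        for x, (pixel, keep) in enumerate(zip(p_row, m_row)):
            yy, xx = top + y, left + x
            # Write only opaque pixels that fall inside the background.
            if keep and 0 <= yy < len(out) and 0 <= xx < len(out[0]):
                out[yy][xx] = pixel
    return out
```

Pixels masked out in this way keep the background value, which is how a rectangular pattern image can appear as an octagon once overlaid.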

The specific structure of the pattern database and the pattern information is not limited to the form shown in FIG. 4, and any normalization or redundancy may be applied. For example, if patterns are replaced without using markers, as described later, the marker can be omitted from the pattern information. FIG. 4 shows the markers and patterns as two-dimensional images, but they may be three-dimensional images.
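One possible in-memory layout for the pattern information is sketched below. The class names, field names, and file names are hypothetical; the specification leaves the concrete structure open, and only the counts (three patterns for model A, two for model B) follow the example of FIG. 4.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Pattern:
    name: str
    image_file: str  # original pattern image
    mask_file: str   # mask image with the same dimensions as image_file


@dataclass
class ModelRecord:
    model: str
    marker_file: Optional[str] = None  # None when replacement is done without markers
    patterns: List[Pattern] = field(default_factory=list)


# Illustrative records only; every file name here is made up.
PATTERN_DB = {
    "A": ModelRecord("A", "marker_a.png", [
        Pattern("A-1", "a1.png", "a1_mask.png"),
        Pattern("A-2", "a2.png", "a2_mask.png"),
        Pattern("A-3", "a3.png", "a3_mask.png"),
    ]),
    "B": ModelRecord("B", None, [
        Pattern("B-1", "b1.png", "b1_mask.png"),
        Pattern("B-2", "b2.png", "b2_mask.png"),
    ]),
}
```

Making the marker optional reflects the statement that it can be omitted when markerless replacement is used.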

Next, the image generation system 10 is described in detail. Its hardware configuration is shown in FIG. 5. The image generation system 10 includes a CPU 101 that executes an operating system, application programs, and the like; a main storage unit 102 composed of ROM and RAM; an auxiliary storage unit 103 composed of a hard disk, flash memory, or the like; a communication control unit 104 composed of a network card or a wireless communication module; an input device 105 such as a keyboard or mouse; and an output device 106 such as a display.

Each functional component of the image generation system 10 described below is realized by loading predetermined software onto the CPU 101 or the main storage unit 102, operating the communication control unit 104, the input device 105, the output device 106, and so on under the control of the CPU 101, and reading and writing data in the main storage unit 102 or the auxiliary storage unit 103. The data and databases needed for processing are stored in the main storage unit 102 or the auxiliary storage unit 103.

The image generation system 10 may consist of a single computer or of multiple computers.

As shown in FIG. 6, the image generation system 10 includes a reception unit 11, a generation unit 12, and an output unit 13 as functional components. These functions are realized mainly through the operation of at least one processor (CPU 101).

The reception unit 11 is a functional element that accepts an input image designated by the user. The reception unit 11 outputs the acquired input image to the generation unit 12. The reception unit 11 may receive an input image sent from the user terminal T, or may read an image from a predetermined storage unit (for example, a memory or a database) based on an instruction from the user terminal T. FIG. 7 shows examples of input images. Input images 61 and 62 are both photographs of an urban area, but in input image 61 the signboards 61a and 61b show their actual designs, whereas in input image 62 markers are displayed in the positions of signboards 62a and 62b. Because it is usually difficult to attach a marker to a signboard in the real world, using input image 61 is the more realistic option, but the present invention does not exclude an input image 62 that contains markers.

The generation unit 12 is a functional element that generates an evaluation target image by replacing part of the input image. Replacement may take place in multiple regions of a single input image, but to simplify the description, replacement in a single region is described below.

First, the generation unit 12 identifies the replacement target pattern. When a markerless input image is supplied, the generation unit 12 may identify the replacement target pattern according to a user operation. In this case, the user designates the region whose pattern is to be replaced by a touch operation, a mouse operation, or the like; the user terminal T transmits data indicating that region to the image generation system 10; and the generation unit 12 holds the region indicated by the data. Alternatively, the generation unit 12 may identify the replacement target pattern automatically, without relying on such a user operation. In this case, the generation unit 12 identifies the replacement target pattern by comparing arbitrary regions in the input image with each pattern in the pattern database 30 (pattern matching). Pattern matching itself is well known; specific methods include feature matching (feature-point-based matching) and template matching (correlation-based matching), and any method may be used.
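As a rough illustration of template matching, the sketch below slides a template over a grayscale image held as nested lists and returns the best-scoring position. It scores windows by the sum of squared differences, a simple stand-in for the correlation-based measures mentioned in the text; the function name `best_match` and the data layout are assumptions, and a practical system would also need to handle scale and perspective.

```python
def best_match(image, template):
    """Return (row, col, score) of the window with the smallest sum of squared differences."""
    th, tw = len(template), len(template[0])
    best = None
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            ssd = sum(
                (image[r + i][c + j] - template[i][j]) ** 2
                for i in range(th)
                for j in range(tw)
            )
            # Keep the first window with the lowest score seen so far.
            if best is None or ssd < best[2]:
                best = (r, c, ssd)
    return best
```

A score of 0 means the window matches the template exactly; the function returns `None` when the template is larger than the image.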

When an input image containing a marker is supplied, the generation unit 12 identifies the marker in the input image as the replacement target pattern through pattern matching against the markers in the pattern database 30, without relying on a user operation. In this case too, the specific pattern matching method may be chosen freely.

Next, the generation unit 12 acquires a replacement pattern to swap in for the replacement target pattern. Here, a replacement pattern is a pattern that belongs to the same model as the replacement target pattern but differs from that replacement target pattern. The generation unit 12 may provide the user terminal T with a user interface for selecting, from the pattern database 30, one or more replacement patterns corresponding to the replacement target pattern, and may acquire the one or more replacement patterns selected through that interface. Alternatively, the generation unit 12 may refer to the pattern database 30 and read out the replacement patterns corresponding to the replacement target pattern without relying on a user operation.
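The definition of a replacement pattern (same model, different pattern) can be written as a short lookup. The function name `select_replacements` and the representation of the database as a dictionary from model name to a list of pattern identifiers are hypothetical simplifications.

```python
def select_replacements(pattern_db, model_id, target_pattern_id):
    """Return the patterns that share the model but differ from the replacement target."""
    return [
        pattern_id
        for pattern_id in pattern_db.get(model_id, [])
        if pattern_id != target_pattern_id
    ]
```

For a model with patterns A-1, A-2, and A-3 and a replacement target of A-2, the candidates are A-1 and A-3.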

Next, to match the size and orientation of the acquired replacement pattern to those of the replacement target pattern, the generation unit 12 deforms the replacement pattern by a projective transformation and superimposes the deformed replacement pattern on the input image.

Let (x, y) be the coordinates before the transformation and (x', y') the coordinates after it. The projective transformation is then expressed by the following equations.

$$x' = \frac{a_1 x + b_1 y + c_1}{a_0 x + b_0 y + c_0}$$

$$y' = \frac{a_2 x + b_2 y + c_2}{a_0 x + b_0 y + c_0}$$
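The two equations can be evaluated directly for a single point, as in the sketch below. The function name `project` and the order in which the nine coefficients are packed into `coeffs` are choices made here for illustration.

```python
def project(coeffs, x, y):
    """Apply the projective transformation to (x, y).

    coeffs = (a1, b1, c1, a2, b2, c2, a0, b0, c0), following the equations above.
    """
    a1, b1, c1, a2, b2, c2, a0, b0, c0 = coeffs
    w = a0 * x + b0 * y + c0  # common denominator
    if w == 0:
        raise ZeroDivisionError("the point is mapped to infinity")
    return ((a1 * x + b1 * y + c1) / w, (a2 * x + b2 * y + c2) / w)
```

When a_0 = b_0 = 0 and c_0 = 1 the denominator is constant and the mapping reduces to an affine transformation; nonzero a_0 or b_0 introduces the perspective effect.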

Because the nine coefficients are determined only up to a common scale factor, these equations contain eight independent unknowns, so the unknowns can be determined once the coordinates of four or more points are available. The generation unit 12 therefore identifies the coordinates of four vertices in each of the replacement target pattern and the replacement pattern. The generation unit 12 may take points designated by the user as the vertices, or may obtain the vertices by analyzing each pattern. Alternatively, the coordinates of the four vertices of each pattern may be included in the pattern information in advance, and the generation unit 12 may read them from the pattern database 30. By substituting the vertices (x_n, y_n) of the replacement target pattern and the vertices (x'_n, y'_n) of the replacement pattern (n = 1, 2, 3, 4 for both) into the projective transformation equations above, the generation unit 12 obtains the projective transformation matrix P_1 (the second projective transformation matrix) given by the following equation. Here s is an arbitrary nonzero real number in the homogeneous coordinate representation.
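The matrix equation referred to above is not reproduced in this text. As a sketch of how eight unknowns follow from four point pairs, the code below fixes c_0 = 1 (an assumption that removes the scale ambiguity and fails only when the true c_0 is zero) and solves the resulting 8 x 8 linear system by Gaussian elimination. The function names and the 3 x 3 layout of the returned matrix are illustrative choices, not the patent's notation.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def homography_from_points(src, dst):
    """Return [[a1, b1, c1], [a2, b2, c2], [a0, b0, 1]] mapping four src points to dst."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u * (a0*x + b0*y + 1) = a1*x + b1*y + c1, and likewise for v.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    a1, b1, c1, a2, b2, c2, a0, b0 = solve_linear(rows, rhs)
    return [[a1, b1, c1], [a2, b2, c2], [a0, b0, 1.0]]
```

Each point pair contributes two linear equations, which is why four pairs are enough; three collinear points make the system singular and raise the `ValueError`.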

Next, the generation unit 12 finds the coordinates of a large number of feature points in each of the replacement target pattern and the replacement pattern, and calculates a feature descriptor for each feature point. Feature points may be detected and descriptors computed by any method; for example, well-known algorithms such as ORB, SIFT, and SURF can be used. Because the replacement target pattern in the input image may be unclear, the generation unit 12 may apply sharpening and/or smoothing to the replacement target pattern before calculating the descriptors. Sharpening is a filtering process that emphasizes blurred contours in an image and is also called unsharp masking; it is a well-known technique. Smoothing is a process that smooths the luminance values in an image or removes noise from an image, and it is also well known. Smoothing is performed with a smoothing filter such as a Gaussian filter.
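Unsharp masking adds back the difference between an image and a blurred copy of it. The sketch below operates on a grayscale image stored as nested lists; for brevity it blurs with a 3 x 3 mean filter rather than a Gaussian, and the function names and the `amount` parameter are assumptions.

```python
def box_blur(img):
    """3x3 mean filter; border pixels average over the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [
                img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = sum(vals) / len(vals)
    return out


def unsharp_mask(img, amount=1.0):
    """Sharpen: original + amount * (original - blurred), clamped to 0..255."""
    blurred = box_blur(img)
    return [
        [min(255.0, max(0.0, p + amount * (p - b))) for p, b in zip(row, b_row)]
        for row, b_row in zip(img, blurred)
    ]
```

Flat regions are left unchanged because the image and its blurred copy agree there, while values on either side of an edge are pushed apart.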

Next, the generation unit 12 compares the feature descriptors of the two patterns and, using for example RANSAC, which is one robust estimation method, calculates the projective transformation matrix P_2 (the first projective transformation matrix) given by the following equation. Here s is an arbitrary nonzero real number in the homogeneous coordinate representation.
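The structure of a RANSAC loop (sample a minimal set, fit a model, count inliers, keep the best, refit) can be sketched as follows. To keep the example short it estimates a pure translation from matched point pairs, so one match is a minimal sample; this is a deliberate simplification, since the text estimates a full projective matrix, for which each sample would be four matches. The function name, parameters, and default values are hypothetical.

```python
import random


def ransac_translation(matches, threshold=2.0, iterations=100, seed=0):
    """Estimate (dx, dy) from matches [((x, y), (u, v)), ...] despite outliers."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        (x, y), (u, v) = rng.choice(matches)  # minimal sample: one match
        dx, dy = u - x, v - y
        inliers = [
            m for m in matches
            if abs(m[1][0] - m[0][0] - dx) <= threshold
            and abs(m[1][1] - m[0][1] - dy) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit the model on every inlier of the best hypothesis.
    n = len(best_inliers)
    dx = sum(u - x for (x, _), (u, _) in best_inliers) / n
    dy = sum(v - y for (_, y), (_, v) in best_inliers) / n
    return (dx, dy), best_inliers
```

Mismatched feature pairs end up outside the inlier set and so do not distort the final estimate, which is the reason a robust method is used at this step.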

Next, the generation unit 12 converts the coordinates of the replacement pattern from (x, y) to (x', y') using the following equation, which includes the matrices P_1 and P_2 that represent the distortion of the replacement target pattern. The coordinates (x, y) are those of the original replacement pattern (the replacement pattern as defined in the pattern information), and the coordinates (x', y') are those of the replacement pattern after deformation. The generation unit 12 superimposes the deformed replacement pattern on the input image.

As this equation indicates, the generation unit 12 does not have to use the matrix P_1. The matrix P_1 is used to further improve the accuracy of the deformation, and its effect is pronounced, for example, when the orientation of the replacement target pattern differs from that of the replacement pattern by more than a certain amount. Because the matrix P_1 thus plays an auxiliary role, the generation unit 12 may calculate and use only the matrix P_2 in the projective transformation.

To make the superimposed replacement pattern look less out of place, the generation unit 12 matches the color and degree of blur of the replacement pattern to the surrounding region. Well-known techniques can also be used to correct the color and the degree of blur. The generation unit 12 may instead correct the color of the replacement pattern first and then superimpose the pattern on the input image.

The color correction is expressed by the following formula using a conversion matrix M or M′ corresponding to the color space. M is the conversion matrix used with the RGB color system, and M′ is the conversion matrix used with the L*C*h color system. R, G, and B are the original three primary colors (red, green, and blue) of the replacement pattern, and R′, G′, and B′ are the three primary colors after adjustment. L*, C*, and h are the original lightness, chroma, and hue of the replacement pattern, and L*′, C*′, and h′ are the lightness, chroma, and hue after adjustment. Any color space may be used for the color correction.
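The formula is not reproduced in this text. One plausible reading, assuming that M and M′ are 3 × 3 matrices acting linearly on the color components, is:

```latex
\begin{pmatrix} R' \\ G' \\ B' \end{pmatrix}
= M \begin{pmatrix} R \\ G \\ B \end{pmatrix},
\qquad
\begin{pmatrix} L^{*\prime} \\ C^{*\prime} \\ h' \end{pmatrix}
= M' \begin{pmatrix} L^{*} \\ C^{*} \\ h \end{pmatrix}
```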

The generation unit 12 may obtain the conversion matrix based on RGB values or L*C*h values entered by the user through a user interface element such as a slider. Alternatively, the generation unit 12 may obtain the conversion matrix using a well-known technique without relying on user operation.

The degree of blur can be corrected by, for example, smoothing with a smoothing filter such as a Gaussian filter. The matrix H used for the smoothing is determined by the size of the smoothing filter (for example, 3 × 3 or 5 × 5). The generation unit 12 may accept the filter size as a value entered by the user through a user interface element such as a slider, or may determine it using a well-known technique without relying on user operation.

The generation unit 12 corrects the color of the replacement pattern superimposed on the input image by applying the conversion matrix M or M′ to it. The generation unit 12 then blurs the replacement pattern by applying the matrix H to it.
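As a rough illustration of the color-correction step, the sketch below applies a 3 × 3 matrix M to every pixel of an RGB patch. It assumes 8-bit-range RGB values held in a NumPy array; the function name, the clipping to 0 to 255, and the choice of a diagonal gain matrix built from hypothetical slider values are assumptions for the example, not details taken from this description.

```python
import numpy as np

def apply_color_matrix(rgb: np.ndarray, M: np.ndarray) -> np.ndarray:
    """Apply a 3x3 conversion matrix M to every pixel of an H x W x 3 RGB image."""
    corrected = rgb.astype(float) @ M.T  # each pixel vector p becomes M @ p
    return np.clip(corrected, 0.0, 255.0)

# Hypothetical slider values interpreted as per-channel gains (an assumption).
M = np.diag([0.9, 1.0, 1.1])
patch = np.full((2, 2, 3), 100.0)       # a uniform gray replacement pattern
corrected_patch = apply_color_matrix(patch, M)
```

A full matrix rather than a diagonal one would also mix channels; the blur matrix H would then be applied to `corrected_patch` by a smoothing filter of the chosen size.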

Through the above processing, the replacement target pattern in one region of the input image is replaced with the replacement pattern. If one input image contains a plurality of replacement target patterns, or if the user designates a plurality of replacement target patterns, the generation unit 12 performs the above replacement for each replacement target pattern. In that case, the matrices P_1, P_2, M (or M′), and H are generated for each replacement target pattern. The generation unit 12 generates the first evaluation target image by processing all the replacement target patterns.

The generation unit 12 may generate a plurality of evaluation target images from one input image. For example, if model A shown in FIG. 4 is contained in one region of the input image, the generation unit 12 can generate three evaluation target images by applying patterns 1 to 3 of model A individually. In this case, the deformation, color correction, and blur correction of the replacement pattern in that region are common to the three patterns. The generation unit 12 therefore retains the matrices P_1, P_2, M (or M′), and H calculated when generating the first evaluation target image, and reuses them when generating the second and subsequent evaluation target images. If the input image contains both models A and B shown in FIG. 4, the generation unit 12 applies patterns 1 to 3 individually for model A and patterns 1 and 2 individually for model B, and can thus generate a total of six (= 3 × 2) evaluation target images. The generation unit 12 outputs the generated evaluation target image or images to the output unit 13.
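The counting and the matrix reuse described above can be sketched as follows. The dictionary standing in for the pattern information of FIG. 4, the function names, and the cache layout are hypothetical; only the 3 × 2 = 6 combination count and the idea of computing the per-region matrices once come from the description.

```python
from itertools import product

# Hypothetical pattern information modeled on FIG. 4:
# each model maps to the patterns that may replace one another.
PATTERNS_BY_MODEL = {"A": [1, 2, 3], "B": [1, 2]}

def enumerate_assignments(models_in_image):
    """Return one {model: pattern} assignment per evaluation target image."""
    choices = [PATTERNS_BY_MODEL[m] for m in models_in_image]
    return [dict(zip(models_in_image, combo)) for combo in product(*choices)]

_matrix_cache = {}
compute_calls = []  # records which regions actually triggered a computation

def matrices_for_region(region_id, compute):
    """Compute the per-region matrices once and reuse them afterwards."""
    if region_id not in _matrix_cache:
        compute_calls.append(region_id)
        _matrix_cache[region_id] = compute(region_id)
    return _matrix_cache[region_id]

assignments = enumerate_assignments(["A", "B"])
for _ in assignments:
    for region in ("A", "B"):
        # Placeholder values stand in for P_1, P_2, M (or M'), and H.
        matrices_for_region(region, lambda r: {"P1": None, "P2": None, "M": None, "H": None})

print(len(assignments))    # 6 (= 3 x 2)
print(len(compute_calls))  # 2: one computation per region, reused for every image
```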

The output unit 13 is a functional element that outputs the evaluation target image for processing in the analysis system 20 (that is, for the process of predicting a region in the evaluation target image that tends to attract visual attention). FIG. 8 is an example of the output and shows three evaluation target images to which the respective patterns of model A shown in FIG. 4 have been applied. If an input image 61 in which pattern 1 is affixed to signboards 61a and 61b is received, an evaluation target image 72 in which the pattern is replaced with pattern 2 and an evaluation target image 73 in which it is replaced with pattern 3 are output. If an input image 62 in which markers are affixed to signboards 62a and 62b is received, an evaluation target image 71 in which the markers are replaced with pattern 1 is output together with the evaluation target images 72 and 73 described above. Although not depicted in FIG. 8, color correction and smoothing are applied to the signboards 71a, 71b, 72a, 72b, 73a, and 73b as necessary.

The output unit 13 may transmit the evaluation target image to the user terminal T so that the user can check it. In this case, after checking the received evaluation target image, the user operates the user terminal T to instruct the analysis system 20 to perform the line-of-sight prediction process. Alternatively, the output unit 13 may output the evaluation target image to the analysis system 20; in this case, the user checks the evaluation result sent from the analysis system 20, either alone or together with the evaluation target image.

Next, the operation of the image generation system 10 and the image generation method according to the present embodiment will be described with reference to FIG. 9.

When the reception unit 11 receives an input image (step S21, reception step), the generation unit 12 generates an evaluation target image. Specifically, the generation unit 12 specifies a replacement target pattern (step S22) and acquires a replacement pattern (step S23). These two kinds of patterns may be designated by the user or may be determined automatically without relying on user operation.

Next, the generation unit 12 deforms the replacement pattern by a projective transformation based on the four vertices and a projective transformation based on the feature points (step S24). In step S24, the generation unit 12 may perform only the projective transformation based on the feature points. The generation unit 12 then superimposes the deformed replacement pattern on the input image (step S25) and performs color correction and smoothing on the replacement pattern (step S26). This completes the fitting of one replacement pattern.

If there is another replacement target pattern to be processed, the generation unit 12 repeats steps S22 to S26 (see step S27), which completes one evaluation target image (step S28). If there is another evaluation target image to be generated, the generation unit 12 repeats steps S22 to S28, which correspond to the generation step (see step S29). Finally, the output unit 13 outputs the one or more generated evaluation target images (step S30, output step). The generated evaluation target images are input to the analysis system 20 either by user operation or automatically, and the analysis system 20 performs the line-of-sight prediction process on them. The line-of-sight prediction method according to the present embodiment therefore includes the processes shown in FIGS. 2 and 9.
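The nesting of the two loops in steps S22 to S29 can be sketched as below. This is a toy model only: images and patterns are represented as strings, and the three callables passed in are hypothetical stand-ins for the deformation, superimposition, and correction steps.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]  # (replacement target pattern, replacement pattern)

def generate_evaluation_images(
    input_image: str,
    plans: Sequence[Sequence[Pair]],
    deform: Callable[[str, str], str],
    superimpose: Callable[[str, str, str], str],
    correct: Callable[[str], str],
) -> List[str]:
    """Return one evaluation target image per plan, following steps S22 to S29."""
    outputs = []
    for plan in plans:                                  # S29: next image to generate
        image = input_image
        for target, replacement in plan:                # S22/S23, repeated per S27
            warped = deform(replacement, target)        # S24: projective transformation
            image = superimpose(image, target, warped)  # S25: superimpose on the image
            image = correct(image)                      # S26: color correction, smoothing
        outputs.append(image)                           # S28: one image completed
    return outputs                                      # handed to the output step S30

# Toy stand-ins: patterns are substrings and superimposing is a text replacement.
images = generate_evaluation_images(
    "wall-p1-door",
    plans=[[("p1", "p2")], [("p1", "p3")]],
    deform=lambda replacement, target: replacement,
    superimpose=lambda image, target, warped: image.replace(target, warped),
    correct=lambda image: image,
)
```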

Next, an image generation program P1 for realizing the image generation system 10 will be described with reference to FIG. 10.

The image generation program P1 includes a main module P10, a reception module P11, a generation module P12, and an output module P13.

The main module P10 is the part that performs overall control of the image generation function. The functions realized by executing the reception module P11, the generation module P12, and the output module P13 are the same as those of the reception unit 11, the generation unit 12, and the output unit 13 described above, respectively.

The image generation program P1 may be provided after being fixedly recorded on a tangible recording medium such as a CD-ROM, a DVD-ROM, or a semiconductor memory. The image generation program P1 may also be provided over a communication network as a data signal superimposed on a carrier wave. Further, the image generation program P1 may be provided as a line-of-sight prediction program together with an analysis program for realizing the analysis system 20.

As described above, an image generation system according to one aspect of the present invention includes one or more processors, and at least one of the processors accepts an input image; generates an evaluation target image by referring to a storage unit that stores a replacement target pattern and a replacement pattern in association with each other and replacing the replacement target pattern specified in the input image with the replacement pattern; and outputs the evaluation target image so that a line-of-sight prediction system can execute a process of predicting a region in the evaluation target image that tends to attract visual attention.

An image generation method according to one aspect of the present invention includes: a reception step in which at least one processor accepts an input image; a generation step in which the processor generates an evaluation target image by referring to a storage unit that stores a replacement target pattern and a replacement pattern in association with each other and replacing the replacement target pattern specified in the input image with the replacement pattern; and an output step in which the processor outputs the evaluation target image so that a line-of-sight prediction system can execute a process of predicting a region in the evaluation target image that tends to attract visual attention.

An image generation program according to one aspect of the present invention causes a computer to execute: a reception step of accepting an input image; a generation step of generating an evaluation target image by referring to a storage unit that stores a replacement target pattern and a replacement pattern in association with each other and replacing the replacement target pattern specified in the input image with the replacement pattern; and an output step of outputting the evaluation target image so that a line-of-sight prediction system can execute a process of predicting a region in the evaluation target image that tends to attract visual attention.

In another aspect of the present invention, at least one processor may calculate a projective transformation matrix indicating the distortion of the replacement target pattern in the input image, deform the replacement pattern by applying the projective transformation matrix to it, and replace the specified replacement target pattern with the deformed replacement pattern. Deforming the replacement pattern to match the replacement target pattern in the input image before applying it to the input image in this way makes it possible to generate an evaluation target image that looks natural.

In another aspect of the present invention, at least one processor may calculate a first projective transformation matrix from the coordinates of the feature points of the replacement target pattern in the input image and the coordinates of the feature points of the replacement pattern, and deform the replacement pattern by applying the first projective transformation matrix to it. Obtaining the projective transformation matrix from the coordinates of the feature points allows the replacement pattern to be deformed accurately.

In another aspect of the present invention, at least one processor may calculate a second projective transformation matrix from the coordinates of four vertices designated in the replacement target pattern in the input image and the coordinates of four vertices designated in the replacement pattern, and deform the replacement pattern by applying the second projective transformation matrix to it and then further applying the first projective transformation matrix. Roughly deforming the replacement pattern first with the projective transformation matrix obtained from the vertex coordinates, and then deforming it further with the projective transformation matrix obtained from the feature point coordinates, allows the replacement pattern to be deformed with higher accuracy.
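The two-stage deformation can be illustrated with a small NumPy sketch. It solves the standard eight-equation linear system for a homography from four point correspondences and then composes two such matrices. As a simplifying assumption, both stages here use the four-point solver; in the description, the feature-point stage would instead be estimated from many matches with a robust method such as RANSAC. All function names and sample coordinates are hypothetical.

```python
import numpy as np

def homography_from_4_points(src, dst) -> np.ndarray:
    """Solve for the 3x3 matrix (with h33 fixed to 1) mapping four src points to dst."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), and likewise for v.
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H: np.ndarray, points) -> np.ndarray:
    """Map 2-D points through H using homogeneous coordinates."""
    pts = np.hstack([np.asarray(points, dtype=float), np.ones((len(points), 1))])
    mapped = pts @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

src = [(0, 0), (1, 0), (1, 1), (0, 1)]          # corners of the original pattern
mid = [(0, 0), (2, 0), (2, 1), (0, 1)]          # rough placement (four-vertex stage)
dst = [(10, 10), (30, 12), (28, 25), (9, 22)]   # refined placement (second stage)

P1 = homography_from_4_points(src, mid)  # stands in for the four-vertex matrix
P2 = homography_from_4_points(mid, dst)  # stands in for the feature-point matrix
combined = P2 @ P1                       # apply P1 first, then P2
```

Composing the matrices once and warping with `combined` is equivalent to warping twice, which avoids resampling the pattern image a second time.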

In another aspect of the present invention, at least one processor may identify the replacement target pattern present in the input image by comparing the replacement target pattern stored in the storage unit with regions in the input image. In this case, the user is spared the effort of identifying the replacement target pattern personally, which further reduces the user's workload.
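One simple way to compare a stored pattern with regions of an image is an exhaustive sliding-window search, sketched below. This description does not state which comparison method is used, so the sum-of-squared-differences score, the function name, and the grayscale float arrays are assumptions chosen for illustration; a feature-based comparison would cope better with perspective distortion.

```python
import numpy as np

def find_pattern(image: np.ndarray, pattern: np.ndarray):
    """Return (row, col, score) of the window with the smallest sum of squared differences."""
    image = image.astype(float)      # avoid integer overflow when subtracting
    pattern = pattern.astype(float)
    ih, iw = image.shape
    ph, pw = pattern.shape
    best = (0, 0, float("inf"))
    for r in range(ih - ph + 1):
        for c in range(iw - pw + 1):
            window = image[r:r + ph, c:c + pw]
            score = float(np.sum((window - pattern) ** 2))
            if score < best[2]:
                best = (r, c, score)
    return best

# Toy example: embed a 3 x 3 pattern in an otherwise empty 10 x 10 image.
stored_pattern = np.arange(1, 10).reshape(3, 3)
scene = np.zeros((10, 10))
scene[3:6, 5:8] = stored_pattern
location = find_pattern(scene, stored_pattern)
```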

A line-of-sight prediction system according to one aspect of the present invention includes the image generation system described above and an analysis system that predicts a region in the evaluation target image that tends to attract visual attention.

A line-of-sight prediction method according to one aspect of the present invention includes an analysis step of predicting a region that tends to attract visual attention in the evaluation target image output by the image generation method described above.

A line-of-sight prediction program according to one aspect of the present invention includes the image generation program described above and an analysis program for causing a computer to execute an analysis step of predicting a region in the evaluation target image that tends to attract visual attention.

The present invention has been described in detail above based on its embodiment. However, the present invention is not limited to the above embodiment and can be modified in various ways without departing from its gist.

As described above, the present invention is also applicable to moving images. In this case, the image generation system 10 may generate an evaluation target frame for the first frame in the same manner as in the above embodiment (processing for a still image), and generate the evaluation target frames for subsequent frames by moving the replacement pattern of the preceding frame using optical flow (moving object tracking). Alternatively, the image generation system 10 may generate an evaluation target frame for every frame in the same manner as in the above embodiment (processing for a still image).

In the above embodiment, the generation unit 12 performs color correction and smoothing, but one or both of these processes may be omitted. The deformation of the replacement pattern by projective transformation may also be omitted.

The present invention is also applicable to a single computer. For example, the functions of the entire line-of-sight prediction system may be implemented in a user terminal. Alternatively, the functions of the image generation system and the analysis system may be implemented in a user terminal while the pattern database is placed on a network. In this case, the user terminal creates the evaluation target images and performs the line-of-sight prediction while accessing the pattern database as needed. As a further alternative, the user terminal may have the functions of the image generation system 10 and the pattern database 30, and may access the analysis system 20 via the network N.

Reference signs: 1: line-of-sight prediction system; 10: image generation system; 11: reception unit; 12: generation unit; 13: output unit; 20: analysis system; 30: pattern database (storage unit); 101: CPU (processor); P1: image generation program; P10: main module; P11: reception module; P12: generation module; P13: output module; T: user terminal.

Claims

An image generation system comprising one or more processors,
wherein at least one of the processors:
accepts an input image;
generates an evaluation target image by referring to a storage unit that stores a replacement target pattern and a replacement pattern in association with each other and replacing the replacement target pattern specified in the input image with the replacement pattern; and
outputs the evaluation target image so that a line-of-sight prediction system can execute a process of predicting a region in the evaluation target image that tends to attract visual attention.

The image generation system according to claim 1, wherein the at least one processor:
calculates a projective transformation matrix indicating distortion of the replacement target pattern in the input image;
deforms the replacement pattern by applying the projective transformation matrix to the replacement pattern; and
replaces the specified replacement target pattern with the deformed replacement pattern.

The image generation system according to claim 2, wherein the at least one processor:
calculates a first projective transformation matrix from the coordinates of feature points of the replacement target pattern in the input image and the coordinates of feature points of the replacement pattern; and
deforms the replacement pattern by applying the first projective transformation matrix to the replacement pattern.

The image generation system according to claim 3, wherein the at least one processor:
calculates a second projective transformation matrix from the coordinates of four vertices designated in the replacement target pattern in the input image and the coordinates of four vertices designated in the replacement pattern; and
deforms the replacement pattern by applying the second projective transformation matrix to the replacement pattern and then further applying the first projective transformation matrix.

The image generation system according to any one of claims 1 to 4, wherein the at least one processor identifies the replacement target pattern present in the input image by comparing the replacement target pattern stored in the storage unit with regions in the input image.

An image generation method comprising:
a reception step in which at least one processor accepts an input image;
a generation step in which the processor generates an evaluation target image by referring to a storage unit that stores a replacement target pattern and a replacement pattern in association with each other and replacing the replacement target pattern specified in the input image with the replacement pattern; and
an output step in which the processor outputs the evaluation target image so that a line-of-sight prediction system can execute a process of predicting a region in the evaluation target image that tends to attract visual attention.

An image generation program for causing a computer to execute:
a reception step of accepting an input image;
a generation step of generating an evaluation target image by referring to a storage unit that stores a replacement target pattern and a replacement pattern in association with each other and replacing the replacement target pattern specified in the input image with the replacement pattern; and
an output step of outputting the evaluation target image so that a line-of-sight prediction system can execute a process of predicting a region in the evaluation target image that tends to attract visual attention.

A line-of-sight prediction system comprising:
the image generation system according to any one of claims 1 to 5; and
an analysis system that predicts a region in the evaluation target image that tends to attract visual attention.

A line-of-sight prediction method including an analysis step of predicting a region that tends to attract visual attention in the evaluation target image output by the image generation method according to claim 6.

A line-of-sight prediction program comprising:
the image generation program according to claim 7; and
an analysis program for causing a computer to execute an analysis step of predicting a region in the evaluation target image that tends to attract visual attention.