JP2018136666A

JP2018136666A - Visual line conversion apparatus and visual line conversion method

Info

Publication number: JP2018136666A
Application number: JP2017029784A
Authority: JP
Inventors: 利浩北島; Toshihiro Kitajima; 延偉陳; Yen Wei Chen; 昌孝瀬尾; Masataka Seo
Original assignee: Ritsumeikan Trust; Samsung R&D Institute Japan Co Ltd
Current assignee: Ritsumeikan Trust; Samsung R&D Institute Japan Co Ltd
Priority date: 2017-02-21
Filing date: 2017-02-21
Publication date: 2018-08-30
Anticipated expiration: 2037-02-21
Also published as: JP6952298B2

Abstract

PROBLEM TO BE SOLVED: To obtain a more natural image, particularly a moving image, when the visual line of a target person in the image is changed.
SOLUTION: A visual line conversion apparatus includes: a camera that acquires an image of the face of a target person as a processing target image; a reference image memory device that stores, as reference images, a plurality of images of the face of the target person whose visual line is directed toward an imaging apparatus; a shape model generator that generates a shape model of the eye shape of the target person based on the plurality of reference images; a feature point extractor that extracts eye feature points of the target person in the processing target image; a feature point position corrector that uses the shape model to correct the positions of the feature points; and an image corrector that corrects the processing target image by transferring the corresponding region of the reference image to the region defined by the feature points whose positions have been corrected by the feature point position corrector, so that the visual line of the target person appears to face the camera.
SELECTED DRAWING: Figure 1

Description

The present disclosure relates to a line-of-sight conversion technique for changing the direction of a subject's line of sight in an image.

In a video conversation system, the position of the camera usually differs from the position of the display screen. As a result, the user cannot make eye contact with a conversation partner who is looking at the screen, and likewise the partner cannot make eye contact with the user. For more natural conversation, a video conversation system that allows eye contact is desired.

Patent Document 1 describes adding a half mirror to obtain a front-facing face image. Patent Document 2 describes generating a front-facing face image by installing one camera on each of the left and right sides of the monitor screen. Patent Document 3 describes correcting an eye image so that the pupil is positioned at the center of the eye image.

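Patent Document 1: Japanese Unexamined Patent Application Publication No. H11-177949
Patent Document 2: Japanese Unexamined Patent Application Publication No. H08-251562
Patent Document 3: Japanese Unexamined Patent Application Publication No. 2015-149016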

However, adding hardware such as a half mirror or an extra camera makes the system larger. In addition, when an image is corrected as in Patent Document 3, simply processing each frame independently can place the eyes at a different position in each frame, producing an unnatural moving image.

An object of the present invention is to obtain a more natural moving image, without enlarging the system, when generating an image, particularly a moving image, in which the subject's line of sight has been changed.

A line-of-sight conversion device according to the present disclosure includes: a camera that acquires an image of a subject's face as a processing target image; a reference image storage device that stores, as reference images, a plurality of images of the subject's face with the line of sight directed toward an imaging device; a shape model generator that generates a shape model of the subject's eye shape based on the plurality of reference images; a feature point extractor that extracts feature points of the subject's eyes in the processing target image; a feature point position corrector that corrects the positions of the feature points using the shape model; and an image corrector that corrects the processing target image by transferring the corresponding region of the reference image to the region defined by the feature points whose positions have been corrected by the feature point position corrector, so that the subject's line of sight appears to be directed toward the camera.

A line-of-sight conversion method according to the present disclosure includes: acquiring an image of a subject's face as a processing target image; storing, as reference images, a plurality of images of the subject's face with the line of sight directed toward an imaging device; generating a shape model of the subject's eye shape based on the plurality of reference images; extracting feature points of the subject's eyes in the processing target image; correcting the positions of the feature points using the shape model; and correcting the processing target image by transferring the corresponding region of the reference image to the region defined by the position-corrected feature points, so that the subject's line of sight appears to be directed toward the camera.

According to this line-of-sight conversion device and method, an image in which the subject's line of sight appears to be directed at the camera is obtained even when the subject is not looking at the camera. The conversation partner therefore perceives the subject as looking at him or her, which enables natural conversation. Because a shape model is used, the eye position is stable, particularly in moving images, and a more natural moving image is obtained. Because no half mirror or similar hardware needs to be added, the system does not become larger. Moreover, because the processing operates on two-dimensional images, the computational cost can be kept relatively low.

According to the present disclosure, a more natural moving image can be obtained without enlarging the system.

FIG. 1 is a block diagram showing a configuration example of a line-of-sight conversion device according to an embodiment of the present invention.
FIG. 2 is a flowchart showing an example of the shape model generation process in a line-of-sight conversion method according to the embodiment of the present invention.
FIG. 3 is an explanatory diagram showing an example of eye feature points.
FIG. 4 is a diagram showing an example of how the eye shape changes when the coefficient corresponding to each eigenvector is varied in the eigenspace spanned by the eigenvectors.
FIG. 5 is a diagram showing an example of the relationship between the number of eigenvectors used (the number of bases) and the cumulative contribution rate.
FIG. 6 is a flowchart showing an example of the processing applied to a target image in the line-of-sight conversion method according to the embodiment of the present invention.
FIG. 7 is a diagram showing an example of extracted feature points and the eye region in a processing target image.
FIG. 8 is a diagram showing an example of extracted feature points and the eye region in a reference image.
FIG. 9 is a flowchart showing in more detail an example of the feature point position correction process of FIG. 6.
FIG. 10 is a diagram showing an example in which the positions of the feature points extracted from a processing target image are incorrect.
FIG. 11 is a diagram showing an example of feature points whose positions have been corrected.
FIG. 12 is an explanatory diagram showing an example of the eye shape and feature points in a reference image.
FIG. 13 is a diagram showing an example of feature points rearranged around the eye.
FIG. 14 is an example of a processing target image before processing.
FIG. 15 is an example of an image obtained by applying the process of FIG. 6 to the image of FIG. 14.
FIG. 16 is a graph showing an example of the transition of correlation values.
FIG. 17 is a block diagram showing a configuration example of a computer system that implements the line-of-sight conversion device according to the embodiment of the present invention.

Embodiments of the present invention are described below with reference to the drawings.

FIG. 1 is a block diagram showing a configuration example of a line-of-sight conversion device according to an embodiment of the present invention. The line-of-sight conversion device 10 of FIG. 1 includes a reference image storage device 12, a shape model generator 14, a camera 16, a feature point extractor 18, a feature point position corrector 20, an image corrector 22, and an image output unit 28. The image corrector 22 includes a texture synthesizer 24 and a shape changer 26.

The reference image storage device 12 stores, as reference images, a plurality of images of the subject's face with the line of sight directed toward the imaging device. These images are captured and stored in advance. The shape model generator 14 generates a statistical shape model (hereinafter simply called a shape model) of the subject's eye shape based on the plurality of reference images. The eye shape is represented by eye feature points.

The camera 16 acquires an image of the subject's face as a processing target image. The feature point extractor 18 extracts feature points of the subject's eyes in the processing target image. The feature point position corrector 20 corrects the positions of the feature points in the processing target image using the shape model generated by the shape model generator 14. The image corrector 22 corrects the processing target image by transferring the corresponding region of the reference image to the region defined by the feature points whose positions have been corrected by the feature point position corrector 20, so that the subject's line of sight appears to be directed toward the camera 16.

The user, who is the subject, converses with a conversation partner while looking at the screen of a television receiver or a display connected to a computer. The user's television receiver or computer is connected to the partner's television receiver or computer via a communication network such as the Internet. The user's face is captured by the camera 16, and the image is processed by the line-of-sight conversion device 10 and then transmitted to the conversation partner. In the following description, the camera 16 is assumed to be placed, for example, on top of the display. The camera 16 may instead be placed elsewhere near the display or built into the television receiver, and the same description applies to those cases.

FIG. 2 is a flowchart showing an example of the shape model generation process in the line-of-sight conversion method according to the embodiment of the present invention. The generation of the shape model is described first. In block 112, for example, the camera 16 captures a plurality of images of the user's face, and the reference image storage device 12 stores these images. At this time, the user is in approximately the same position (for example, seated) as when conversing with a partner while looking at the display. The user's face is turned toward the display, but the user's line of sight is directed at the capturing camera. Instead of the camera 16, another imaging device may capture the images of the user's face. In that case, the positional relationship between the imaging device and the user is kept the same as described above. The number of images stored in the reference image storage device 12 is, for example, 100, but may be larger or smaller.

FIG. 3 is an explanatory diagram showing an example of eye feature points. In block 114 of FIG. 2, eye feature points are extracted from a reference image stored in the reference image storage device 12, attached to that reference image, and stored in the reference image storage device 12. This process is performed for each of the reference images stored in the reference image storage device 12. The feature points are chosen so as to represent the shape and region of the eye, for example as indicated by the white circles in FIG. 3. In the example of FIG. 3, for each of the left and right eyes, a point at approximately the center of the pupil and eight points on the boundary of the eye region are obtained as feature points. The boundary points consist of the left and right end points of the eye region, three points on the upper boundary, and three points on the lower boundary. To obtain accurate positions, the feature points are determined manually here; they may instead be determined automatically if sufficient computational cost can be spent to extract them with sufficient accuracy.

In block 116, the shape model generator 14 aligns all of the reference images. Specifically, the shape model generator 14 translates, rotates, and scales each image so that the position, tilt, and size of the eyes are approximately the same in every reference image. The alignment can be performed using, for example, the positions of the feature points.
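As a rough illustration of alignment based on feature point positions, the sketch below normalizes one eye's landmark coordinates for position, tilt, and size using the two eye-corner points. It operates on the landmark coordinates rather than on the image itself, and the function name, the landmark ordering (corner indices 0 and 4), and the choice of the eye corners as the alignment reference are all assumptions made for illustration; the source does not specify how the alignment is computed.

```python
import numpy as np

def align_eye_shape(points, left_corner=0, right_corner=4):
    """Normalize one eye's landmarks for position, tilt, and size.

    points: (9, 2) array of (x, y) feature points.
    left_corner / right_corner: indices of the eye-corner points
    (hypothetical ordering; the source does not define one).
    """
    pts = np.asarray(points, dtype=float)
    # Corner-to-corner vector, used for both rotation and scale.
    dx, dy = pts[right_corner] - pts[left_corner]
    # Translation: move the midpoint between the corners to the origin.
    mid = 0.5 * (pts[left_corner] + pts[right_corner])
    pts = pts - mid
    # Rotation: turn the corner-to-corner axis onto the x axis.
    angle = np.arctan2(dy, dx)
    cos_a, sin_a = np.cos(-angle), np.sin(-angle)
    rot = np.array([[cos_a, -sin_a], [sin_a, cos_a]])
    pts = pts @ rot.T
    # Scale: normalize the eye width to 1.
    return pts / np.hypot(dx, dy)
```

After this normalization, two copies of the same eye shape that differ only by a translation, rotation, and uniform scaling map to the same coordinates, so the remaining variation across reference images is shape variation.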

In block 118, the shape model generator 14 generates a shape model of the eye shape by principal component analysis, based on the eye feature points of the user extracted from each of the plurality of reference images. An example of the principal component analysis is described below. For each reference image and for each of the left and right eyes, consider a u-dimensional vector c = {c_1, c_2, …, c_u}. In this embodiment, the vector c is an 18-dimensional vector (u = 18) whose elements are the coordinates (x and y coordinates) of the nine eye feature points arranged in order, and it represents the eye region.

The shape model generator 14 obtains the mean vector from the N vectors c using Equation 1.

The shape model generator 14 obtains the covariance matrix S from each vector c and the mean vector using Equation 2. The covariance matrix S is a u × u matrix.

The shape model generator 14 obtains the eigenvalues λ_i and the eigenvectors (principal components) v_i by solving the eigenvalue problem (Equation 3) for the covariance matrix S.
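The bodies of Equations 1 to 3 are not reproduced in this text. Under the standard formulation of principal component analysis, and using the symbols defined above, they would plausibly read as follows. The notation c^{(j)} for the vector of the j-th reference image, the symbol \bar{c} for the mean vector, and the 1/N normalization of S are assumptions; the original may use different notation or a 1/(N-1) factor.

```latex
\begin{align}
\bar{c} &= \frac{1}{N} \sum_{j=1}^{N} c^{(j)} \tag{1} \\
S &= \frac{1}{N} \sum_{j=1}^{N} \bigl(c^{(j)} - \bar{c}\bigr)\bigl(c^{(j)} - \bar{c}\bigr)^{\mathsf{T}} \tag{2} \\
S\, v_i &= \lambda_i\, v_i \tag{3}
\end{align}
```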

The shape model generator 14 selects the m eigenvectors corresponding to the m largest eigenvalues (here, m = 8). The shape model generator 14 stores the selected eigenvectors in, for example, the reference image storage device 12.
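The steps from the mean vector to the selection of the top m eigenvectors can be sketched with NumPy as below. This is a minimal sketch, not the patent's implementation: the function name `build_shape_model` is hypothetical, and dividing by N (rather than N - 1) in the covariance is an assumption.

```python
import numpy as np

def build_shape_model(shapes, m=8):
    """Build a PCA shape model from flattened landmark vectors.

    shapes: (N, u) array, one vector c per reference image (u = 18 here).
    Returns the mean vector, the m largest eigenvalues, and a (u, m)
    matrix whose columns are the corresponding eigenvectors.
    """
    c = np.asarray(shapes, dtype=float)
    mean = c.mean(axis=0)
    centered = c - mean
    # Covariance matrix S (u x u); the 1/N factor is an assumption.
    S = centered.T @ centered / c.shape[0]
    # eigh handles symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1][:m]
    return mean, eigvals[order], eigvecs[:, order]
```

`np.linalg.eigh` is used because S is symmetric, which guarantees real eigenvalues and orthonormal eigenvectors.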

In the eigenspace V = (v_1, v_2, …, v_m) spanned by the selected eigenvectors, the coefficient vector b = (b_1, b_2, …, b_m) of a vector c can be obtained as in Equation 4. The coefficient vector has the coefficient for each eigenvector as its elements and is also called a feature vector.

The shape model generator 14 obtains the coefficient vector b using Equation 4. Using the coefficient vector b, the vector c can be expressed as a linear sum of the eigenvectors v_i, as in Equation 5.
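The bodies of Equations 4 and 5 are not reproduced in this text. In the usual statistical shape model formulation they would plausibly take the form below, where V is treated as the u × m matrix whose columns are v_1, …, v_m and \bar{c} is the mean vector of Equation 1. The subtraction of the mean and the use of an approximation sign (exact only when m = u) are assumptions based on the standard formulation, not on the original equations.

```latex
\begin{align}
b &= V^{\mathsf{T}} \,(c - \bar{c}) \tag{4} \\
c &\approx \bar{c} + V b \;=\; \bar{c} + \sum_{i=1}^{m} b_i\, v_i \tag{5}
\end{align}
```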

The eigenvectors v_i are also called eigen-shapes of the eye and represent shape features of the eye. When the number m of eigenvectors used is smaller than the dimension u of the vector c, Equation 5 can be said to reduce the dimensionality of the original data. Each eigenvector v_i expresses an eye shape feature, such as the "degree of eye opening" or the "degree of upward slant," as a difference from the mean shape, and the eye shape can easily be changed by increasing or decreasing the corresponding coefficient. It is also possible to project an unknown eye shape onto this eigenspace using Equation 4 and to determine the features of that individual's eye shape (degree of eye opening, degree of upward slant, and so on) from the resulting coefficient values. In this way, the m eigenvectors v_i constitute a statistical shape model of the eye shape that can express a variety of eye shapes.
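Projecting a shape onto the eigenspace and rebuilding a shape from its coefficients can be sketched as follows. The function names are hypothetical, and the mean subtraction follows the standard shape-model formulation, which is assumed here because the text describes each eigenvector as a difference from the mean shape.

```python
import numpy as np

def project(c, mean, V):
    """Coefficient vector b of shape c (the role of Equation 4)."""
    return V.T @ (np.asarray(c, dtype=float) - mean)

def reconstruct(b, mean, V):
    """Shape vector rebuilt from coefficients b (the role of Equation 5)."""
    return mean + V @ np.asarray(b, dtype=float)

def vary_component(mean, V, i, amount):
    """Shape obtained by moving only coefficient b_i away from the mean shape."""
    b = np.zeros(V.shape[1])
    b[i] = amount
    return reconstruct(b, mean, V)
```

For example, `vary_component(mean, V, 0, 2.0 * sigma_1)` would give the shape two standard deviations along the first principal component, where `sigma_1` stands for the standard deviation of b_1 over the reference images.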

Table 1 shows an example of the obtained principal components (eigenvectors v_1 to v_8). Table 2 shows an example of the coefficient vector b corresponding to the vector c of Table 1.

FIG. 4 is a diagram showing an example of how the eye shape changes when the coefficient b_i corresponding to each eigenvector is varied in the eigenspace V spanned by the eigenvectors. The first to third principal components are the three eigenvectors obtained by the principal component analysis that carry the most information (have the largest eigenvalues). FIG. 4 shows that each eigenvector represents a different shape feature of the eye. It also shows that varying the coefficient b_i corresponding to each eigenvector changes the shape but not the position, size, or tilt. In FIG. 4, σ denotes the standard deviation of the coefficient b_i of each eigenvector, computed from the reference images used in the principal component analysis, and the mean ± 0σ column shows the mean vector.

FIG. 5 is a diagram showing an example of the relationship between the number of eigenvectors used (the number of bases) and the cumulative contribution rate. FIG. 5 shows the cumulative contribution rate R for the right eye and the cumulative contribution rate L for the left eye. As shown in FIG. 5, the larger the number of eigenvectors used in the shape model, the higher the cumulative contribution rate of the information in the principal component analysis, but the greater the amount of subsequent computation using the shape model. In this embodiment, m = 8 eigenvectors are used so that the cumulative contribution rate exceeds 95%.
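The choice of m from the cumulative contribution rate can be sketched as follows. The function name and the example eigenvalues are hypothetical; only the 95% threshold comes from the text.

```python
import numpy as np

def choose_num_components(eigvals, threshold=0.95):
    """Smallest m whose cumulative contribution rate exceeds the threshold.

    eigvals: eigenvalues of the covariance matrix (any order).
    threshold: required cumulative contribution rate, assumed below 1.
    """
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    cumulative = np.cumsum(lam) / lam.sum()
    # argmax returns the first index where the condition holds.
    m = int(np.argmax(cumulative > threshold)) + 1
    return m, cumulative

# Made-up eigenvalues: the rates are roughly 0.60, 0.90, 0.96, 0.99, 1.00.
m, rates = choose_num_components([6.0, 3.0, 0.6, 0.3, 0.1])
```

With these made-up eigenvalues, the third component is the first to push the cumulative rate above 95%, so `m` is 3.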

FIG. 6 is a flowchart showing an example of the processing applied to a target image in the line-of-sight conversion method according to the embodiment of the present invention. The process of FIG. 6 is performed after the process of FIG. 2 has finished. Once the process of FIG. 2 has been performed, it does not need to be performed again.

In block 140A of FIG. 6, a processing target image (the first frame) is input. Specifically, the camera 16 captures an image of the user's face and outputs it to the feature point extractor 18.

In block 142, the feature point extractor 18 receives the processing target image from the camera 16 and reads one suitable reference image from the reference image storage device 12. Here, the feature point extractor 18 reads a reference image containing a face whose orientation is close to that of the face in the processing target image, for example the reference image whose face orientation is closest. The feature point extractor 18 extracts eye feature points from the processing target image and the reference image, for example as in FIG. 3, and outputs these images together with the coordinates of the feature points. Such face orientation detection and feature point extraction can be performed by methods well known to those skilled in the art. FIG. 7 is a diagram showing an example of extracted feature points and the eye region in a processing target image. FIG. 8 is a diagram showing an example of extracted feature points and the eye region in a reference image. The eye region is defined as the region enclosed by the extracted feature points.

In block 144A, the feature point position corrector 20 corrects the feature point positions in the processing target image so that the user's eye has a shape that can be expressed by the shape model. Here, the extracted feature points are used as initial coordinates, and shape transformation by the statistical shape model of the eye is combined with translation, scaling, or rotation to search the neighborhood of the initial coordinates for more appropriate (for example, optimal) eye feature point coordinates. This search process for correcting the feature point positions is described below.

FIG. 9 is a flowchart showing in more detail an example of the feature point position correction process (block 144A) of FIG. 6. In block 160, the feature point position corrector 20 warps the eye region of the processing target image by a piecewise-affine transformation so that it has the same number of pixels as the eye region of the reference image. This is necessary because the subsequent processing computes correlation values between the eye regions of the two images. Alternatively, the warping may be performed so that the eye region of the reference image has the same number of pixels as the eye region of the processing target image.

In block 161, the feature point position corrector 20 translates the processing target image in the x-axis direction while computing the correlation value r between the processing target image and the reference image, using Equation 6 as the evaluation function.

Here, for example, the values t^1 and t^2 denote the luminance values of the processing target image and the reference image, respectively, and M denotes the total number of pixels in the eye region. The values t^1 and t^2 may instead be quantities other than luminance, such as hue or saturation. For the translation, the feature point position corrector 20 moves the processing target image, for example, three steps in the negative direction and three steps in the positive direction, in steps of 0.02 × the eye width, and computes the correlation value r at each position. The feature point position corrector 20 finds the position of the processing target image at which the correlation value r is largest and places the processing target image at that position.
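The x-direction search can be sketched as below. The body of Equation 6 is not reproduced in this text, so the sketch assumes a zero-mean normalized cross-correlation over the M pixels of the eye region; the actual evaluation function may differ. For simplicity the eye region is treated as a rectangular crop and shifts are rounded to whole pixels. The function names and the `box` parameter are hypothetical.

```python
import numpy as np

def correlation(t1, t2):
    """Zero-mean normalized cross-correlation (an assumed form of Equation 6)."""
    a = np.asarray(t1, dtype=float).ravel()
    b = np.asarray(t2, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def best_x_shift(target, reference, box, step, n_steps=3):
    """Try shifts of -n_steps..+n_steps steps along x and keep the best one.

    target: 2-D luminance image; reference: 2-D eye-region crop.
    box: (top, left, height, width) of the eye region in target;
         height and width must match the shape of reference.
    step: shift per step in pixels, e.g. 0.02 * eye width.
    Returns (best correlation, best shift in pixels).
    """
    top, left, h, w = box
    best_r, best_dx = -np.inf, 0
    for k in range(-n_steps, n_steps + 1):
        dx = int(round(k * step))
        x0 = left + dx
        if x0 < 0 or x0 + w > target.shape[1]:
            continue  # skip shifts that leave the image
        r = correlation(target[top:top + h, x0:x0 + w], reference)
        if r > best_r:
            best_r, best_dx = r, dx
    return best_r, best_dx
```

The same loop structure would apply to the y-direction, scaling, and rotation searches, with the crop replaced by the correspondingly transformed region.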

In block 162, the feature point position corrector 20 translates the processing target image in the y-axis direction while computing the correlation value r between the processing target image and the reference image, using Equation 6 as the evaluation function. For the translation, the feature point position corrector 20 moves the processing target image, for example, three steps in the negative direction and three steps in the positive direction, in steps of 0.02 × the eye height, and computes the correlation value r at each position. The feature point position corrector 20 finds the position of the processing target image at which the correlation value r is largest and places the processing target image at that position.

In block 164, the feature point position corrector 20 enlarges or reduces the processing target image while computing the correlation value r between the processing target image and the reference image, using Equation 6 as the evaluation function. For the scaling, the feature point position corrector 20 enlarges or reduces the processing target image, for example, three steps in the negative direction and three steps in the positive direction, in steps of 2% magnification, and computes the correlation value r at each magnification. The feature point position corrector 20 finds the magnification at which the correlation value r is largest and scales the processing target image to that magnification.

In block 166, the feature point position corrector 20 rotates the processing target image while computing the correlation value r between the processing target image and the reference image, using Equation 6 as the evaluation function. For the rotation, the feature point position corrector 20 rotates the processing target image, for example, three steps in the negative direction and three steps in the positive direction, in steps of 1°, and computes the correlation value r at each angle. The feature point position corrector 20 finds the tilt at which the correlation value r is largest and rotates the processing target image to that tilt.

In block 168, the feature point position corrector 20 determines the coefficients for the shape model (that is, the coefficient vector b). The feature point position corrector 20 varies each coefficient of the shape model (that is, varies the coefficient vector b) while computing the correlation value r between the processing target image and the reference image, using Equation 6 as the evaluation function. More specifically, the feature point position corrector 20 first varies the coefficient b_1 corresponding to the first principal component, for example, six steps in the negative direction and six steps in the positive direction, in steps of 0.5 × the standard deviation σ, and computes the correlation value r at each value. The feature point position corrector 20 finds the coefficient value at which the correlation value r is largest and sets b_1 to that value. The standard deviation σ of the coefficient b_1 is computed from the plurality of reference images when the shape model is generated.

その後、特徴点位置補正器２０は、同様の処理を第２、第３、…、第８主成分に対して、この順に行い、係数ｂ_２，ｂ_３，…，ｂ_８を決定する。標準偏差σとしては、それぞれの主成分に対する係数の標準偏差を用いる。すると、前述の式５により、目の形状に対応するベクトルｃを求めることができる。特徴点位置補正器２０は、求められた係数ｂ_１，ｂ_２，…，ｂ_８を出力する。 The feature point position corrector 20 then performs the same processing for the second, third, ..., and eighth principal components, in this order, to determine the coefficients b₂, b₃, ..., b₈. For each principal component, the standard deviation of the corresponding coefficient is used as σ. The vector c corresponding to the eye shape can then be obtained from Expression 5 described above. The feature point position corrector 20 outputs the obtained coefficients b₁, b₂, ..., b₈.
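The component-by-component search of block 168 can be sketched as follows. This is only a sketch under stated assumptions: the names `search_coefficients` and `shape_from_coefficients` are hypothetical, the candidates are assumed to be centered on zero (so that they span ±3σ), the score callable stands in for the correlation value r of Expression 6, and Expression 5 is assumed to have the usual linear form c = mean + Σ bᵢ pᵢ of a principal component shape model.

```python
def search_coefficients(score, sigmas, step=0.5, n_steps=6):
    """Coordinate-wise search: fix b_1 first, then b_2, and so on, in order.

    `score(b)` is a hypothetical callable returning the correlation value r
    for the coefficient vector b; `sigmas[i]` is the standard deviation of b_i.
    """
    b = [0.0] * len(sigmas)
    for i, sigma in enumerate(sigmas):
        best_value, best_score = b[i], None
        for k in range(-n_steps, n_steps + 1):
            trial = list(b)
            trial[i] = k * step * sigma  # candidate value for this coefficient
            s = score(trial)
            if best_score is None or s > best_score:
                best_value, best_score = trial[i], s
        b[i] = best_value  # keep the best value before moving to the next component
    return b


def shape_from_coefficients(mean_shape, components, b):
    """Assumed linear shape model: c = mean_shape + sum_i b[i] * components[i]."""
    c = list(mean_shape)
    for b_i, p_i in zip(b, components):
        for j, p_ij in enumerate(p_i):
            c[j] += b_i * p_ij
    return c


# Toy score (an assumption): highest when b equals a known target vector.
target = [3.0, -1.5]
toy_r = lambda b: -sum((x - t) ** 2 for x, t in zip(b, target))
b = search_coefficients(toy_r, sigmas=[2.0, 1.0])
print(b)  # [3.0, -1.5]
print(shape_from_coefficients([0.0, 1.0], [[1.0, 0.0], [0.0, 2.0]], b))  # [3.0, -2.0]
```

Searching one coefficient at a time costs 13 evaluations per component (104 for eight components) instead of the 13⁸ evaluations a full grid would need, at the price of possibly missing the joint optimum.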

以上のような探索の範囲の例を、表３に示す。 Table 3 shows an example of the search range as described above.

図１０は、処理対象画像における抽出された特徴点の位置が正しくない場合の例を示す図である。図１１は、位置が補正された特徴点の例を示す図である。ブロック１４２において特徴点抽出器１８によって抽出された特徴点の位置が、図１０に示されているように、正しくないことがある。このような場合に、図９の処理によって特徴点位置の補正を行うと、例えば図１１のように、特徴点位置を正しい位置に補正することができる。 FIG. 10 is a diagram illustrating an example where the position of the extracted feature point in the processing target image is not correct. FIG. 11 is a diagram illustrating an example of feature points whose positions are corrected. The location of the feature points extracted by the feature point extractor 18 at block 142 may be incorrect as shown in FIG. In such a case, if the feature point position is corrected by the processing of FIG. 9, the feature point position can be corrected to a correct position as shown in FIG. 11, for example.

なお、図９のブロック１６１，１６２，１６４及び１６６の処理の順序を入れ換えてもよい。 Note that the processing order of the blocks 161, 162, 164, and 166 in FIG. 9 may be interchanged.

次に、図６のブロック１４６Ａにおいて、テクスチャ合成器２４は、補正された特徴点で規定される領域に、参照画像の対応する領域を転写して、ユーザの視線がカメラ１６の方向を向いているように見えるように、処理対象画像を補正する。テクスチャ合成器２４は、具体的には次の処理を行う。すなわち、テクスチャ合成器２４は、特徴点を使用して、処理対象画像及び参照画像の目領域を、図３のように三角形領域に分割する。テクスチャ合成器２４は、三角形領域毎に、参照画像の目の領域のテクスチャを、ピースワイズアフィン（Piecewise-Affine）変換を用いて処理対象画像の対応する領域に転写する。この際、各三角形領域においてアフィン変換が行われる。 Next, in block 146A of FIG. 6, the texture synthesizer 24 corrects the processing target image by transferring the corresponding region of the reference image to the region defined by the corrected feature points, so that the user's line of sight appears to be directed toward the camera 16. Specifically, the texture synthesizer 24 performs the following processing. Using the feature points, the texture synthesizer 24 divides the eye regions of the processing target image and the reference image into triangular regions as shown in FIG. 3. For each triangular region, the texture synthesizer 24 transfers the texture of the eye region of the reference image to the corresponding region of the processing target image using a piecewise affine transformation. An affine transformation is applied within each triangular region.
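The per-triangle affine mapping used in a piecewise affine transfer can be illustrated with barycentric coordinates. The sketch below is not the implementation of the texture synthesizer 24. The function names and the example triangles are hypothetical, and only the coordinate mapping is shown (pixel sampling and interpolation are omitted). For a pixel inside a triangle of the processing target image, it returns the position in the matching triangle of the reference image from which the texture would be sampled.

```python
def barycentric(p, tri):
    """Barycentric coordinates (w1, w2, w3) of point p with respect to triangle tri."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    w1 = ((y2 - y3) * (p[0] - x3) + (x3 - x2) * (p[1] - y3)) / det
    w2 = ((y3 - y1) * (p[0] - x3) + (x1 - x3) * (p[1] - y3)) / det
    return w1, w2, 1.0 - w1 - w2


def map_point(p, dst_tri, src_tri):
    """Map a position in dst_tri to the corresponding position in src_tri.

    Reusing the barycentric weights on the other triangle's vertices is
    equivalent to applying the affine transformation defined by the vertex pairs.
    """
    w = barycentric(p, dst_tri)
    x = sum(wi * v[0] for wi, v in zip(w, src_tri))
    y = sum(wi * v[1] for wi, v in zip(w, src_tri))
    return x, y


# Hypothetical triangles: one in the processing target image, one in the reference image.
target_tri = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
reference_tri = [(10.0, 10.0), (18.0, 10.0), (10.0, 14.0)]
print(map_point((1.0, 1.0), target_tri, reference_tri))  # (12.0, 11.0)
```

A point is inside the triangle exactly when all three weights are non-negative, so the same weights can also be used to decide which triangle a pixel belongs to.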

次に、テクスチャ合成器２４による処理後の画像に対して、形状変化器２６は、形状補正を行う。処理対象画像の目の形状は、ユーザの視線がカメラの方向を向いている参照画像の目の形状とは異なるので、前述のテクスチャ転写の結果は形状とテクスチャのバランスが悪く、不自然な画像になりがちである。そこで、更に、形状変化器２６は、処理対象画像における、特徴点で規定される目の形状を、参照画像における目の形状にワーピングにより補正して、自然な転写結果を実現する。 Next, the shape changer 26 performs shape correction on the image processed by the texture synthesizer 24. Because the eye shape in the processing target image differs from the eye shape in the reference image, in which the user's line of sight is directed toward the camera, the texture transfer described above tends to produce an unnatural image in which shape and texture are poorly balanced. The shape changer 26 therefore further corrects, by warping, the eye shape defined by the feature points in the processing target image to the eye shape in the reference image, thereby achieving a natural transfer result.

図１２は、参照画像における目の形状及び特徴点の例を示す説明図である。例えば、処理対象画像の目の形状が図３のような形状である場合に、形状変化器２６は、目の形状を図１２のような形状に補正する。この補正については、カメラ、ディスプレイ、及びユーザの相対位置が決まれば、必要となる補正（処理対象画像と参照画像との間での目の形状の関係）がほぼ確定する。ディスプレイより上にカメラを設置した場合には、この補正は主に目の開きを大きくする処理に相当する。ワーピング手法としては、例えばＦＦＤ（Free-Form Deformation）を使用する。 FIG. 12 is an explanatory diagram illustrating examples of eye shapes and feature points in the reference image. For example, when the shape of the eye of the processing target image is as shown in FIG. 3, the shape changer 26 corrects the shape of the eye as shown in FIG. For this correction, if the relative positions of the camera, the display, and the user are determined, the necessary correction (the relationship of the eye shape between the processing target image and the reference image) is almost determined. When the camera is installed above the display, this correction mainly corresponds to a process of increasing the opening of the eyes. As the warping technique, for example, FFD (Free-Form Deformation) is used.

ここで、テクスチャの転写を行う際の処理を更に説明する。図１３は、目の周囲に再配置された特徴点の例を示す図である。転写されたテクスチャとその周囲のテクスチャとの境界が不自然であることがある。特徴点の自動抽出では多少の位置の誤差が発生することが多いこと、また、そもそも異なる画像のテクスチャを転写するので、同一環境で同一人物を撮影したとしても、処理対象画像と参照画像とでは対応する部分の輝度値にある程度の差異が存在することが原因である。そこで、まず特徴点抽出誤差の影響を小さくするために、テクスチャ合成器２４は、ワーピングを行った後の目の輪郭上の特徴点座標の、瞳の中心からの距離を一定の倍率で大きくして、図１３のように目の周囲に特徴点を再配置し、目の領域を拡大する。再配置された特徴点で囲まれた領域は、拡大後の領域を示す。 Here, the processing performed when transferring the texture is described further. FIG. 13 is a diagram illustrating an example of feature points relocated around the eye. The boundary between the transferred texture and the surrounding texture may look unnatural. There are two causes: automatic extraction of feature points often introduces some positional error, and, because texture is transferred from a different image in the first place, the luminance values of corresponding parts of the processing target image and the reference image differ to some extent even when the same person is photographed in the same environment. To reduce the influence of the feature point extraction error, the texture synthesizer 24 first increases, by a constant factor, the distance from the pupil center to each feature point on the eye contour after warping, thereby relocating the feature points around the eye as shown in FIG. 13 and enlarging the eye region. The region enclosed by the relocated feature points is the enlarged region.
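The relocation of the contour feature points amounts to scaling each point away from the pupil center. A minimal sketch follows. The function name `expand_contour` and the factor 1.5 are assumptions, since the text only states that a constant factor is used.

```python
def expand_contour(points, pupil_center, scale=1.5):
    """Move each contour point away from the pupil center by a constant factor.

    `scale` is a hypothetical value; the source does not give the factor.
    """
    cx, cy = pupil_center
    return [(cx + scale * (x - cx), cy + scale * (y - cy)) for x, y in points]


contour = [(2.0, 0.0), (0.0, -1.0), (-2.0, 0.0), (0.0, 1.0)]
print(expand_contour(contour, (0.0, 0.0)))
# [(3.0, 0.0), (0.0, -1.5), (-3.0, 0.0), (0.0, 1.5)]
```

Because every point moves along the ray from the pupil center, the enlarged outline keeps the shape of the warped eye contour while adding a margin around it.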

そこで、テクスチャの転写を行う際に、テクスチャ合成器２４は、拡大後の領域の境界からの距離に応じたグラデーションを施す。具体的には、テクスチャ合成器２４は、拡大後の領域において、目の中央に近づくに従って処理対象画像に含まれる画像から参照画像に含まれる画像に徐々に変化するように、参照画像に含まれる画像を、処理対象画像に含まれる画像に重ねる。すなわち、境界近くでは処理対象画像の重みを大きくし，目の中央に近づくほど参照画像の重みを徐々に大きくする。目の内部では参照画像のテクスチャを保持したいので、グラデーションは、ほぼ目の外側、すなわち、特徴点の再配置によって拡大された領域においてほぼ完結させる。 Accordingly, when transferring the texture, the texture synthesizer 24 applies a gradation according to the distance from the boundary of the enlarged region. Specifically, in the enlarged region, the texture synthesizer 24 superimposes the image included in the reference image on the image included in the processing target image so that the result changes gradually from the processing target image to the reference image toward the center of the eye. That is, the weight of the processing target image is made large near the boundary, and the weight of the reference image is gradually increased toward the center of the eye. Because the texture of the reference image should be preserved inside the eye, the gradation is largely completed outside the eye, that is, within the region added by relocating the feature points.
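The gradation can be sketched as a per-pixel weighted average whose weight depends on the distance from the boundary of the enlarged region. The linear ramp and the parameter `ramp_width` are assumptions made for illustration, since the text does not specify the profile of the gradation.

```python
def reference_weight(dist_to_boundary, ramp_width):
    """Weight of the reference image: 0 at the boundary, 1 once past the ramp.

    A linear ramp is assumed; `ramp_width` would be chosen so that the ramp
    ends roughly at the original eye contour.
    """
    return min(1.0, max(0.0, dist_to_boundary / ramp_width))


def blend(target_px, reference_px, dist_to_boundary, ramp_width):
    """Blend one luminance value of the target image with the reference image."""
    w = reference_weight(dist_to_boundary, ramp_width)
    return (1.0 - w) * target_px + w * reference_px


print(blend(100.0, 200.0, 0.0, 4.0))   # 100.0  at the boundary: target image only
print(blend(100.0, 200.0, 2.0, 4.0))   # 150.0  halfway through the ramp
print(blend(100.0, 200.0, 10.0, 4.0))  # 200.0  inside the eye: reference image only
```

Setting the ramp width close to the margin added by the relocation keeps the interior of the eye at full reference weight, which matches the stated aim of preserving the reference texture inside the eye.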

次に、ブロック１４８Ａにおいて、画像出力器２８は、ブロック１４６Ａで得られた画像を、例えばユーザの対話相手のコンピュータに送信する。送信された画像は、対話相手のディスプレイに表示される。 Next, in block 148A, the image output unit 28 transmits the image obtained in block 146A to, for example, the computer with which the user interacts. The transmitted image is displayed on the display of the conversation partner.

図１４は、処理前の処理対象画像の例である。図１５は、図１４の画像に図６の処理を行って得られた画像の例である。図１４では、視線が、ディスプレイに向けられており、カメラには向けられていないが、図１５では、視線がカメラに向けられているように見える。したがって、対話相手には、自分に視線が向けられているように見え、自然な対話が可能になる。 FIG. 14 is an example of a processing target image before processing. FIG. 15 is an example of an image obtained by performing the processing of FIG. 6 on the image of FIG. 14. In FIG. 14, the line of sight is directed toward the display and not toward the camera, whereas in FIG. 15 the line of sight appears to be directed toward the camera. Consequently, the conversation partner perceives the user's line of sight as directed at him or her, which enables natural dialogue.

その後、ブロック１４０Ｂにおいて、新たな処理対象画像（第２フレーム）がカメラ１６から特徴点抽出器１８に入力され、第１フレームに対する処理と同様の処理が行われる。ただし、特徴点の抽出は行われず、代わりに、ブロック１４４Ａで求められた補正後の特徴点が用いられる。ブロック１４４Ｂ，１４６Ｂ、１４８Ｂの処理は、前述のブロック１４４Ａ，１４６Ａ、１４８Ａの処理とそれぞれ同じである。特徴点抽出器１８は、適切な参照画像を新たに選択して用いてもよいし、第１フレームと同じ参照画像を用いてもよい。以後のフレームについても、同様の処理が行われる。ユーザが瞬きをしたとき等、あるフレームにおいて式６の相関値が前フレームと比べて著しく低下した場合には、目の領域の追跡に失敗したものと判断して，当該フレームを第１フレームとして扱い、特徴点抽出処理（ブロック１４２）を含む一連の処理を再度行う。 Thereafter, in block 140B, a new processing target image (second frame) is input from the camera 16 to the feature point extractor 18, and processing similar to that for the first frame is performed. However, feature points are not extracted; the corrected feature points obtained in block 144A are used instead. The processing of blocks 144B, 146B, and 148B is the same as the processing of blocks 144A, 146A, and 148A described above, respectively. The feature point extractor 18 may newly select and use an appropriate reference image, or may use the same reference image as for the first frame. The same processing is performed for subsequent frames. When the correlation value of Expression 6 in a certain frame drops markedly compared with the previous frame, for example when the user blinks, it is determined that tracking of the eye region has failed; that frame is treated as a first frame, and the series of processing including the feature point extraction processing (block 142) is performed again.
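The re-initialization rule can be sketched as a comparison of the correlation values of consecutive frames. The function name `tracking_lost` and the threshold `max_drop=0.2` are assumptions. The text says only that the correlation value drops markedly and gives no numeric criterion.

```python
def tracking_lost(prev_r, cur_r, max_drop=0.2):
    """Return True when r fell by more than max_drop relative to the previous frame.

    `max_drop` is a hypothetical threshold chosen for illustration.
    """
    return prev_r - cur_r > max_drop


# Hypothetical per-frame correlation values; the third frame simulates a blink.
r_values = [0.90, 0.88, 0.50, 0.87]
for prev_r, cur_r in zip(r_values, r_values[1:]):
    action = "re-extract feature points" if tracking_lost(prev_r, cur_r) else "reuse corrected points"
    print(f"{cur_r:.2f}: {action}")
```

In an actual pipeline the frame flagged here would be processed again as a first frame, so its correlation value would be recomputed after the feature points are re-extracted.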

図１６は、相関値の推移の例を示すグラフである。図１６では、図６の処理によって得られた画像と参照画像との間の相関値が、フレーム毎に示されている。相関値は、評価関数としての式６を用いて求められる。ブロック１４４Ａ等の特徴点位置の補正を行った場合の相関値（図１６のＡ）は、特徴点位置の補正を行わない場合の相関値（図１６のＢ）より大きく、かつ、値が安定していることがわかる。つまり、Ａの場合には、特徴点の位置補正がほぼ正しく行われていることがわかる。その結果、一連のフレームにおいて処理後の画像の目の位置が安定し、違和感の少ない動画像が得られる。 FIG. 16 is a graph illustrating an example of the transition of the correlation value. FIG. 16 shows, for each frame, the correlation value between the image obtained by the processing of FIG. 6 and the reference image. The correlation value is obtained using Expression 6 as the evaluation function. It can be seen that the correlation value obtained when the feature point positions are corrected as in block 144A and the like (A in FIG. 16) is larger and more stable than the correlation value obtained when the feature point positions are not corrected (B in FIG. 16). That is, in case A, the feature point positions are corrected almost correctly. As a result, the position of the eyes in the processed images is stable over a series of frames, and a moving image with little unnaturalness is obtained.

図１７は、本発明の実施形態に係る視線変換装置を実現するコンピュータシステムの構成例を示すブロック図である。図１７のコンピュータシステム８０は、プロセッサ８２と、送受信機８４と、バス８８と、メモリ９２と、ファイル格納装置９４と、入力デバイス９６と、ディスプレイ９８とを有する。コンピュータシステム８０は、例えば、ユーザが通信ネットワークを介した対話に使用するテレビジョン受信機若しくはコンピュータを構成していてもよく、又はユーザが通信ネットワークを介した対話に使用するテレビジョン受信機若しくはコンピュータに内蔵されていてもよい。 FIG. 17 is a block diagram illustrating a configuration example of a computer system that implements the line-of-sight conversion device according to the embodiment of the present invention. The computer system 80 in FIG. 17 includes a processor 82, a transceiver 84, a bus 88, a memory 92, a file storage device 94, an input device 96, and a display 98. The computer system 80 may, for example, constitute a television receiver or computer that the user uses for dialogue over a communication network, or may be built into such a television receiver or computer.

プロセッサ８２は、バス８８を経由して他の構成要素と通信する。送受信機８４は、インターネット等の通信ネットワークとの間でデータを送受信する。送受信機８４は、無線によって通信ネットワークに接続されていてもよい。 The processor 82 communicates with the other components via the bus 88. The transceiver 84 transmits data to and receives data from a communication network such as the Internet. The transceiver 84 may be connected to the communication network wirelessly.

メモリ９２は例えばＲＡＭ（random access memory）及びＲＯＭ（read only memory）を含んでおり、データ及び命令を格納する。ファイル格納装置９４は、１以上の揮発性又は不揮発性の、非過渡的な、コンピュータ読み取り可能な格納媒体である。本発明の実施形態がソフトウェアで実現される場合には、例えば、マイクロコード、アセンブリ言語のコード、又はより高レベルの言語のコードが用いられ得る。これらのコードで記述され、本発明の実施形態の機能を実現する命令を含むプログラムを、ファイル格納装置９４は格納する。ファイル格納装置９４は、ＲＡＭ、ＲＯＭ、ＥＥＰＲＯＭ（electrically erasable programmable read only memory）、及びフラッシュメモリ等の半導体メモリ、ハードディスクドライブ等の磁気記録媒体、光記録媒体、これらの組み合わせ等を含み得る。 The memory 92 includes, for example, RAM (random access memory) and ROM (read only memory), and stores data and instructions. The file storage device 94 is one or more volatile or non-volatile, non-transitory, computer-readable storage media. When an embodiment of the present invention is implemented in software, microcode, assembly language code, or code in a higher-level language may be used, for example. The file storage device 94 stores a program that is written in such code and includes instructions for implementing the functions of the embodiment of the present invention. The file storage device 94 may include semiconductor memory such as RAM, ROM, EEPROM (electrically erasable programmable read only memory), and flash memory, a magnetic recording medium such as a hard disk drive, an optical recording medium, a combination of these, and the like.

入力デバイス９６は、タッチスクリーン、キーボード、リモートコントローラ、及びマウス等を含み得る。ディスプレイは、液晶ディスプレイ、有機ＥＬ（electroluminescence）ディスプレイ等のフラットパネルディスプレイを含み得る。 The input device 96 may include a touch screen, a keyboard, a remote controller, a mouse, and the like. The display may include a flat panel display such as a liquid crystal display or an organic EL (electroluminescence) display.

コンピュータシステム８０は、図１の視線変換装置１０として動作し得る。プロセッサ８２は、形状モデル生成器１４、特徴点抽出器１８、特徴点位置補正器２０、画像補正器２２、及び画像出力器２８として動作し得る。ファイル格納装置９４は、参照画像記憶装置１２として動作し得る。 The computer system 80 can operate as the line-of-sight conversion device 10 of FIG. The processor 82 may operate as the shape model generator 14, the feature point extractor 18, the feature point position corrector 20, the image corrector 22, and the image output unit 28. The file storage device 94 can operate as the reference image storage device 12.

本明細書における各機能ブロックは、例えば、回路等のハードウェアで実現され得る。代替としては各機能ブロックの一部又は全ては、ソフトウェアで実現され得る。例えばそのような機能ブロックは、プロセッサ８２及びプロセッサ８２上で実行されるプログラムによって実現され得る。換言すれば、本明細書で説明される各機能ブロックは、ハードウェアで実現されてもよいし、ソフトウェアで実現されてもよいし、ハードウェアとソフトウェアとの任意の組合せで実現され得る。 Each functional block in the present specification can be realized by hardware such as a circuit, for example. Alternatively, some or all of each functional block can be implemented in software. For example, such a functional block can be realized by the processor 82 and a program executed on the processor 82. In other words, each functional block described in the present specification may be realized by hardware, may be realized by software, or may be realized by any combination of hardware and software.

以上の実施形態は、本質的に好ましい例示であって、本発明、その適用物、あるいはその用途の範囲を制限することを意図するものではない。 The above embodiments are essentially preferable examples, and are not intended to limit the scope of the present invention, its application, or its use.

以上説明したように、本発明は、視線変換装置及び視線変換方法等について有用である。 As described above, the present invention is useful for a line-of-sight conversion device, a line-of-sight conversion method, and the like.

DESCRIPTION OF SYMBOLS
１０ 視線変換装置 10 Line-of-sight conversion device
１２ 参照画像記憶装置 12 Reference image storage device
１４ 形状モデル生成器 14 Shape model generator
１６ カメラ 16 Camera
１８ 特徴点抽出器 18 Feature point extractor
２０ 特徴点位置補正器 20 Feature point position corrector
２２ 画像補正器 22 Image corrector

Claims

A line-of-sight conversion device comprising:
a camera that acquires an image of the face of a subject as a processing target image;
a reference image storage device that stores, as reference images, a plurality of images of the face of the subject whose line of sight is directed toward an imaging device;
a shape model generator that generates a shape model of the eye shape of the subject based on the plurality of reference images;
a feature point extractor that extracts feature points of the eyes of the subject in the processing target image;
a feature point position corrector that corrects the positions of the feature points using the shape model; and
an image corrector that corrects the processing target image by transferring a corresponding region of the reference image to a region defined by the feature points whose positions have been corrected by the feature point position corrector, so that the line of sight of the subject appears to be directed toward the camera.

The line-of-sight conversion device according to claim 1,
wherein the shape model generator generates the shape model by principal component analysis based on feature points of the eyes of the subject extracted from each of the plurality of reference images.

The line-of-sight conversion device according to claim 1,
wherein the image corrector deforms, by warping, the eye shape defined by the feature points in the processing target image into the eye shape in the reference image.

The line-of-sight conversion device according to claim 3,
wherein the image corrector enlarges the eye region defined by the feature points after the warping and, in the enlarged region, superimposes an image included in the reference image on an image included in the processing target image so that the result changes gradually from the image included in the processing target image to the image included in the reference image toward the center of the eye.

A line-of-sight conversion method comprising:
acquiring an image of the face of a subject as a processing target image;
storing, as reference images, a plurality of images of the face of the subject whose line of sight is directed toward an imaging device;
generating a shape model of the eye shape of the subject based on the plurality of reference images;
extracting feature points of the eyes of the subject in the processing target image;
correcting the positions of the feature points using the shape model; and
correcting the processing target image by transferring a corresponding region of the reference image to a region defined by the feature points whose positions have been corrected, so that the line of sight of the subject appears to be directed toward the camera.

The line-of-sight conversion method according to claim 5,
wherein generating the shape model includes generating the shape model by principal component analysis based on feature points of the eyes of the subject extracted from each of the plurality of reference images.

The line-of-sight conversion method according to claim 5,
wherein correcting the processing target image includes deforming, by warping, the eye shape defined by the feature points in the processing target image into the eye shape in the reference image.

The line-of-sight conversion method according to claim 7,
wherein correcting the processing target image includes enlarging the eye region defined by the feature points after the warping and, in the enlarged region, superimposing an image included in the reference image on an image included in the processing target image so that the result changes gradually from the image included in the processing target image to the image included in the reference image toward the center of the eye.