JP7259648B2

JP7259648B2 - Face orientation estimation device and method

Info

Publication number: JP7259648B2
Application number: JP2019158651A
Authority: JP
Inventors: 知禎相澤
Original assignee: Omron Corp
Current assignee: Omron Corp
Priority date: 2019-08-30
Filing date: 2019-08-30
Publication date: 2023-04-18
Anticipated expiration: 2039-08-30
Also published as: WO2021039403A1; JP2021039420A

Description

The present disclosure relates to a face orientation estimation device and method that stably and accurately estimate the positions of facial organ points and the face orientation angle for a person's face in image data.

Various techniques are known for detecting a human face in image data and then estimating the positions of facial organ points and the face orientation angle of the detected face. For example, Patent Document 1 discloses a three-dimensional face model fitting algorithm, and a detection device using that algorithm, which estimate the positions of facial organ points and the face orientation angle in an image by fitting to the face in the image a three-dimensional face model that defines a plurality of three-dimensional positions corresponding to a plurality of feature points, including facial organ points, on a human face.

However, with the conventional three-dimensional face model fitting algorithm, the estimated positions of facial organ points may not be stable over time. This can affect the accuracy of eye open/close detection, which is one of the processes that rely on the estimated positions of facial organ points.

Furthermore, with the conventional technology, the estimated position can settle at a wrong position, for example when an action such as scratching the face with a hand draws the estimated organ point positions of the whole face toward the side of the face, such as the ears, or when the frame of eyeglasses or the like pulls the estimated organ point positions of the eyes. The three-dimensional face model then stabilizes without fitting the face image properly, and the face orientation estimation result remains offset.

The eye openness identification device disclosed in Patent Document 2 performs a tracking search for each facial organ point, using a method such as template matching, within a predetermined range centered on the position of that facial organ point in the previous frame. The facial part search device disclosed in Patent Document 3 treats the position of each facial organ point in the previous frame as a rough position and searches for the detailed position by tracking each facial organ point from there. However, even with these devices, the estimated position may settle at a wrong position when an action such as scratching the face with a hand pulls the estimated organ point positions of the whole face toward the side of the face, such as the ears, or when the frame of eyeglasses or the like draws in the estimated organ point positions of the eyes.

Patent Document 1: Japanese Patent Application Laid-Open No. 2007-249280
Patent Document 2: Japanese Patent Application Laid-Open No. 2010-198313
Patent Document 3: Japanese Patent Application Laid-Open No. 2003-281539

The present disclosure provides an algorithm that accurately estimates the positions of facial organ points and the face orientation angle for a person's face in image data, stably over time and without settling at an erroneous estimated position, as well as a face orientation estimation device and method that use the algorithm.

The face orientation estimation device of the present disclosure includes:
a face detection unit that detects human face image data from image data; and
a face orientation estimation unit that estimates the positions of facial organ points and the face orientation angle for the detected human face image data.
When face images detected by the face detection unit continue over a plurality of frames, the face orientation estimation unit executes, for each frame:
(1) an estimated reference value calculation process that obtains an estimated reference value from a global search fitting process; and
(2) a tracking process that performs a local search fitting process based on the three-dimensional face model that is the fitting result of the previous frame.
The (2) tracking process includes a correction process.
The correction process calculates the difference between the face orientation angle of the three-dimensional face model that is the fitting result of the previous frame and the face orientation angle of the three-dimensional face model of the estimated reference value calculated by the (1) estimated reference value calculation process. If the difference is equal to or greater than a predetermined threshold, the three-dimensional face model of the latest estimated reference value, instead of the three-dimensional face model that is the fitting result of the previous frame, is used as the base model for the local search fitting process.
The face orientation estimation unit outputs the positions of facial organ points and the face orientation angle for each frame from the result of the local search fitting process.

The face orientation estimation device and method according to the present disclosure can accurately estimate the positions of facial organ points and the face orientation angle for a person's face in image data, stably over time and without settling at an erroneous estimated position.

FIG. 1 is a block diagram showing the functional configuration of the face orientation estimation device according to Embodiment 1.
FIG. 2 is a diagram for explaining an application example of the face orientation estimation device according to the present disclosure.
FIG. 3 is a flowchart showing the overall operation of the face orientation estimation device according to Embodiment 1.
FIG. 4 is a flowchart showing the process by which the face orientation estimation unit in the face orientation estimation device according to Embodiment 1 estimates the positions of facial organ points and the face orientation angle.
FIGS. 5(a-1), (a-2), (a-3), and (a-4) schematically show how, because the position estimation of facial organ points in face image data is not stable over time, the actual target of the position estimation intended for the eye alternates: eye (a-1) → eyebrow (a-2) → eye (a-3) → eyebrow (a-4). FIGS. 5(b-1), (b-2), (b-3), and (b-4) schematically show how, because the actual target fluctuates as in FIGS. 5(a-1) to (a-4), the eye open/close detection process erroneously outputs open → closed → open → closed even though the eyes actually remain open.
FIG. 6 is a diagram outlining the algorithm according to the present disclosure for estimating the positions of facial organ points and the face orientation angle.

Hereinafter, embodiments according to the present invention will be described in detail with reference to the drawings as appropriate. However, unnecessarily detailed description may be omitted. For example, detailed descriptions of well-known matters and redundant descriptions of substantially identical configurations may be omitted. This is to keep the following description from becoming unnecessarily redundant and to facilitate understanding by those skilled in the art.

Note that the inventors provide the accompanying drawings and the following description so that those skilled in the art can fully understand the present disclosure, and do not intend them to limit the subject matter recited in the claims.

[Background to this disclosure]
Various techniques are known for detecting a human face in image data and then estimating the positions of facial organ points and the face orientation angle of the detected face. Patent Document 1 discloses a three-dimensional face model fitting algorithm, and a detection device using that algorithm, which estimate the positions of facial organ points and the face orientation angle in an image by fitting to the face in the image a three-dimensional face model that defines a plurality of three-dimensional positions corresponding to a plurality of feature points, including facial organ points, on a human face.

However, if the position estimation of facial organ points is not stable over time, it can affect, for example, the accuracy of eye open/close detection, which is one of the processes based on facial organ point positions. FIGS. 5(a-1), (a-2), (a-3), and (a-4) schematically show how, because the position estimation of facial organ points in face image data is not stable over time, the actual target of the position estimation intended for the eye alternates: eye (a-1) → eyebrow (a-2) → eye (a-3) → eyebrow (a-4). FIGS. 5(b-1), (b-2), (b-3), and (b-4) in turn schematically show how, because the actual target fluctuates as in FIGS. 5(a-1) to (a-4), the eye open/close detection process erroneously outputs open → closed → open → closed even though the eyes actually remain open. In this way, position estimation of facial organ points that is not stable over time can affect the accuracy of eye open/close detection.

One conceivable approach is therefore to estimate the positions of facial organ points by a tracking process, exploiting the fact that in video with a frame rate of 15 or 30 frames per second the movement of facial organ points from one frame to the next is generally very small. That is, the facial organ point position estimation result of the previous frame (i.e., the three-dimensional face model fitting result) is used as a starting point, and the facial organ points are searched for (i.e., the fitting process is performed) in its vicinity.

However, while combining a tracking process makes the facial organ point position estimation results, and the eye open/close detection in the subsequent stage, stable over time, the estimated position can also settle at a wrong position, for example when an action such as scratching the face with a hand draws the estimated organ point positions of the whole face toward the side of the face, such as the ears, or when the frame of eyeglasses or the like pulls the estimated organ point positions of the eyes. The three-dimensional face model then stabilizes without fitting the face image properly, and the face orientation estimation result remains offset.

The present disclosure is a technique devised by the inventor to solve these problems. FIG. 6 is a diagram outlining the algorithm according to the present disclosure for estimating the positions of facial organ points and the face orientation angle. As shown in FIG. 6, the tracking process of three-dimensional face model fitting (local search fitting process) ((A1)-(A2), (B1)-(B2)) and the global search fitting process ((C1)-(C2)) are performed in parallel.

For each frame, an estimated reference value is calculated from the global search fitting process ((C1)-(C2)).

In the tracking process, the three-dimensional face model that is the fitting result of the previous frame is placed on the face image. The face orientation angle of that model is then compared with the face orientation estimation reference value (the face orientation angle of the estimated reference value), and the difference is calculated ((A1), (B1)). The difference Diff is calculated using, for example, the yaw angle. If the difference is smaller than a predetermined threshold (A1), the local search fitting process is performed based on the placed three-dimensional face model that is the fitting result of the previous frame (A2). If the difference is equal to or greater than the predetermined threshold (B1), the three-dimensional face model of the estimated reference value, rather than the fitting result of the previous frame, is placed on the face image instead (B1'), and the local search fitting process is performed based on that re-placed model (B2).
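
The branch between (A1) and (B1) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `FaceModel` class, the `choose_start_model` function, the use of an absolute yaw difference, and the 15-degree threshold are all assumptions introduced here, since the text only speaks of "a predetermined threshold".

```python
from dataclasses import dataclass


@dataclass
class FaceModel:
    """Hypothetical container for a fitted 3D face model (yaw only, in degrees)."""
    yaw_deg: float


def choose_start_model(prev_fit, reference, threshold_deg):
    """Pick the model that local search fitting starts from.

    Returns (model, corrected), where corrected is True when the previous
    frame's fit was replaced by the estimated reference value model.
    """
    diff = abs(prev_fit.yaw_deg - reference.yaw_deg)  # Diff, assumed absolute
    if diff >= threshold_deg:
        return reference, True   # (B1) -> (B1'): re-place the reference model
    return prev_fit, False       # (A1): keep the previous frame's fit


# Example with an assumed threshold of 15 degrees.
drifted = FaceModel(yaw_deg=32.0)
reference = FaceModel(yaw_deg=5.0)
start, corrected = choose_start_model(drifted, reference, threshold_deg=15.0)
```

In this example the tracked model has drifted 27 degrees from the reference, so local search fitting would restart from the reference model rather than from the drifted fit.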

This realizes an algorithm that accurately estimates the positions of facial organ points and the face orientation, stably over time and without settling at an erroneous estimated position.

Because the technology according to the present disclosure is based on the tracking process of three-dimensional face model fitting, the facial organ point position estimation results, and the eye open/close detection in the subsequent stage, are stable over time. In addition, the technology constantly compares the tracking result with the estimated reference value calculated from the global search fitting results and, when the deviation is equal to or greater than a predetermined threshold, corrects the tracking result (i.e., replaces it with the three-dimensional face model of the estimated reference value). This suppresses the situation in which the estimated position settles at a wrong position, for example when an action such as scratching the face with a hand draws the estimated organ point positions of the whole face toward the side of the face, such as the ears, or when the frame of eyeglasses or the like pulls the estimated organ point positions of the eyes, so that the three-dimensional face model stabilizes without fitting the face image properly and the face orientation estimation result remains offset.

Accordingly, the technology according to the present disclosure can accurately estimate the positions of facial organ points and the face orientation in a face image, stably over time and without settling at an erroneous estimated position. Moreover, because the purpose of the global search fitting process is to prevent the estimated position from settling at a wrong position, approximate fitting is sufficient for it, which also allows real-time performance to be maintained.

[3D face model fitting used in the present disclosure]
The three-dimensional face model fitting algorithm used in the present disclosure is described here. Various such algorithms exist. The algorithm disclosed in Patent Document 1 is one example, used in the technical field of in-vehicle monitoring sensors, where real-time performance and high accuracy are required. The algorithm used in the present disclosure may be the one disclosed in Patent Document 1 or a different three-dimensional face model fitting algorithm.

The three-dimensional face model fitting algorithm used in the present disclosure is, in outline, as follows. Using learning images, correlation information is acquired in advance between (a) the difference between a correct model, in which each node of the model is placed at the correct position of a facial feature point, and an error model, in which one or more nodes are placed at wrong positions, and (b) the node feature values acquired based on the error model. When facial feature points are detected from an input image, a three-dimensional model defining the three-dimensional positions of a plurality of nodes is created, each node is projected onto the input image, node feature values are acquired from the projected points, and, based on these node feature values and the learned correlation information, an error estimate indicating the deviation between the current position of each node and the position of the corresponding feature point is obtained. Then, based on this error estimate and the current position of each node, the three-dimensional position of each facial feature point in the input image is estimated, and each node is moved accordingly.
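
The detection loop outlined above (sample node features, estimate the error from the learned correlation, move the nodes) might look roughly like the following toy sketch. It is heavily simplified and is not the algorithm of Patent Document 1: nodes are treated as 2D points rather than projected 3D positions, and `fit_nodes`, `toy_features`, `toy_regressor`, and `TRUE_POINTS` are hypothetical stand-ins for the image sampling and the learned correlation.

```python
def fit_nodes(nodes, sample_features, estimate_errors, n_iter=5):
    """Iteratively move model nodes toward the facial feature points.

    nodes            list of (x, y) node positions on the image
    sample_features  callable(nodes) -> node feature values read at the nodes
    estimate_errors  callable(features) -> list of (dx, dy) error estimates,
                     standing in for the correlation learned from training images
    """
    for _ in range(n_iter):
        features = sample_features(nodes)
        errors = estimate_errors(features)
        # Move each node by its estimated error, toward the feature point.
        nodes = [(x - dx, y - dy) for (x, y), (dx, dy) in zip(nodes, errors)]
    return nodes


# Toy stand-ins (assumptions for illustration only).
TRUE_POINTS = [(10.0, 20.0), (30.0, 20.0)]


def toy_features(nodes):
    # Pretend the image features encode each node's offset from the true point.
    return [(x - tx, y - ty) for (x, y), (tx, ty) in zip(nodes, TRUE_POINTS)]


def toy_regressor(features):
    # An imperfect learned mapping: it recovers only half of the true offset.
    return [(0.5 * fx, 0.5 * fy) for fx, fy in features]


start = [(18.0, 12.0), (22.0, 28.0)]
fitted = fit_nodes(start, toy_features, toy_regressor)
```

Because the toy regressor recovers half of the remaining offset on each pass, the nodes approach the true points geometrically; a real regressor would be learned from correct-model and error-model pairs as described above.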

In the "rough fitting" used in the present disclosure, the learning images used in the learning stage for acquiring the correlation are ones in which the difference between the correct model and the error model is relatively large, and the correlation is formed from them.

In the "detailed fitting" used in the present disclosure, by contrast, the learning images used in the learning stage for acquiring the correlation are ones in which the difference between the correct model and the error model is relatively small, and the correlation is formed from them.

The three-dimensional face model fitting algorithm used in the present disclosure may be one other than those described above.

[Application example]
An example to which the face orientation estimation device according to the present disclosure can be applied is described with reference to FIG. 2. FIG. 2 is a diagram for explaining an application example of the face orientation estimation device 14 according to the present disclosure.

FIG. 2 is a block diagram showing the internal configurations of a vehicle control unit 4 and a driver monitoring sensor 12, both of which are mounted on an automobile. The vehicle control unit 4 includes an ECU (electronic control unit) 6 and an actuator 8. There may be a plurality of ECUs 6 and a plurality of actuators 8.

The driver monitoring sensor 12 is a device that monitors, in real time, mainly the driver's facial expression, and includes a camera 16, which is an imaging device, and an image processing unit 14, which is the face orientation estimation device. The image processing unit 14 has a CPU 18 corresponding to a hardware processor, and a ROM (Read Only Memory) 20 and a RAM (Random Access Memory) 22, each corresponding to a memory. These components are connected via appropriate buses so that they can exchange data with one another.
In addition, the CPU 18 of the driver monitoring sensor 12 and the ECU 6 of the vehicle control unit 4 are connected via a CAN (Controller Area Network) 10.

The CPU 18 controls the execution of programs stored in the ROM 20 or the RAM 22 and performs data calculation and processing. The CPU 18 is an arithmetic unit that executes various programs (for example, a program for the three-dimensional face model fitting algorithm). The CPU 18 receives various input data from the camera 16 and from the ECU 6 of the vehicle control unit 4, and outputs the results of computing on the input data to the ECU 6 of the vehicle control unit 4 via the CAN 10 or stores them in the RAM 22.

The ROM 20 is a storage unit from which data can only be read, and is composed of, for example, a semiconductor memory element. The ROM 20 stores, for example, programs such as applications executed by the CPU 18, as well as data.

The RAM 22 is a rewritable storage unit, and is composed of, for example, a semiconductor memory element. The RAM 22 stores, for example, input images from the camera 16.

In the vehicle control unit 4 and the driver monitoring sensor 12 described above, the face orientation estimation device is implemented by the image processing unit 14.

[Configuration example]
An embodiment is described below as a configuration example of the face orientation estimation device 14.

1. Embodiment 1
1.1. Configuration
The configuration of the face orientation estimation device 14 according to Embodiment 1 is described with reference to FIG. 1. FIG. 1 is a block diagram showing the functional configuration of the face orientation estimation device 14 according to the present embodiment.

The face orientation estimation device 14 is composed of a face detection unit 23, a face orientation estimation unit 24, an eye open/close detection unit 40, and a line-of-sight estimation unit 42. The face detection unit 23 detects human face image data from image data captured by the camera 16 or the like. The face orientation estimation unit 24 estimates the positions of facial organ points and the face orientation angle for the detected human face image data. The eye open/close detection unit 40 detects whether the eyes are open or closed based on the detected human face image data and on the estimated positions of facial organ points and face orientation angle. The line-of-sight estimation unit 42 estimates the line-of-sight direction based on the detected human face image data, the estimated positions of facial organ points and face orientation angle, and the detected eye open/close data. Note that the face orientation estimation device 14 need not include the line-of-sight estimation unit 42. Likewise, the face orientation estimation device 14 need not include the eye open/close detection unit 40.

The face orientation estimation unit 24 includes a first three-dimensional face model placement unit 26, a global search three-dimensional face model fitting unit 28, an estimated reference value calculation unit 30, a local search three-dimensional face model fitting unit 32, a second three-dimensional face model placement unit 34, a comparison determination unit 36, and a three-dimensional face model fitting result correction unit 38.

The first three-dimensional face model placement unit 26 performs the initial placement of a three-dimensional face model on the face image data.
The global search three-dimensional face model fitting unit 28 fits the three-dimensional face model to an approximate position on the face image data, based on the initially placed three-dimensional face model. That is, the global search three-dimensional face model fitting unit 28 performs the global search fitting process (see step S34 in FIG. 4). The global search fitting process searches the whole face and fits the model approximately. The fitting performed in the global search fitting process by the global search three-dimensional face model fitting unit 28 is the "rough fitting" described above.

The estimated reference value calculation unit 30 calculates the three-dimensional face model of the estimated reference value, that is, for example, the positions of facial organ points and the face orientation angle serving as the estimated reference value, from the results obtained by the global search fitting process (see step S34 in FIG. 4) over the most recent several frames, for example the most recent five frames.
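
The text does not state how the results of the most recent frames are combined into the estimated reference value. As one possible sketch, the snippet below keeps a sliding window of the last five global search yaw angles and takes their median; the `ReferenceEstimator` class, the choice of the median, and the restriction to the yaw angle alone are assumptions made here for illustration.

```python
from collections import deque
from statistics import median


class ReferenceEstimator:
    """Hypothetical sliding-window estimator of the reference yaw angle."""

    def __init__(self, window=5):
        # deque(maxlen=...) silently drops the oldest value once it is full.
        self._yaws = deque(maxlen=window)

    def update(self, global_fit_yaw_deg):
        """Add the latest global search result and return the reference yaw."""
        self._yaws.append(global_fit_yaw_deg)
        return median(self._yaws)


estimator = ReferenceEstimator(window=5)
for yaw in [0.0, 2.0, 40.0, 1.0, 3.0]:
    reference_yaw = estimator.update(yaw)
```

A median is used in this sketch so that a single poor rough fit (the 40-degree sample) does not pull the reference value; a mean or another filter would fit the same description equally well.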

The local search three-dimensional face model fitting unit 32 performs the local search fitting process (see steps S40 and S52 in FIG. 4) on the face image data, based on the three-dimensional face model that has passed through the global search fitting process (see step S34 in FIG. 4), the three-dimensional face model that is the fitting result of the previous frame, or the three-dimensional face model of the latest estimated reference value. The local search fitting process searches local regions of the face and fits them in detail. The local search fitting process performed by the local search three-dimensional face model fitting unit 32 is described later.

The second three-dimensional face model placement unit 34 places the three-dimensional face model that is the fitting result of the previous frame on the face image data.
The comparison determination unit 36 calculates the difference Diff between the face orientation angle of the three-dimensional face model placed by the second three-dimensional face model placement unit 34 (i.e., the model that is the fitting result of the previous frame) and the face orientation angle of the three-dimensional face model of the estimated reference value calculated by the estimated reference value calculation unit 30, and determines whether the difference Diff is smaller than a predetermined threshold. The difference Diff is calculated using, for example, the yaw angle.

When the comparison determination unit 36 determines that the difference Diff is equal to or greater than the predetermined threshold, the three-dimensional face model fitting result correction unit 38 re-places, on the face image data, the three-dimensional face model of the latest estimated reference value instead of the three-dimensional face model that is the fitting result of the previous frame.

The face orientation estimation unit 24 outputs the positions of the facial organ points and the face orientation angle in the frame, based on the local search fitting process (see steps S40 and S52 in FIG. 4) performed by the local search 3D face model fitting unit 32.

1.2. Operation
The operation of the face orientation estimation device 14 configured as described above is described below.

1.2.1. Overall Operation
FIG. 3 is a flowchart showing the overall operation of the face orientation estimation device 14 according to Embodiment 1. After the face orientation estimation device 14 starts operating (step S02), the tracking flag "TrackingFlag" is set to "FALSE" (step S04) and "t" is initialized (step S06). While a face is detected in consecutive frames, the tracking flag "TrackingFlag" is set to "TRUE" at the end of the processing of the first such frame, and it is set to "FALSE" when a frame in which no face can be detected is reached.

Next, the face detection unit 23 performs face image data detection processing on the image data of the t-th frame (step S08). If a face is detected (step S10: YES), the face orientation estimation unit 24 estimates the positions of the facial organ points and the face orientation angle (step S14), and "t" is incremented (step S16). If no face is detected (step S10: NO), the tracking flag "TrackingFlag" is set to "FALSE" (step S12), and "t" is incremented (step S16).

If "t" has not reached the end value (step S18: NO), the processing from face detection through face orientation estimation is performed for the next frame (from step S08).

When "t" reaches the end value (step S18: YES), the face orientation estimation device 14 ends its operation (step S20).
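The frame loop of FIG. 3 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation of the embodiment: `run_overall`, `detect_face`, and `estimate_orientation` are hypothetical names, the detector is assumed to return `None` when no face is found, and the flag update that the description places at step S42 is folded into the loop.

```python
from typing import Any, Callable, Iterable, List, Optional


def run_overall(
    frames: Iterable[Any],
    detect_face: Callable[[Any], Optional[Any]],
    estimate_orientation: Callable[[Any, bool], Any],
) -> List[Optional[Any]]:
    """Process frames in order; return one result per frame (None if no face)."""
    tracking_flag = False                  # step S04
    results: List[Optional[Any]] = []
    for frame in frames:                   # steps S06, S16, S18: iterate over t
        face = detect_face(frame)          # step S08
        if face is None:                   # step S10: NO
            tracking_flag = False          # step S12
            results.append(None)
            continue
        # Step S14: the estimator is told whether the previous frame had a face.
        results.append(estimate_orientation(face, tracking_flag))
        # Assumption: equivalent to setting the flag at step S42 on the
        # first detected frame; it simply stays TRUE on later frames.
        tracking_flag = True
    return results
```

With a stub estimator that echoes the flag, a detection gap in the middle of a sequence forces the next detected frame to be handled as a first frame again.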

1.2.2. Face Orientation Estimation Process
FIG. 4 is a flowchart showing the details of the face orientation estimation process (step S14 in FIG. 3) performed by the face orientation estimation unit 24 in the face orientation estimation device 14 according to Embodiment 1.

After the face orientation estimation process starts (step S30), the estimated reference value calculation process (steps S32 to S36) is executed first. The first 3D face model placement unit 26 initially places a 3D face model on the face image data (step S32). Next, the global search 3D face model fitting unit 28 fits the 3D face model to an approximate position on the face image data, based on the initially placed 3D face model (step S34). The fitting performed by the global search 3D face model fitting unit 28 in the global search fitting process is the "rough fitting" described above.

Next, the estimated reference value calculation unit 30 calculates (updates) the 3D face model of the estimated reference value, that is, for example, the positions of the facial organ points and the face orientation angle serving as the estimated reference value, from the results of the global search fitting process (step S34) for a predetermined number of the most recent frames, for example the most recent five frames (step S36). The data for the most recent five frames is used, for example, as follows. Taking a predetermined value, for example the face orientation angle, as the criterion, the data of the frame in which that value is largest and the data of the frame in which it is smallest are excluded from the five frames, and the average of the data of the remaining three frames is used as the 3D face model of the estimated reference value.

When only one recent frame is available, its data is used as the estimated reference value. When only two recent frames are available, the average of those two frames is used as the estimated reference value. When only three or four recent frames are available, the data of the frame in which the predetermined value is largest and the data of the frame in which it is smallest are excluded, and the average of the remaining frames is used as the estimated reference value.
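The calculation of the estimated reference value described above amounts to a trimmed mean over a short window of rough-fitting results. The following Python sketch illustrates it under stated assumptions: `FitResult` and `estimated_reference` are hypothetical names, the yaw angle is taken as the "predetermined value" used for ranking, facial organ points are stored as a flat coordinate list, and angles are averaged arithmetically (no wrap-around handling).

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class FitResult:
    yaw: float           # face orientation angle in degrees (ranking key, assumed)
    points: List[float]  # flattened facial organ point coordinates (assumed layout)


def estimated_reference(history: Sequence[FitResult], window: int = 5) -> FitResult:
    """Average the most recent rough-fitting results, dropping the extremes."""
    recent = list(history[-window:])
    if not recent:
        raise ValueError("at least one rough-fitting result is required")
    if len(recent) >= 3:
        # Three or more frames: exclude the frames with the largest and the
        # smallest ranking value, then average what remains.
        recent = sorted(recent, key=lambda r: r.yaw)[1:-1]
    n = len(recent)  # one or two frames are averaged as they are
    yaw = sum(r.yaw for r in recent) / n
    points = [sum(vals) / n for vals in zip(*(r.points for r in recent))]
    return FitResult(yaw=yaw, points=points)
```

Because the extremes are discarded before averaging, a single failed rough fitting inside the window does not move the reference value.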

The estimated reference value is obtained for the following reasons. First, it suppresses the influence of large errors caused by, for example, fitting failures. Second, it exploits the fact that, in video at 15 or 30 frames per second, the movement across five frames can be regarded as very small.

After the estimated reference value calculation process (steps S32 to S36), it is determined whether the tracking flag "TrackingFlag" is "TRUE", that is, whether a face was also detected in the previous frame (step S38). If the tracking flag "TrackingFlag" is not "TRUE" (step S38: NO), still image processing (processing for the first frame in which a face is detected) (steps S40 to S42) is executed. If the tracking flag "TrackingFlag" is "TRUE" (step S38: YES), the tracking process (steps S44 to S52) is executed.

In the still image processing (processing for the first frame in which a face is detected) (steps S40 to S42), the local search 3D face model fitting unit 32 first performs the local search fitting process (described later) on the face image data, based on the 3D face model that has undergone the global search fitting process (step S34) (step S40). Next, the tracking flag "TrackingFlag" is set to "TRUE" (step S42). After the still image processing, the face orientation estimation unit 24 outputs the positions of the facial organ points and the face orientation angle in the frame from the result of the local search fitting process (step S40) (step S54).

In the tracking process (steps S44 to S52), the second 3D face model placement unit 34 first places the 3D face model that is the fitting result of the previous frame on the face image data (step S44). Next, the comparison determination unit 36 calculates the difference Diff between the face orientation angle in the 3D face model placed by the second 3D face model placement unit 34 (that is, the 3D face model that is the fitting result of the previous frame) and the face orientation angle in the 3D face model of the estimated reference value calculated by the estimated reference value calculation unit 30 (step S46). The yaw angle, for example, is used to calculate the difference Diff.

If the difference Diff is equal to or greater than the predetermined threshold (step S48: NO), the 3D face model fitting result correction unit 38 re-places, on the face image data, the 3D face model of the latest estimated reference value instead of the 3D face model that is the fitting result of the previous frame (step S50). The local search 3D face model fitting unit 32 then performs the local search fitting process (described later) on the face image data, based on the 3D face model of the latest estimated reference value (step S52). If the difference Diff is less than the predetermined threshold (step S48: YES), the 3D face model fitting result correction unit 38 performs no processing, and the local search 3D face model fitting unit 32 performs the local search fitting process (described later) on the face image data, based on the 3D face model that is the fitting result of the previous frame (step S52).

Steps S46, S48, and S50 thus constitute the correction process.
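The correction process reduces to a single comparison that decides which model seeds the local search. The sketch below is illustrative only: `select_base_model` is a hypothetical name, the difference is assumed to be the absolute yaw difference in degrees (the description says only that the yaw angle is used, for example), and the threshold value is left to the caller because the description does not give one.

```python
def select_base_model(prev_yaw_deg: float, ref_yaw_deg: float, threshold_deg: float) -> str:
    """Return which 3D face model should seed the local search fitting."""
    # Step S46: difference between the previous-frame result and the
    # estimated reference value (absolute value is an assumption).
    diff = abs(prev_yaw_deg - ref_yaw_deg)
    if diff >= threshold_deg:   # step S48: NO
        return "reference"      # step S50: re-place the estimated reference model
    return "previous"           # step S48: YES, keep the previous-frame result
```

For example, with a hypothetical threshold of 5 degrees, a previous-frame yaw of 10 degrees against a reference of 30 degrees selects the reference model, so tracking cannot stay locked onto a drifted fit.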

After the tracking process (steps S44 to S52), the face orientation estimation unit 24 outputs the positions of the facial organ points and the face orientation angle in the frame from the result of the local search fitting process (step S52) (step S54).

After the positions of the facial organ points and the face orientation angle are output (step S54), the face orientation estimation process ends (step S56).

1.2.3. Local Search Fitting Process
The local search fitting process (steps S40 and S52) in the face orientation estimation process shown in FIG. 4 is described below. Various forms of local search fitting are conceivable; the following is one example.

First, the base 3D face model is set. In this example, as described above, the following three 3D face models can serve as the base:
(1) The 3D face model that has undergone the global search fitting process (that is, the "rough fitting" described above) (step S34 in FIG. 4)
(2) The 3D face model that is the fitting result of the previous frame
(3) The 3D face model of the latest estimated reference value

Next, while the position of the base 3D face model is shifted left and right, up and down, and so on, the "detailed fitting" described above is performed from a plurality of directions, including the position of the base 3D face model itself. The shift amount is, for example, about half the mouth width. The results of the fittings from the plurality of directions, including the position of the base 3D face model, are then integrated. In this integration, only the results whose fitting score, calculated immediately after the "detailed fitting", is equal to or greater than a predetermined threshold (that is, a predetermined first threshold) are integrated, each weighted by its fitting score.
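One way to read the score-weighted integration is as a weighted average over the fitting results that pass the first threshold. The Python sketch below assumes exactly that: `integrate_fittings` is a hypothetical name, each candidate is a pair of a non-negative fitting score and a flat parameter list (for example point coordinates), and `None` is returned when no candidate passes. The description does not specify the integration formula, so this is an interpretation rather than the method of the embodiment.

```python
from typing import List, Optional, Sequence, Tuple

Candidate = Tuple[float, List[float]]  # (fitting score, model parameters)


def integrate_fittings(
    candidates: Sequence[Candidate], first_threshold: float
) -> Optional[List[float]]:
    """Score-weighted average of the candidates that pass the first threshold."""
    kept = [(s, p) for s, p in candidates if s >= first_threshold]
    total = sum(s for s, _ in kept)
    if not kept or total <= 0.0:
        return None  # nothing reliable enough to integrate
    size = len(kept[0][1])
    # Each parameter is averaged with weights proportional to the fitting score.
    return [sum(s * p[i] for s, p in kept) / total for i in range(size)]
```

Discarding low-score candidates before averaging keeps a failed fit from one shifted start position from pulling the integrated model away from the face.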

The two operations above, namely performing the detailed fitting from a plurality of directions, including the position of the base (or integrated) 3D face model, while shifting that position left and right, up and down, and so on, and integrating the results of those fittings, are iterated (repeated), for example, until both of the following conditions are satisfied:
(Condition 1) The integrated fitting score is greater than a predetermined second threshold.
(Condition 2) The amount of change in the face orientation angle is smaller than a predetermined third threshold. Here, the "amount of change in the face orientation angle" is the change from the face orientation angle calculated in the previous iteration.
An upper limit is placed on the number of iterations so that the processing can be terminated (cut off).
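The stopping rule can be sketched as a bounded loop. In this minimal Python illustration, `iterate_local_search` and `refine_once` are hypothetical names: `refine_once` stands in for one round of shifted detailed fittings plus integration and is assumed to return the new face orientation angle together with the integrated fitting score. On the first pass the change is measured against the angle of the base model, which is also an assumption.

```python
from typing import Callable, Tuple


def iterate_local_search(
    base_angle: float,
    refine_once: Callable[[float], Tuple[float, float]],
    second_threshold: float,
    third_threshold: float,
    max_iterations: int,
) -> Tuple[float, int]:
    """Repeat fit-and-integrate until both conditions hold or the cap is hit."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    angle = base_angle
    iterations = 0
    while iterations < max_iterations:      # upper limit on the iteration count
        new_angle, score = refine_once(angle)
        change = abs(new_angle - angle)     # change from the previous iteration
        angle = new_angle
        iterations += 1
        # Condition 1 and Condition 2 must both be satisfied to stop early.
        if score > second_threshold and change < third_threshold:
            break
    return angle, iterations
```

Requiring both conditions means the loop does not stop on a high score while the angle is still moving, and the cap bounds the per-frame cost when the fit never settles.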

The result of the local search fitting is obtained through the above steps. The result obtained in this way provides a stable and accurate estimate of the face orientation.

Alternatively, the "detailed fitting" described above may be used as is for the local search fitting process (steps S40 and S52).

1.3. Summary
As described above, the face orientation estimation device 14 according to the present embodiment includes the face detection unit 23, which detects face image data of a person from image data, and the face orientation estimation unit 24, which estimates the positions of facial organ points and the face orientation angle for the detected face image data. When the face images detected by the face detection unit 23 continue over a plurality of frames, the face orientation estimation unit 24 executes, for each frame, (1) an estimated reference value calculation process that obtains an estimated reference value from the global search fitting process, and (2) a tracking process that performs the local search fitting process based on the 3D face model that is the fitting result of the previous frame. The (2) tracking process includes a correction process. The correction process calculates the difference between the face orientation angle in the 3D face model that is the fitting result of the previous frame and the face orientation angle in the 3D face model of the estimated reference value calculated by the (1) estimated reference value calculation process. When the difference is equal to or greater than a predetermined threshold, the 3D face model of the latest estimated reference value, instead of the 3D face model that is the fitting result of the previous frame, is used as the 3D face model on which the local search fitting process is based. The face orientation estimation unit 24 outputs the positions of the facial organ points and the face orientation angle for each frame from the result of the local search fitting process.

The face orientation estimation device according to the present embodiment described above can accurately estimate the positions of facial organ points and the face orientation angle of a person's face in image data, with temporal stability and without settling at an erroneous estimated position.

(Other Embodiments)
As described above, Embodiment 1 has been described as an example of the technology disclosed in the present application. However, the technology of the present disclosure is not limited to this embodiment and is also applicable to embodiments in which changes, replacements, additions, omissions, and the like are made as appropriate.

The face orientation estimation device according to Embodiment 1 is assumed to be applied to a driver monitoring sensor mounted on an automobile, but its application is not limited to driver monitoring sensors. For example, it may be applied to a monitoring system that monitors the facial expressions of workers in a factory, or to a detection system in which cameras installed at a station, a public square, or a similar place detect a specific person and that person's facial expression.

The accompanying drawings and the detailed description have been provided to explain the embodiment. Accordingly, the components described in the accompanying drawings and the detailed description may include not only components essential to solving the problem but also components that are not essential to solving the problem and are included to illustrate the above technology. The mere fact that such non-essential components appear in the accompanying drawings or the detailed description should therefore not be taken to mean that they are essential.

Furthermore, because the embodiment described above is intended to illustrate the technology of the present disclosure, various changes, replacements, additions, omissions, and the like can be made within the scope of the claims or their equivalents.

4: vehicle control unit, 6: ECU, 8: actuator, 10: CAN, 12: driver monitoring sensor, 14: face orientation estimation device (image processing unit), 16: camera, 18: CPU, 20: ROM, 22: RAM, 23: face detection unit, 24: face orientation estimation unit, 26: first 3D face model placement unit, 28: global search 3D face model fitting unit, 30: estimated reference value calculation unit, 32: local search 3D face model fitting unit, 34: second 3D face model placement unit, 36: comparison determination unit, 38: 3D face model fitting result correction unit, 40: eye open/close detection unit, 42: gaze estimation unit.

Claims

1. A face orientation estimation device comprising:
a face detection unit that detects face image data of a person from image data; and
a face orientation estimation unit that estimates positions of facial organ points and a face orientation angle for the detected face image data of the person,
wherein, when face images detected by the face detection unit continue over a plurality of frames, the face orientation estimation unit executes, for each frame,
(1) an estimated reference value calculation process that obtains an estimated reference value from a global search fitting process, and
(2) a tracking process that performs a local search fitting process based on a 3D face model that is the fitting result of the previous frame,
the (2) tracking process includes a correction process,
the correction process calculates a difference between the face orientation angle in the 3D face model that is the fitting result of the previous frame and the face orientation angle in the 3D face model of the estimated reference value calculated by the (1) estimated reference value calculation process and, when the difference is equal to or greater than a predetermined threshold, uses the 3D face model of the latest estimated reference value, instead of the 3D face model that is the fitting result of the previous frame, as the 3D face model on which the local search fitting process is based, and
the face orientation estimation unit outputs the positions of the facial organ points and the face orientation angle for each frame from the result of the local search fitting process.

2. The face orientation estimation device according to claim 1, wherein the estimated reference value is calculated from the results obtained by the global search fitting process for a predetermined number of the most recent frames.

3. A computer-implemented face orientation estimation method comprising:
a step of detecting face image data of a person from image data;
a step of estimating positions of facial organ points and a face orientation angle for the detected face image data of the person; and
a step of outputting the positions of the facial organ points and the face orientation angle in the face image data,
wherein, when face images detected in the detecting step continue over a plurality of frames, the estimating step includes, for each frame,
(1) an estimated reference value calculation process that obtains an estimated reference value from a global search fitting process, and
(2) a tracking process that performs a local search fitting process based on a 3D face model that is the fitting result of the previous frame,
the (2) tracking process includes a correction process,
the correction process calculates a difference between the face orientation angle in the 3D face model that is the fitting result of the previous frame and the face orientation angle in the 3D face model of the estimated reference value calculated by the (1) estimated reference value calculation process and, when the difference is equal to or greater than a predetermined threshold, uses the 3D face model of the latest estimated reference value, instead of the 3D face model that is the fitting result of the previous frame, as the 3D face model on which the local search fitting process is based, and
in the outputting step, the positions of the facial organ points and the face orientation angle are output for each frame from the result of the local search fitting process.

4. The face orientation estimation method according to claim 3, wherein the estimated reference value is calculated from the results obtained by the global search fitting process for a predetermined number of the most recent frames.