JP6326808B2

JP6326808B2 - Face image processing apparatus, projection system, image processing method and program

Info

Publication number: JP6326808B2
Application number: JP2013261088A
Authority: JP
Inventors: 哲司牧野; 雅昭佐々木; 優一宮本
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2013-12-18
Filing date: 2013-12-18
Publication date: 2018-05-23
Anticipated expiration: 2033-12-18
Also published as: JP2015118514A

Description

The present invention relates to a face image processing apparatus, a projection system, an image processing method, and a program.

Conventionally, a technique is known for deforming a face image based on the face region of a subject in a captured image and control points for expressing a facial expression (see, for example, Patent Document 1). Also known are a method for detecting the degree of mouth opening from the luminance distribution of the mouth region in a face image (see, for example, Patent Document 2) and a method for detecting the degree of mouth opening from the area of the mouth region or tooth region in a face image (see, for example, Patent Document 3).

JP 2012-185624 A; JP 2009-231879 A; JP 2005-234686 A

However, the technique described in Patent Document 2, which detects the degree of mouth opening, may fail to detect it properly because the luminance distribution changes with makeup and skin color. In addition, because the technique described in Patent Document 3 is intended to identify facial expressions, it is difficult to use it to determine whether an image is suitable for deformation of a face image.

The present invention has been made in view of these problems, and its object is to provide a face image processing apparatus, a projection system, an image processing method, and a program capable of properly specifying the open/closed state of a mouth.

To solve the above problem, a face image processing apparatus according to the present invention comprises:
acquisition means for acquiring a face image; first detection means for detecting a mouth from the face image acquired by the acquisition means; second detection means for detecting the degree of difference between the color information, in a predetermined color space, of a central region and a peripheral region of the mouth detected by the first detection means; and specifying means for specifying the open/closed state of the mouth based on the detection result of the second detection means.

A projection system according to the present invention comprises:
the face image processing apparatus of the present invention and a projection device that projects a face image onto a screen, wherein the face image processing apparatus further comprises deformation means for applying deformation processing to the image of a mouth that the specifying means has specified as closed, and the projection device projects the image of the mouth subjected to the deformation processing by the deformation means onto the screen.

An image processing method according to the present invention is
an image processing method using a face image processing apparatus, comprising: a process of acquiring a face image; a process of detecting a mouth from the acquired face image; a process of detecting the degree of difference between the color information, in a predetermined color space, of a central region and a peripheral region of the detected mouth; and a process of specifying the open/closed state of the mouth based on the detected degree of difference between the color information.

A program according to the present invention
causes a computer of a face image processing apparatus to function as: acquisition means for acquiring a face image; first detection means for detecting a mouth from the face image acquired by the acquisition means; second detection means for detecting the degree of difference between the color information, in a predetermined color space, of a central region and a peripheral region of the mouth detected by the first detection means; and specifying means for specifying the open/closed state of the mouth based on the detection result of the second detection means.

According to the present invention, the open/closed state of a mouth can be specified properly.

FIG. 1 is a diagram schematically showing the overall configuration of a projection system according to an embodiment to which the present invention is applied.
FIG. 2 is a diagram schematically showing a projection state of the projection system of FIG. 1.
FIG. 3 is a block diagram showing the functional configuration of the image processing apparatus constituting the projection system of FIG. 1.
FIG. 4 is a flowchart showing an example of operations related to projection processing by the projection system of FIG. 1.
FIG. 5 is a flowchart showing an example of operations related to face image specifying processing in the projection processing of FIG. 4.
FIG. 6 is a diagram schematically showing an example of images related to the face image specifying processing of FIG. 5.

Specific embodiments of the present invention are described below with reference to the drawings. However, the scope of the invention is not limited to the illustrated examples.
FIG. 1 is a diagram schematically showing the overall configuration of a projection system 100 according to an embodiment to which the present invention is applied. FIG. 2(a) is a front view of the projection system 100 schematically showing a state in which video content is not projected, and FIG. 2(b) is a front view of the projection system 100 schematically showing a state in which video content is projected.

As shown in FIG. 1, the projection system 100 of this embodiment includes an image processing apparatus 1, an imaging device 2, a projection device 3, a speaker 4, a screen 5, and the like. The projection system 100 projects onto the screen 5 video content in which a projection target such as a person, character, or animal explains a product or the like. FIG. 1 schematically shows the positional relationship between the audience O of the video content and each component of the projection system 100.
The audience O may be one person or several people.

FIG. 3 is a block diagram showing the functional configuration of the image processing apparatus 1.
As shown in FIG. 3, the image processing apparatus 1 includes a central control unit 101, an operation input unit 102, a display unit 103, a recording unit 104, an image processing unit 105, an imaging control unit 106, first to third I/Fs 107, 108, and 109, and the like. The central control unit 101, the display unit 103, the recording unit 104, the image processing unit 105, the imaging control unit 106, and the first to third I/Fs 107, 108, and 109 are connected via a bus line 110.
The image processing apparatus 1 may be, for example, a computer such as a personal computer or workstation, or a terminal device. The image processing apparatus 1 may also be configured to correct the projection image (frame image) of each frame of the video content so that the projected image of the video content on the screen 5 faces the audience O.

Although not illustrated, the central control unit 101 includes, for example, a CPU (Central Processing Unit) and a RAM (Random Access Memory). The CPU of the central control unit 101 reads a program recorded in a predetermined recording area of the recording unit 104, loads it into the work area of the RAM, and executes various processes according to the loaded program. The RAM of the central control unit 101 is a volatile memory and has a work area for storing the various programs executed by the CPU and the data related to those programs.

The operation input unit 102 includes operation units such as a mouse and a keyboard made up of, for example, data input keys for entering numerical values and characters, up/down/left/right movement keys for selecting and scrolling data, and various function keys. It outputs a predetermined operation signal to the central control unit 101 in response to operation of these operation units.

The display unit 103 is composed of, for example, an LCD (Liquid Crystal Display) and displays various information in its display area according to display control signals from the central control unit 101.

The recording unit 104 is composed of, for example, an HDD (Hard Disk Drive) or a semiconductor nonvolatile memory. The recording unit 104 records the system program and various processing programs executed by the central control unit 101, the data needed to execute these programs, and the like. The programs are stored in a predetermined recording area, for example in the form of computer-readable program code.

The recording unit 104 also records video data 104a of the video content to be projected.
The video data 104a consists of, for example, data of the projection image of each frame making up the moving image data of a three-dimensional model (for example, an image in which the projection target faces the front) and audio data corresponding to each projection image. A three-dimensional model is an image of a three-dimensional object composed of three-dimensional polygons, three-dimensional curved surfaces, texture images, and the like.
The video data 104a also includes motion information (for example, motion information of each face component) for expressing the motion of the projection target (in particular, a human face) included in the video content. The motion information indicates the motion of a plurality of control points in a predetermined space; for example, information indicating the position coordinates (x, y) of the control points in the predetermined space, deformation vectors, and the like are arranged along the time axis.
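The motion information described above can be pictured as a per-frame list of control-point coordinates. The following is only a minimal sketch of one possible layout; the class name `MotionInfo`, its field, and the method are hypothetical and are not taken from the patent. It shows how a deformation vector for each control point could be derived from consecutive frames.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class MotionInfo:
    """Hypothetical container: one list of (x, y) control points per frame."""
    frames: List[List[Point]]

    def deformation_vectors(self, t: int) -> List[Point]:
        """Displacement of each control point from frame t to frame t + 1."""
        current, following = self.frames[t], self.frames[t + 1]
        return [(nx - cx, ny - cy)
                for (cx, cy), (nx, ny) in zip(current, following)]


# Two control points tracked over two frames (illustrative values only).
motion = MotionInfo(frames=[
    [(10.0, 20.0), (30.0, 20.0)],
    [(10.0, 22.0), (30.0, 18.0)],
])
vectors = motion.deformation_vectors(0)
```

Storing positions and deriving vectors on demand is just one design choice; the text equally allows the vectors themselves to be stored along the time axis.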

The image processing unit 105 includes an image acquisition unit 105a, a face detection unit 105b, a mouth detection unit 105c, a first difference degree detection unit 105d, a second difference degree detection unit 105e, an open/closed state specifying unit 105f, and an image deformation unit 105g.
Each unit of the image processing unit 105 is composed of, for example, a predetermined logic circuit, but this configuration is only an example and is not limiting.

The image acquisition unit (acquisition means) 105a acquires an image that includes a face.
That is, the image acquisition unit 105a acquires, for example, a copy of the image data of an image that includes the face of the audience O captured by the imaging device 2 (for example, an image of the face only or an image from the chest up). Specifically, the image acquisition unit 105a acquires a copy of the image data of the image captured by the imaging device 2 as the processing target image.
When the imaging device 2 captures the face of the audience O again, the image acquisition unit 105a newly acquires a copy of the image data of the image that includes the face of the audience O captured by the imaging device 2.

The image acquisition unit 105a may instead read and acquire an image that was recorded in the recording unit 104 after the imaging device 2 captured the audience O. Each process of the image processing unit 105 described below may be performed on the image data of the processing target image itself or, as needed, on reduced image data of a predetermined size (for example, VGA size) obtained by reducing the image data at a predetermined ratio.

The face detection unit 105b detects a face region A1 from the processing target image.
That is, the face detection unit 105b performs predetermined face detection processing on the image data of the processing target image acquired by the image acquisition unit 105a and detects the face region A1 containing the face (see FIG. 6(a)).
Because face detection processing is a known technique, a detailed description is omitted here.

The mouth detection unit (first detection means) 105c detects the mouth P from the image including the face that the image acquisition unit 105a acquired as the processing target image (see FIG. 6(a)).
That is, the mouth detection unit 105c uses predetermined detection processing (for example, AAM (Active Appearance Model)) to detect the main face components, such as the left and right eyes, nose, mouth, eyebrows, and face contour, from the face region A1 detected in the processing target image by the face detection processing.
Here, AAM is a method of modeling visual phenomena and is a process that models the image of an arbitrary face region. For example, statistical analysis results for the positions and pixel values (for example, luminance values) of predetermined feature parts (for example, the outer corners of the eyes, the tip of the nose, and the face line) in a plurality of sample face images are prepared in advance. Using the positions of these feature parts as a reference, the mouth detection unit 105c sets a shape model representing the shape of the face and a texture model representing the "Appearance" in the average shape, and models the face image using these models. As a result, the face components in the face region A1, such as the left and right eyes, nose, mouth, eyebrows, and face contour, are modeled.
Various other processes, such as edge extraction, anisotropic diffusion, or template matching, may be used as the detection processing. Although the mouth detection unit 105c here detects the main face components including the mouth P, this is only an example and is not limiting; the configuration can be changed as appropriate as long as at least the mouth P is detected.

The first difference degree detection unit (second detection means) 105d detects the degree of difference between the color information, in a predetermined color space, of the central region and the peripheral region of the mouth P detected by the mouth detection unit 105c.
That is, the first difference degree detection unit 105d sets, for example, regions of a predetermined shape (for example, substantially rectangular) arranged in a predetermined direction (for example, the vertical direction in FIG. 6(b)) so as to overlap the mouth P: a central region B1, and an upper lip region B2 and a lower lip region B3 that sandwich the central region B1 in the predetermined direction. The first difference degree detection unit 105d then converts the image data in the central region B1, the upper lip region B2, and the lower lip region B3 into a predetermined color space (for example, the HSV color space) and generates a color map of each region in that color space. It then performs a predetermined calculation based on the color map of the central region B1 and the color maps of the upper lip region B2 and the lower lip region B3, and detects the degree of difference between the distribution of the color map of the central region B1 and that of the upper lip region B2 and the lower lip region B3 as the first difference degree.
Within the central region B1, the upper lip region B2, and the lower lip region B3, any part lying outside the area showing the mouth P may be ignored.
One way to detect the degree of difference between color maps is, for example, to extract from each region's color map an arbitrary number of representative values, that is, the values used most frequently when error is allowed for, to compute the differences between the values of the regions over all pairs, and to take the sum of those differences as the degree of difference.
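The representative-value comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: pixels are assumed to be 8-bit RGB tuples, quantizing each HSV channel into `bins` levels stands in for "allowing for error", and the function names, `k`, and `bins` are all hypothetical.

```python
import colorsys
from collections import Counter
from itertools import product


def representative_colors(pixels, k=2, bins=8):
    """Return up to k of the most frequent quantized HSV colors of a region.

    Quantizing into `bins` levels per channel groups near-identical colors
    together (an assumed stand-in for the error tolerance in the text).
    """
    counts = Counter()
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        key = tuple(min(int(c * bins), bins - 1) for c in (h, s, v))
        counts[key] += 1
    # Represent each bin by its center value in [0, 1].
    return [tuple((i + 0.5) / bins for i in key)
            for key, _ in counts.most_common(k)]


def color_distance(a, b):
    """Sum of per-channel differences; hue is treated as circular."""
    dh = abs(a[0] - b[0])
    dh = min(dh, 1 - dh)
    return dh + abs(a[1] - b[1]) + abs(a[2] - b[2])


def first_difference_degree(center_pixels, lip_pixels, k=2):
    """Sum the differences over all pairs of representative colors."""
    center = representative_colors(center_pixels, k)
    lips = representative_colors(lip_pixels, k)
    return sum(color_distance(c, l) for c, l in product(center, lips))


# Illustrative data: reddish lips, whitish teeth visible in an open mouth.
lip_pixels = [(180, 60, 70)] * 20
teeth_pixels = [(240, 240, 235)] * 20

closed_score = first_difference_degree(lip_pixels, lip_pixels)
open_score = first_difference_degree(teeth_pixels, lip_pixels)
```

With these sample values the closed-mouth score is zero because both regions share one representative color, while the open-mouth score is clearly larger.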

Although the upper lip region B2 and the lower lip region B3 sandwiching the central region B1 in the predetermined direction are given as examples of the peripheral region of the mouth P, this is only an example and is not limiting; the peripheral region can be changed as appropriate, for example to an annular region surrounding the central region B1 of the mouth P.

The second difference degree detection unit (third detection means) 105e detects the degree of difference between the edge detection evaluation values of the central region of the mouth P detected by the mouth detection unit 105c and a region adjacent to it in a predetermined direction.
That is, the second difference degree detection unit 105e performs predetermined edge detection processing (for example, a Hough transform or a predetermined matrix operation using a Sobel filter) on the image data in the central region B1, the upper lip region B2, and the lower lip region B3 set by the first difference degree detection unit 105d, and detects the edges of each region. The second difference degree detection unit 105e then performs a predetermined calculation based on the edge detection result for the central region B1 to calculate a central evaluation value for edges in the vertical direction, performs a predetermined calculation based on the edge detection results for the upper lip region B2 and the lower lip region B3 to calculate a lip evaluation value for edges in the vertical direction, and detects the degree of difference between the central evaluation value and the lip evaluation value as the second difference degree.
The edge detection processing described above is only an example and is not limiting; it can be changed as appropriate. For example, a differential filter, a Prewitt filter, a Laplacian filter, or the like may be used instead of the Sobel filter, or any two or more of these may be used.
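As an illustration of the edge-based comparison, the sketch below computes an evaluation value per region with a Sobel kernel and takes the absolute difference. It rests on assumptions not stated in the source: regions are given as grayscale grids (lists of rows), the evaluation value is the mean absolute response of the horizontal-gradient Sobel kernel (which responds to vertically running edges such as the gaps between teeth), and all function names are hypothetical.

```python
# Horizontal-gradient Sobel kernel; it responds to vertically running edges.
SOBEL_X = ((-1, 0, 1),
           (-2, 0, 2),
           (-1, 0, 1))


def edge_evaluation_value(gray):
    """Mean absolute Sobel response over the interior pixels of a region."""
    height, width = len(gray), len(gray[0])
    total, count = 0, 0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            response = sum(SOBEL_X[j][i] * gray[y + j - 1][x + i - 1]
                           for j in range(3) for i in range(3))
            total += abs(response)
            count += 1
    return total / count if count else 0.0


def second_difference_degree(center_gray, lip_gray):
    """Absolute difference between the central and lip evaluation values."""
    return abs(edge_evaluation_value(center_gray)
               - edge_evaluation_value(lip_gray))


# Illustrative data: bright teeth separated by dark gaps vs. uniform lips.
teeth_region = [[250, 250, 40, 250, 250, 40, 250] for _ in range(5)]
lip_region = [[150] * 7 for _ in range(5)]

lip_value = edge_evaluation_value(lip_region)
difference = second_difference_degree(teeth_region, lip_region)
```

A uniform lip region yields an evaluation value of zero, so any tooth-gap edges in the central region show up directly as a positive difference.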

Although the upper lip region B2 and the lower lip region B3 sandwiching the central region B1 in the vertical direction are given as examples of the region adjacent to the central region B1 of the mouth P in the predetermined direction, this is only an example and is not limiting; it can be changed as appropriate. Only one of the upper lip region B2 and the lower lip region B3 (for example, the upper lip region B2) may be used.

The open/closed state specifying unit (specifying means) 105f specifies the open/closed state of the mouth P detected by the mouth detection unit 105c.
That is, the open/closed state specifying unit 105f specifies the open/closed state of the mouth P based on the degree of difference (first difference degree) that the first difference degree detection unit 105d detected between the color map of the central region B1 of the mouth P and the color maps of the upper lip region B2 and the lower lip region B3. Specifically, the open/closed state specifying unit 105f determines whether the first difference degree detected by the first difference degree detection unit 105d is larger than a predetermined value and, if it is, specifies that the mouth P is open.
This is because, when the mouth P is open, the teeth are visible between the upper and lower lips, so the degree of difference (first difference degree) between the color map of the central region B1 and the color maps of the upper lip region B2 and the lower lip region B3 is expected to be large.

The open/closed state specifying unit 105f also specifies the open/closed state of the mouth P based on the degree of difference (second difference degree) that the second difference degree detection unit 105e detected between the edge detection evaluation values of the central region B1 of the mouth P and of the upper lip region B2 and the lower lip region B3. Specifically, the open/closed state specifying unit 105f determines whether the second difference degree detected by the second difference degree detection unit 105e is larger than a predetermined value and, if it is, specifies that the mouth P is open.
This is because, when the mouth P is open, the teeth are visible between the upper and lower lips, so the degree of difference (second difference degree) between the central evaluation value for vertical edges in the central region B1, which mainly corresponds to the teeth, and the lip evaluation value for vertical edges in the upper lip region B2 and the lower lip region B3 is expected to be large.
Instead of determining whether the degree of difference between the central evaluation value and the lip evaluation value is larger than a predetermined value, the open/closed state specifying unit 105f may determine whether the central evaluation value is larger than the lip evaluation value and specify that the mouth P is open when it is.
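The two threshold tests can be summarized in a few lines. The sketch below is an assumption-laden illustration: the threshold values are arbitrary placeholders for the "predetermined values", the function names are hypothetical, and combining the two tests with a logical OR is an assumption, since this passage does not state how the two results are combined.

```python
def is_mouth_open(first_difference, second_difference,
                  first_threshold=0.5, second_threshold=100.0):
    """Judge the mouth open if either difference degree exceeds its threshold.

    The thresholds and the OR combination are illustrative assumptions.
    """
    return (first_difference > first_threshold
            or second_difference > second_threshold)


def is_mouth_open_by_edges(center_value, lip_value):
    """Alternative test from the text: compare the evaluation values directly."""
    return center_value > lip_value


closed_result = is_mouth_open(first_difference=0.0, second_difference=0.0)
open_result = is_mouth_open(first_difference=1.1, second_difference=0.0)
```

In practice the thresholds would need tuning against sample images of open and closed mouths.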

The image deformation unit (deformation means) 105g applies deformation processing to the image of the mouth P that the open/closed state specifying unit 105f has specified as closed.
That is, the image deformation unit 105g sets a plurality of control points at predetermined positions on the mouth P detected by the mouth detection unit 105c and deforms the mouth P by displacing the positions of the set control points.
Because deformation processing is a known technique, a detailed description is omitted here.

The image deformation unit 105g also composites the image of the mouth P detected by the mouth detection unit 105c at the position corresponding to the mouth in the face region of the projection target included in the video content projected from the projection device 3. For example, the image deformation unit 105g replaces and composites the image of the mouth P so that a given position (for example, the centroid) of the mouth P detected by the mouth detection unit 105c coincides with the corresponding predetermined position (for example, the centroid) of the mouth in the face region of the projection target. The image deformation unit 105g then performs deformation processing that deforms the mouth P based on the motion information of the mouth P (for example, deformation vectors) associated with the video content.
The image processing unit 105 outputs the image data of the image including the mouth P deformed by the image deformation unit 105g to the projection device 3 together with the video data 104a.
As a method of compositing the mouth P, Poisson image editing may be used, which generates an image in which the color and gradient change more naturally at the boundary between the interpolation target region and the other regions and at the boundaries between multiple replacement images. However, this is only an example and is not limiting; the method can be changed as appropriate.
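The centroid alignment mentioned above amounts to computing a single translation. The following minimal sketch assumes each mouth is described by a list of (x, y) points (for example, its control points) and uses hypothetical function names; the blending step itself (such as Poisson image editing) is not shown.

```python
def centroid(points):
    """Arithmetic mean of a non-empty list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def paste_offset(source_mouth_points, target_mouth_points):
    """Translation that moves the source mouth centroid onto the target's."""
    sx, sy = centroid(source_mouth_points)
    tx, ty = centroid(target_mouth_points)
    return (tx - sx, ty - sy)


# Illustrative outlines: the captured mouth and the mouth of the projected face.
captured_mouth = [(0, 0), (4, 0), (4, 2), (0, 2)]
projected_mouth = [(10, 10), (14, 10), (14, 12), (10, 12)]
offset = paste_offset(captured_mouth, projected_mouth)
```

Adding `offset` to every pixel coordinate of the captured mouth image places it so that the two centroids coincide before compositing.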

The imaging control unit 106 controls the imaging of the audience O by the imaging device 2. Specifically, the imaging control unit 106 includes a re-imaging request unit (request means) 106a that requests acquisition of a new face image.
When the open/closed state specifying unit 105f specifies that the mouth P is open, the re-imaging request unit 106a outputs to the imaging device 2, via the first I/F 107, a control signal that causes the imaging device 2 to capture the face of the audience O again. At this time, the re-imaging request unit 106a outputs to the speaker 4, via the third I/F 109, a control signal that causes the speaker 4 to output voice guidance with predetermined content (for example, a notice that the mouth P of the audience O is open or a prompt to re-capture the face of the audience O).

The first I/F 107 is an interface that is connected to the imaging device 2 and exchanges data with the imaging device 2.

The imaging device 2 is imaging means for imaging the audience O looking at the screen 5. The imaging device 2 is arranged directly above the screen 5 with its optical axis perpendicular to the surface of the screen 5. In accordance with instructions from the image processing apparatus 1, it images the area in front of the screen 5 and transmits the captured image to the image processing apparatus 1 via the first I/F 107.

The first I/F 108 is an interface that is connected to the projection device 3 and exchanges data with the projection device 3.

The projection device 3 is, for example, a rear-projection device that projects video content from behind the screen 5 based on the projection image transmitted from the image processing apparatus 1. A DLP (Digital Light Processing) (registered trademark) projector can be used as the projection device 3, for example. Such a projector uses a DMD (digital micromirror device), a display element that forms an optical image from reflected light by switching the tilt angle of each of a plurality of micromirrors arranged in an array (for example, 1024 horizontal × 768 vertical pixels for XGA) on and off individually at high speed based on the projection image. By turning on the micromirrors at positions corresponding to the region of the projection target in the projection image and turning off those in the other regions, only the projection target can be projected onto the screen 5.
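The on/off selection of micromirrors can be pictured with a small sketch. It assumes, purely for illustration, that the projection target occupies a rectangle given as `(left, top, right, bottom)` with exclusive right and bottom edges; in the embodiment the region follows the shape of the projection target, and `mirror_pattern` is a hypothetical name.

```python
XGA_WIDTH, XGA_HEIGHT = 1024, 768  # mirror array size for XGA


def mirror_pattern(region, width=XGA_WIDTH, height=XGA_HEIGHT):
    """Return a height x width grid of booleans (True = mirror on).

    `region` is (left, top, right, bottom); right and bottom are exclusive.
    Mirrors inside the region are on, all others are off.
    """
    left, top, right, bottom = region
    return [
        [left <= x < right and top <= y < bottom for x in range(width)]
        for y in range(height)
    ]
```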

When the image data of the image including the mouth P subjected to the deformation process by the image deformation unit 105g and the video data 104a of the video content, both output from the image processing unit 105, are input via the first I/F 108, the projection device 3 projects video content in which the mouth P composited into the face area of the projection target deforms (for example, lip-syncs) in accordance with the content.

The screen 5 is supported by a support base 51 so as to be perpendicular to the floor surface, and is placed in the direction in which the projection device 3 emits its output light.
As shown in FIG. 1, the screen 5 is formed by attaching a screen film 53 for rear projection to the upper front surface of a base material 52, such as a transparent acrylic plate molded into the shape of the projection target, and attaching a film 54 on which the lower half of the projection target's body is printed to the lower part.
In the present embodiment, the projection system 100 is configured to project video content of the upper half of the projection target's body onto the screen 5; however, this is only an example. For instance, the screen film 53 for rear projection may be attached to the entire surface of the base material 52 of the screen 5 so that video content of the whole body of the projection target is projected onto the screen 5.

The third I/F 109 is an interface that is connected to the speaker 4 and exchanges data with the speaker 4.

In response to an instruction output from the central control unit 101 and input via the third I/F 109, the speaker 4 converts predetermined audio data (for example, audio data included in the video data 104a) into an audio signal and outputs it.

<Projection processing>
Next, projection processing by the projection system 100 will be described with reference to FIGS. 4 to 6.
FIG. 4 is a flowchart illustrating an example of operations related to the projection processing.

As shown in FIG. 4, the imaging device 2 first images the audience O to generate a live view image including a face (for example, an image of the face only or an image from the chest up), and the image acquisition unit 105a of the image processing apparatus 1 acquires a copy of the image data of the generated live view image (step S1).
Next, the face detection unit 105b performs a predetermined face detection process on the image data of the live view image acquired by the image acquisition unit 105a to detect a face area A1 containing the face (step S2). The image processing unit 105 then determines whether the face detection unit 105b has detected the face area A1 (step S3).

If it is determined in step S3 that the face area A1 has not been detected (step S3; NO), the image processing unit 105 returns the process to step S2 and performs the face detection process at predetermined time intervals.
If, on the other hand, it is determined in step S3 that the face area A1 has been detected (step S3; YES), the image processing unit 105 performs a face image specifying process (see FIG. 5) that specifies the face image to be composited into the projected video content (step S4).
The face image specifying process is described in detail below with reference to FIG. 5.

<Face image specifying process>
FIG. 5 is a flowchart illustrating an example of operations related to the face image specifying process.
FIGS. 6(a) and 6(b) schematically illustrate examples of images related to the face image specifying process; the images are not limited to these examples and may be changed as appropriate.

As shown in FIG. 5, the imaging device 2 first images the audience O at a predetermined timing in accordance with voice guidance emitted from the speaker 4 to generate an image including a face, and the image acquisition unit 105a of the image processing apparatus 1 acquires a copy of the image data of the captured image as a processing target image (step S11).
Next, the face detection unit 105b performs a predetermined face detection process on the image data of the processing target image acquired by the image acquisition unit 105a to detect a face area A1 containing the face (see FIG. 6(a)) (step S12).

Subsequently, the mouth detection unit 105c detects the mouth P from the face area A1 in the processing target image detected by the face detection process, using a predetermined detection process (for example, AAM) (step S13). At this time, main facial components such as the left and right eyes, the nose, the eyebrows, and the face contour may be detected together with the mouth P.

Next, the first difference detection unit 105d sets a central region B1, an upper lip region B2, and a lower lip region B3 so that they overlap the mouth P (see FIG. 6(b)), converts the image data in the central region B1, the upper lip region B2, and the lower lip region B3 into a predetermined color space (for example, the HSV color space), and generates a color map of each region in that color space (step S14). The first difference detection unit 105d then performs a predetermined calculation to detect, as a first degree of difference, the degree of difference between the distribution of the color map of the central region B1 and those of the upper lip region B2 and the lower lip region B3 (step S15).
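Steps S14 and S15 can be illustrated with a short sketch. The text only says that a "predetermined calculation" is used, so everything specific here is an assumption: the color map is modeled as a normalized hue/saturation histogram, the upper and lower lip pixels are pooled into one list, and the degree of difference is taken as one minus the histogram intersection. The names `hue_sat_histogram` and `first_difference` are hypothetical.

```python
import colorsys


def hue_sat_histogram(pixels, h_bins=8, s_bins=4):
    """Normalized hue/saturation histogram of (r, g, b) pixels in 0..255."""
    hist = [0.0] * (h_bins * s_bins)
    for r, g, b in pixels:
        h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hi = min(int(h * h_bins), h_bins - 1)  # clamp h == 1.0 into the last bin
        si = min(int(s * s_bins), s_bins - 1)  # clamp s == 1.0 into the last bin
        hist[hi * s_bins + si] += 1.0
    total = sum(hist)
    return [x / total for x in hist] if total else hist


def first_difference(center_pixels, lip_pixels):
    """Assumed metric: 1 - histogram intersection.

    0.0 means identical color distributions, 1.0 means no overlap at all.
    """
    hc = hue_sat_histogram(center_pixels)
    hl = hue_sat_histogram(lip_pixels)
    return 1.0 - sum(min(a, b) for a, b in zip(hc, hl))
```

With this metric, a central region showing pale teeth and lip regions showing red lips fall into different histogram bins and yield a value near 1.0, while a closed mouth, where all three regions show lip color, yields a value near 0.0.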

Subsequently, the open/closed state specifying unit 105f determines whether the degree of difference between the color maps of the central region B1 of the mouth P and of the upper lip region B2 and the lower lip region B3 (the first degree of difference), detected by the first difference detection unit 105d, is greater than a predetermined value (step S16).
If the first degree of difference is determined to be greater than the predetermined value (step S16; YES), the mouth P of the audience O is considered to be open, so the re-imaging request unit 106a outputs, to the speaker 4 via the third I/F 109, a control signal for causing the speaker 4 to output voice guidance prompting re-imaging of the face of the audience O. In response to the input control signal, the speaker 4 emits the voice guidance prompting re-imaging of the face of the audience O (step S17).
The process then returns to step S11, where the imaging device 2 images the audience O again at a predetermined timing in accordance with the voice guidance emitted from the speaker 4 to generate an image including a face, and the image acquisition unit 105a acquires a copy of the image data of the captured image as a processing target image (step S11).

Instead of having the imaging device 2 image the face of the audience O again, the audience O may be imaged continuously in advance to generate a plurality of images including the face, and the image acquisition unit 105a may acquire a new processing target image from the images already generated.

The processes from step S12 onward are repeated until it is determined in step S16 that the first degree of difference is not greater than the predetermined value (step S16; NO).
If it is determined in step S16 that the first degree of difference is not greater than the predetermined value (step S16; NO), the second difference detection unit 105e performs a predetermined edge detection process on the image data in the central region B1, the upper lip region B2, and the lower lip region B3. It then performs a predetermined calculation based on the edge detection result for the central region B1 to calculate a center evaluation value for edges in the vertical direction, and performs a predetermined calculation based on the edge detection results for the upper lip region B2 and the lower lip region B3 to calculate a lip evaluation value for edges in the vertical direction (step S18). The second difference detection unit 105e then detects, as a second degree of difference, the degree of difference between the center evaluation value for the edges of the central region B1 and the lip evaluation value for the edges of the upper lip region B2 and the lower lip region B3 (step S19).
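Steps S18 and S19 can be sketched in the same spirit. The edge detection process and the calculations are left unspecified in the text, so this example makes several assumptions: an "edge in the vertical direction" is read as an intensity change between vertically adjacent pixels, the evaluation value is the mean absolute difference, the lip evaluation value is the average over the two lip regions, and the second degree of difference is the absolute difference of the two values. The function names are hypothetical.

```python
def vertical_edge_score(gray):
    """Mean absolute difference between vertically adjacent pixels
    of a 2-D grayscale region (list of rows of intensities)."""
    total, count = 0, 0
    for upper, lower in zip(gray, gray[1:]):
        for a, b in zip(upper, lower):
            total += abs(a - b)
            count += 1
    return total / count if count else 0.0


def second_difference(center, upper_lip, lower_lip):
    """Assumed metric: |center evaluation value - lip evaluation value|."""
    center_score = vertical_edge_score(center)
    lip_score = (vertical_edge_score(upper_lip) + vertical_edge_score(lower_lip)) / 2
    return abs(center_score - lip_score)
```

A central region containing the boundary between lips and teeth produces strong edges, whereas smooth lip regions do not, so the value grows when the mouth is open.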

Subsequently, the open/closed state specifying unit 105f determines whether the degree of difference between the edge detection evaluation values of the central region B1 and of the upper lip region B2 and the lower lip region B3 (the second degree of difference), detected by the second difference detection unit 105e, is greater than a predetermined value (step S20).
If the second degree of difference is determined to be greater than the predetermined value (step S20; YES), the mouth P of the audience O is considered to be open, so the process moves to step S17. In substantially the same manner as described above, the re-imaging request unit 106a outputs, to the speaker 4 via the third I/F 109, a control signal for causing the speaker 4 to output voice guidance prompting re-imaging of the face of the audience O. In response to the input control signal, the speaker 4 emits the voice guidance prompting re-imaging of the face of the audience O (step S17).

The process then returns to step S11, and the processes from step S11 onward are repeated until it is determined in step S20 that the second degree of difference is not greater than the predetermined value (step S20; NO).
If it is determined in step S20 that the second degree of difference is not greater than the predetermined value (step S20; NO), the image deformation unit 105g sets a plurality of control points at predetermined positions of the detected mouth P and registers them as the control points to be used in the deformation process (step S21).
This completes the face image specifying process.
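The control flow of steps S11 to S21 can be summarized as a retry loop. This is a sketch only: every callable passed in is a hypothetical stand-in for the corresponding unit, and the `max_attempts` limit is an assumption added so the example terminates, since the flowchart as described simply repeats until a closed mouth is found.

```python
def specify_face_image(capture, detect_mouth_regions, first_difference,
                       second_difference, request_recapture,
                       color_threshold, edge_threshold, max_attempts=5):
    """Return the mouth regions of the first capture judged closed, or None."""
    for _ in range(max_attempts):
        image = capture()                                # S11
        regions = detect_mouth_regions(image)            # S12, S13
        if first_difference(regions) > color_threshold:  # S14 to S16
            request_recapture()                          # S17: mouth looks open
            continue
        if second_difference(regions) > edge_threshold:  # S18 to S20
            request_recapture()                          # S17: mouth looks open
            continue
        return regions                                   # closed: go on to S21
    return None
```

The color check runs first and the edge check only runs when the color check passes, matching the order in which the two degrees of difference are evaluated above.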

Returning to FIG. 4, the image processing unit 105 reads and acquires the video data 104a of the designated video content from the recording unit 104 (step S5). Here, the desired video content is designated in advance by the CPU of the central control unit 101 based on a predetermined operation of the operation input unit 102 by the user. Alternatively, for example, a plurality of video contents selectable by the audience O may be prepared so that the audience O selects and designates the desired video content.

Next, the image deformation unit 105g generates image data of an image in which the mouth P has been deformed by the deformation process, and the image processing unit 105 outputs the image data generated by the image deformation unit 105g, together with the video data 104a of the video content, to the projection device 3 via the first I/F 108 (step S6).
Then, based on the video data 104a of the video content and the image data generated by the image deformation unit 105g, both input via the first I/F 108, the projection device 3 projects onto the screen 5 video content in which the mouth P composited into the face area of the projection target deforms (lip-syncs) in accordance with the content (step S7).
At this time, the image deformation unit 105g may also generate image data of an image in which facial components other than the mouth P have been deformed by the deformation process, and the projection device 3 may, based on that image data, project onto the screen 5 video content in which the facial components other than the mouth P deform within the face area of the projection target.

Next, in the image processing apparatus 1, the CPU of the central control unit 101 determines whether the projection of the video content by the projection device 3 has ended (step S8).
If it is determined that the projection of the video content has not ended (step S8; NO), the CPU of the central control unit 101 repeats this determination at predetermined time intervals.
If, on the other hand, it is determined that the projection of the video content has ended (step S8; YES), the CPU of the central control unit 101 ends the projection processing.

As described above, the projection system 100 of the present embodiment detects the degree of difference between the color information, in a predetermined color space, of the central region of the mouth P detected from the face image (central region B1) and that of the peripheral regions (upper lip region B2 and lower lip region B3), and specifies the open/closed state of the mouth P based on the detection result. The open/closed state of the mouth P can therefore be specified properly using the color information of the central and peripheral regions in the predetermined color space. That is, when the color information in the predetermined color space differs between the central region and the peripheral regions of the mouth P, the teeth are likely visible between the upper and lower lips, meaning that the mouth P is open. Accordingly, the mouth P is specified as open when the degree of difference between the color information of the central region and that of the peripheral regions is large, so the open/closed state of the mouth P can be specified properly.
As a result, the deformation process on the mouth P can be performed using an image of the mouth P in the closed state, so that, for example, lip sync looks more natural when the image of the mouth P subjected to the deformation process is projected onto the screen 5.

In addition, the degree of difference between the edge detection evaluation values of the central region of the mouth P (central region B1) and of the regions adjacent to it in a predetermined direction (upper lip region B2 and lower lip region B3) is detected, and the open/closed state of the mouth P is specified based on the detection result. The open/closed state of the mouth P can therefore be specified more properly by using the edge detection evaluation values of the central region and its adjacent regions in addition to the color information of the mouth P. That is, when the edge detection evaluation values differ between the central region of the mouth P and the regions adjacent to it, the teeth are likely visible between the upper and lower lips, meaning that the mouth P is open. Accordingly, the mouth P is specified as open when the degree of difference between the edge detection evaluation values of the central region and its adjacent regions is large, so the open/closed state of the mouth P can be specified more properly.

Furthermore, when the mouth P is specified as open, acquisition of a new face image is requested, so that a new face image to be processed can be acquired in response to the request. This prevents the deformation process from being performed on an image of the mouth P in the open state and, as a result, prevents lip sync from looking unnatural when the image of the mouth P subjected to the deformation process is projected onto the screen 5.

The present invention is not limited to the above embodiment, and various improvements and design changes may be made without departing from the spirit of the present invention.
For example, the above embodiment illustrates the projection system 100 that projects video onto the screen 5; however, this is only an example, and the system may consist of the image processing apparatus 1 alone, without the projection device 3.

In the above embodiment, the image processing apparatus 1 specifies the open/closed state of the mouth P according to the degree of difference between the edge detection evaluation values of the central region B1 and of the upper lip region B2 and the lower lip region B3 (the second degree of difference); however, this is only an example, and the second difference detection unit 105e that detects the second degree of difference is not necessarily required.
Likewise, the image processing apparatus 1 performs the deformation process on the image of the mouth P in the closed state; however, this is only an example, and the image deformation unit 105g that deforms the image of the mouth P is not necessarily required.

In addition, in the above embodiment, the functions of the acquisition means, the first detection means, the second detection means, and the specifying means are realized by driving the image acquisition unit 105a, the mouth detection unit 105c, the first difference detection unit 105d, and the open/closed state specifying unit 105f under the control of the central control unit 101 of the image processing apparatus 1. The configuration is not limited to this, however, and these functions may be realized by the CPU of the central control unit 101 executing a predetermined program or the like.
That is, a program including an acquisition processing routine, a first detection processing routine, a second detection processing routine, and a specifying processing routine is stored in a program memory. The acquisition processing routine may cause the CPU of the central control unit 101 to function as means for acquiring a face image. The first detection processing routine may cause the CPU of the central control unit 101 to function as means for detecting the mouth P from the acquired face image. The second detection processing routine may cause the CPU of the central control unit 101 to function as means for detecting the degree of difference between the color information, in a predetermined color space, of the central region and the peripheral regions of the detected mouth P. The specifying processing routine may cause the CPU of the central control unit 101 to function as means for specifying the open/closed state of the mouth P based on the detection result of the degree of difference between the color information.

Similarly, the erasing means, the third detection means, the request means, and the deformation means may also be realized by the CPU of the central control unit 101 executing a predetermined program or the like.

Furthermore, as a computer-readable medium storing the program for executing each of the above processes, a nonvolatile memory such as a flash memory or a portable recording medium such as a CD-ROM can be used in addition to a ROM, a hard disk, or the like. A carrier wave may also be used as a medium for providing the program data via a predetermined communication line.

Although several embodiments of the present invention have been described, the scope of the present invention is not limited to the embodiments described above and includes the scope of the invention described in the claims and its equivalents.
The invention described in the scope of claims attached to the application of this application will be added below. The item numbers of the claims described in the appendix are as set forth in the claims attached to the application of this application.
[Appendix]
<Claim 1>
Acquisition means for acquiring a face image;
First detection means for detecting a mouth from the face image acquired by the acquisition means;
Second detection means for detecting a degree of difference between color information in a predetermined color space in the central area and the peripheral area of the mouth detected by the first detection means;
Based on the detection result by the second detection means, specifying means for specifying the open / closed state of the mouth,
A face image processing apparatus comprising:
<Claim 2>
The specifying means further includes:
The face image processing apparatus according to claim 1, wherein when the degree of difference detected by the second detection unit is large, the face image processing device is identified as having an open mouth.
<Claim 3>
A third detection means for detecting the difference between the evaluation value of the edge detection of the area on the center side of the mouth detected by the first detection means and the area adjacent to the area in a predetermined direction;
The specifying means is:
The face image processing apparatus according to claim 1, wherein the opening / closing state of the mouth is specified based on a detection result of the third detection unit.
<Claim 4>
The specifying means further includes:
The face image processing apparatus according to claim 3, wherein when the degree of difference detected by the third detection unit is large, the face image processing device is identified as having an open mouth.
<Claim 5>
When it is specified by the specifying means that the mouth is in an open state, it further comprises request means for requesting acquisition of a new face image,
The acquisition means includes
The face image processing apparatus according to claim 1, wherein a face image is newly acquired in response to a request from the request unit.
<Claim 6>
A projection system comprising: the face image processing apparatus according to claim 1; and a projection apparatus that projects a face image on a screen.
The face image processing device includes:
Deformation means for performing deformation processing on the mouth image identified as being closed by the identification means,
The projector is
A projection system, wherein an image of a mouth subjected to deformation processing by the deformation means is projected onto the screen.
<Claim 7>
An image processing method using a face image processing apparatus, the method comprising:
a process of acquiring a face image;
a process of detecting a mouth from the acquired face image;
a process of detecting a degree of difference between color information, in a predetermined color space, of a center-side area and a peripheral-side area of the detected mouth; and
a process of specifying the open / closed state of the mouth based on the detection result of the degree of difference between the color information.
<Claim 8>
A program causing a computer of a face image processing apparatus to function as:
acquisition means for acquiring a face image;
first detection means for detecting a mouth from the face image acquired by the acquisition means;
second detection means for detecting a degree of difference between color information, in a predetermined color space, of a center-side area and a peripheral-side area of the mouth detected by the first detection means; and
specifying means for specifying the open / closed state of the mouth based on the detection result by the second detection means.
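
The color-based determination recited in the method and program claims can be illustrated with a minimal sketch. It is not the implementation of the embodiment: the choice of HSV as the "predetermined color space", the use of only saturation and value (to avoid hue wrap-around), the fixed central rectangle, the threshold of 0.25, and the names `mean_sv`, `color_difference` and `mouth_is_open` are all assumptions made for illustration. The mouth region is assumed to have been detected and cropped already.

```python
import colorsys
from statistics import mean


def mean_sv(pixels):
    """Mean saturation and value of a list of (R, G, B) pixels in 0-255."""
    sv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1:] for r, g, b in pixels]
    return mean(s for s, _ in sv), mean(v for _, v in sv)


def color_difference(mouth, center_frac=0.5):
    """Degree of difference between the center-side and peripheral-side areas.

    mouth: 2D list (rows) of RGB tuples covering the detected mouth region.
    center_frac: assumed fraction of width and height treated as the center.
    Both areas must be non-empty, otherwise statistics.mean raises an error.
    """
    h, w = len(mouth), len(mouth[0])
    y0, y1 = int(h * (1 - center_frac) / 2), int(h * (1 + center_frac) / 2)
    x0, x1 = int(w * (1 - center_frac) / 2), int(w * (1 + center_frac) / 2)
    center, periphery = [], []
    for y, row in enumerate(mouth):
        for x, px in enumerate(row):
            inside = y0 <= y < y1 and x0 <= x < x1
            (center if inside else periphery).append(px)
    cs, cv = mean_sv(center)
    ps, pv = mean_sv(periphery)
    # Euclidean distance between the mean (S, V) vectors of the two areas.
    return ((cs - ps) ** 2 + (cv - pv) ** 2) ** 0.5


def mouth_is_open(mouth, threshold=0.25):
    """A large degree of difference is taken to mean the mouth is open."""
    return color_difference(mouth) > threshold
```

With a closed mouth, the center and periphery are both lip-colored and the difference stays small; with an open mouth, the dark oral cavity or the teeth in the center differ strongly from the surrounding lips, so the difference exceeds the threshold. In practice the threshold would need tuning against real images.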

DESCRIPTION OF SYMBOLS
100 Projection system
1 Image processing apparatus
101 Central control unit
105 Image processing unit
105a Image acquisition unit
105c Mouth detection unit
105d First degree-of-difference detection unit
105e Second degree-of-difference detection unit
105f Open / closed state specifying unit
105g Image deformation unit
106 Imaging control unit
106a Re-imaging request unit
2 Imaging apparatus
3 Projection apparatus

Claims

A face image processing apparatus comprising:
acquisition means for acquiring a face image;
first detection means for detecting a mouth from the face image acquired by the acquisition means;
second detection means for detecting a degree of difference between color information, in a predetermined color space, of a center-side area and a peripheral-side area of the mouth detected by the first detection means; and
specifying means for specifying the open / closed state of the mouth based on the detection result by the second detection means.

The face image processing apparatus according to claim 1, wherein the specifying means further specifies that the mouth is in an open state when the degree of difference detected by the second detection means is large.

The face image processing apparatus according to claim 1, further comprising third detection means for detecting a degree of difference between edge-detection evaluation values of the center-side area of the mouth detected by the first detection means and of an area adjacent to that area in a predetermined direction,
wherein the specifying means specifies the open / closed state of the mouth based on a detection result of the third detection means.

The face image processing apparatus according to claim 3, wherein the specifying means further specifies that the mouth is in an open state when the degree of difference detected by the third detection means is large.

The face image processing apparatus according to claim 1, further comprising request means for requesting acquisition of a new face image when the specifying means specifies that the mouth is in an open state,
wherein the acquisition means newly acquires a face image in response to the request from the request means.

A projection system comprising: the face image processing apparatus according to claim 1; and a projection apparatus that projects a face image onto a screen,
wherein the face image processing apparatus includes deformation means for performing deformation processing on an image of the mouth specified as being in a closed state by the specifying means, and
the projection apparatus projects the image of the mouth subjected to the deformation processing by the deformation means onto the screen.

An image processing method using a face image processing apparatus, the method comprising:
a process of acquiring a face image;
a process of detecting a mouth from the acquired face image;
a process of detecting a degree of difference between color information, in a predetermined color space, of a center-side area and a peripheral-side area of the detected mouth; and
a process of specifying the open / closed state of the mouth based on the detection result of the degree of difference between the color information.

A program causing a computer of a face image processing apparatus to function as:
acquisition means for acquiring a face image;
first detection means for detecting a mouth from the face image acquired by the acquisition means;
second detection means for detecting a degree of difference between color information, in a predetermined color space, of a center-side area and a peripheral-side area of the mouth detected by the first detection means; and
specifying means for specifying the open / closed state of the mouth based on the detection result by the second detection means.
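
The edge-based check performed by the third detection means (comparing edge-detection evaluation values of the center-side area of the mouth with those of an adjacent area) can be sketched in a similar way. This is an illustration under stated assumptions, not the embodiment's method: the evaluation value is taken here to be the mean absolute vertical luminance gradient, the "predetermined direction" is assumed to be horizontal, and the names `edge_score`, `edge_difference`, `edges_suggest_open` and the threshold of 40 are hypothetical.

```python
def edge_score(gray, y0, y1, x0, x1):
    """Mean absolute vertical gradient over rows y0..y1-1 and columns x0..x1-1.

    gray: 2D list (rows) of luminance values, e.g. 0-255.
    Returns 0.0 when the window is too small to contain a row pair.
    """
    total, count = 0, 0
    for y in range(y0, y1 - 1):
        for x in range(x0, x1):
            total += abs(gray[y + 1][x] - gray[y][x])
            count += 1
    return total / count if count else 0.0


def edge_difference(gray, center, shift):
    """Difference in edge score between the center area and an adjacent area.

    center: (y0, y1, x0, x1) window on the center side of the mouth.
    shift: horizontal offset in columns to the adjacent window (assumed
    direction); the caller must keep the shifted window inside the image.
    """
    y0, y1, x0, x1 = center
    center_score = edge_score(gray, y0, y1, x0, x1)
    adjacent_score = edge_score(gray, y0, y1, x0 + shift, x1 + shift)
    return abs(center_score - adjacent_score)


def edges_suggest_open(gray, center, shift, threshold=40.0):
    """A large difference in edge strength is taken to mean an open mouth."""
    return edge_difference(gray, center, shift) > threshold
```

The intuition is that teeth and the boundary of the oral cavity produce strong edges in the center of an open mouth, while the adjacent lip area remains comparatively smooth; a closed mouth yields similar edge strength in both windows.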

A face image processing apparatus comprising:
  acquisition means for acquiring a face image;
  first detection means for detecting a mouth from the face image acquired by the acquisition means;
  second detection means for detecting a degree of difference between color information, in a predetermined color space, of a center-side area and a peripheral-side area of the mouth detected by the first detection means; and
  specifying means for specifying that the mouth is in an open state when the degree of difference detected by the second detection means is large.