JP2011228913A - Image processing apparatus, reply image generation system and program

Info

Publication number: JP2011228913A
Application number: JP2010096493A
Authority: JP
Inventors: Wakana Odagiri; わか菜小田切
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2010-04-19
Filing date: 2010-04-19
Publication date: 2011-11-10
Anticipated expiration: 2030-04-19
Also published as: JP5447134B2

Abstract

PROBLEM TO BE SOLVED: To provide an image processing apparatus, a reply image generation system, and a program that process a predetermined image into an interesting image. SOLUTION: The image processing apparatus comprises a first image acquisition part that acquires a first image, and an image composition part that acquires a second image and combines it with the first image acquired by the first image acquisition part. The image composition part detects a first region satisfying a predetermined criterion in the first image, superimposes the second image on a region in the vicinity of the detected first region, and generates a third image in which the first image and the second image are combined.

Description

The present invention relates to an image processing apparatus, a reply image generation system, and a program.

For example, Patent Document 1 discloses a display control method for displaying, on a display device, image data transmitted from a communication terminal. In this method, the current situation of the terminal's user, estimated from position information of the communication terminal that transmitted the image data, is output to the display device together with the image data.

Patent Document 2 discloses a display control method for displaying, on a DPF (digital photo frame), image data transmitted from a mobile phone or a personal computer. In this method, the display method is set based on received setting information without requiring the DPF user to perform a setting operation, and digital photo data that becomes insufficient as a result of the setting change is automatically replenished.

JP 2009-181210 A
JP 2009-151369 A

The techniques described in Patent Document 1 and Patent Document 2 cannot process a predetermined image into an interesting image.

The present invention has been made in view of the above, and its object is to provide an image processing apparatus, a reply image generation system, and a program that process a predetermined image into an interesting image.

An image processing apparatus according to a first aspect of the present invention includes:
first image acquisition means for acquiring a first image; and
image composition means for acquiring a second image and combining the acquired second image with the first image acquired by the first image acquisition means,
wherein the image composition means detects a first region satisfying a predetermined criterion in the first image acquired by the first image acquisition means, superimposes the second image on a region near the detected first region, and generates a third image in which the first image and the second image are combined.
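The composition step described in the first aspect can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the source: images are grayscale nested lists, the "predetermined criterion" is supplied by a hypothetical `detect_first_region` callable that returns a bounding box, "near" is taken to mean directly above the region, and the second image is assumed to fit inside the first.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]           # grayscale pixels, row-major (assumed format)
Box = Tuple[int, int, int, int]   # (left, top, width, height)


def compose_third_image(first: Image, second: Image,
                        detect_first_region: Callable[[Image], Box]) -> Image:
    """Overlay `second` next to the region found in `first`; return a new image."""
    left, top, width, height = detect_first_region(first)
    rows, cols = len(first), len(first[0])
    s_rows, s_cols = len(second), len(second[0])

    # Assumed placement rule: centre the overlay horizontally on the region
    # and put it directly above it. Clamping keeps the overlay inside the
    # first image, so near an edge it may end up covering part of the region.
    x0 = max(0, min(left + (width - s_cols) // 2, cols - s_cols))
    y0 = max(0, min(top - s_rows, rows - s_rows))

    third = [row[:] for row in first]        # leave the first image untouched
    for dy in range(s_rows):
        for dx in range(s_cols):
            third[y0 + dy][x0 + dx] = second[dy][dx]
    return third
```

For example, with a 6x6 blank first image, a 1x2 second image, and a detector that reports a 2x2 region at column 2, row 3, the overlay is written into row 2, columns 2 and 3, while the first image itself is left unchanged.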

A reply image generation system according to a second aspect of the present invention is
a reply image generation system that generates, based on a first image transmitted from a first terminal to a second terminal, a reply image for the first image, and includes:
image generation means for acquiring a captured image obtained by photographing the area in front of the display surface of the second terminal displaying the first image, extracting a predetermined figure based on the acquired captured image, and generating a second image representing the extracted figure; and
image composition means for combining the first image with the second image generated by the image generation means,
wherein the image composition means detects a first region satisfying a predetermined criterion in the first image, superimposes the second image on a region near the detected first region, and generates a third image in which the first image and the second image are combined.

A program according to a third aspect of the present invention causes
a computer to function as:
first image acquisition means for acquiring a first image; and
image composition means for acquiring a second image and combining the acquired second image with the first image acquired by the first image acquisition means,
wherein the image composition means detects a first region satisfying a predetermined criterion in the first image acquired by the first image acquisition means, superimposes the second image on a region near the detected first region, and generates a third image in which the first image and the second image are combined.

According to the image processing apparatus, the reply image generation system, and the program of the present invention, a predetermined image can be processed into an interesting image.

Block diagram showing a configuration example of the reply image generation system according to one embodiment of the present invention.
Diagram showing the configuration of the first terminal according to the embodiment.
Diagram showing the configuration of the first terminal when a CPU and the like are used.
Diagram showing the configuration of the second terminal according to the embodiment.
Diagram showing the configuration of the second terminal when a CPU and the like are used.
Diagram showing the configuration of the server according to the embodiment.
Diagram showing the configuration of the server when a CPU and the like are used.
Flowchart of the first process of the image transmission/reception processing performed by the reply image generation system according to the embodiment.
Diagram explaining the data structure of the image transmission/reception data table in the embodiment.
Flowchart of the second process of the image transmission/reception processing performed by the reply image generation system according to the embodiment.
Diagram showing an example of the second image.
Flowchart of the third image generation process in the image transmission/reception processing C performed by the server according to the embodiment.
(a) is a diagram for explaining the reference face in the first image. (b) is a diagram for explaining the reference face in the first image and the face regions on which the second image should not be superimposed.
Flowchart of the process that determines the position and size of the second image in the third image generation process performed by the server according to the embodiment.
Diagram showing, for the "stroking" gesture, the relationship between the collection area and the reference face in the first image, the placement of the collection area, and the like; (a) shows the case where the reference face is large, and (b) the case where it is small.
Diagram showing, for the "pointing a finger" gesture, the relationship between the collection area and the reference face in the first image, the placement of the collection area, and the like; (a) shows the case where the reference face is large, and (b) the case where it is small.
Diagram showing, for the "applause" or "waving" (waving with both hands) gesture, the relationship between the collection area and the reference face in the first image, the placement of the collection area, and the like; (a) shows the case where the reference face is large, and (b) the case where it is small.
Diagram showing, for the "enveloping" gesture, the relationship between the collection area and the reference face in the first image, the placement of the collection area, and the like; (a) shows the case where the reference face is large, and (b) the case where it is small.
Diagram for explaining generation of the third image in the reply image generation system according to the embodiment.

An embodiment of the present invention will be described with reference to the drawings. The present invention is not limited by the following description or by the drawings. Changes, including deletion of components, can of course be made to the contents of the following description and drawings. In the following description, explanations of well-known technical matters are omitted as appropriate to make the present invention easier to understand.

First, the configuration of the reply image generation system 1 according to this embodiment will be described with reference to FIG. 1. The reply image generation system 1 includes a first terminal 100, a second terminal 200, and a server 300. The reply image generation system 1 is also a system in which image data is transmitted and received between the first terminal 100 and the second terminal 200.

The first terminal 100, the second terminal 200, and the server 300 are connected to a network 900 and can communicate with one another. In this embodiment, the reply image generation system 1 includes one each of the first terminal 100, the second terminal 200, and the server 300, but it may include a plurality of at least one of them. The network 900 is, for example, the Internet.

In this embodiment, the first terminal 100 and the second terminal 200 are digital photo frames. They may instead be other devices such as personal computers or mobile phones. The first terminal 100 and the second terminal 200 exchange data such as image data via the server 300.

The server 300 relays data such as image data transmitted and received by the first terminal 100 or the second terminal 200. The server 300 is configured by a predetermined computer.
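The relay role of the server 300 can be pictured with a small Python sketch. The source only says that the server relays data between the terminals; the store-and-forward queue, the class name `RelayServer`, and the string terminal identifiers below are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Optional, Tuple


class RelayServer:
    """Hypothetical relay: holds data for a terminal until it is collected."""

    def __init__(self) -> None:
        # One FIFO queue of (sender, payload) pairs per destination terminal.
        self._queues: Dict[str, Deque[Tuple[str, bytes]]] = defaultdict(deque)

    def submit(self, sender: str, destination: str, payload: bytes) -> None:
        """Accept data from a sender and queue it for the destination."""
        self._queues[destination].append((sender, payload))

    def collect(self, terminal: str) -> Optional[Tuple[str, bytes]]:
        """Hand over the oldest pending item for `terminal`, or None if empty."""
        queue = self._queues[terminal]
        return queue.popleft() if queue else None
```

In this sketch the first terminal would call `submit("terminal-100", "terminal-200", image_bytes)` and the second terminal would later call `collect("terminal-200")`. A push-style design, in which the server forwards data immediately, would fit the description equally well.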

The first terminal 100, the second terminal 200, and the server 300 process image data, and the data format of the image data may be changed as appropriate during processing.

Next, the configuration of the first terminal 100 will be described with reference to FIGS. 2 and 3.

The first terminal 100 includes a control unit 110, an internal storage unit 120, an operation unit 130, a display unit 140, a communication unit 150, and an I/F unit 160.

The operation unit 130 is configured by, for example, an input device such as a touch sensor. Under the control of the control unit 110, the operation unit 130 outputs an operation signal corresponding to an operation input from the user and supplies it to the control unit 110. The control unit 110 performs processing according to the supplied operation signal. In this way, the control unit 110 performs processing that corresponds to what the user did on the operation unit 130.

The display unit 140 is configured by, for example, a display panel and a display controller. Data such as image data is supplied from the control unit 110 to the display controller. The display controller drives the display panel based on the supplied image data and causes the display panel to display an image. In this way, the control unit 110 displays a desired image on the display unit 140.

The operation unit 130 and the display unit 140 may be configured by a touch panel or the like. In that case, the input device built into the touch panel constitutes the operation unit 130.

The communication unit 150 is configured by an appropriate communication device such as a modem. The communication unit 150 is connected to the network 900 by wire or wirelessly. The control unit 110 transmits and receives data via the communication unit 150 and the network 900.

The I/F unit 160 is an interface (I/F) to the outside. The I/F unit 160 is configured by, for example, a read/write device including a media controller or the like. The control unit 110 reads data from and writes data to the storage medium 190 via the I/F unit 160.

The storage medium 190 is configured by, for example, a memory card including a flash memory or the like, and is attached to the first terminal 100. The storage medium 190 stores image data and the like.

The control unit 110 includes an image transmission unit 110a, a display control unit 110b, and an image reception unit 110c. The control unit 110 (that is, each of the image transmission unit 110a, the display control unit 110b, and the image reception unit 110c) is composed of, for example, a CPU (Central Processing Unit) 111 and a RAM (Random Access Memory) 112, and performs image transmission/reception processing A, described later. The internal storage unit 120 is composed of, for example, a flash memory 121 and stores data used by the control unit 110 as appropriate.

The CPU 111 controls the entire first terminal 100 in accordance with a program recorded in the flash memory 121, using data and the like also recorded there, and performs the image transmission/reception processing A of the control unit 110, described later. The programs, data, and the like recorded in the flash memory 121 may first be read out into the RAM 112.

At least part of the processing performed by the CPU 111 may be executed by various dedicated circuits. In other words, at least part of the control unit 110 may be configured by dedicated circuits.

The RAM 112 stores data used by the CPU 111, data generated by the CPU 111, data supplied to the CPU 111, and the like.

The flash memory 121 stores programs and various data used by the CPU 111. The internal storage unit 120 may include other storage devices such as a ROM (Read Only Memory) or a hard disk, and these other storage devices may store the programs and the various data used by the CPU 111.

The internal storage unit 120 and the storage medium 190 together constitute a storage unit that stores data used by the control unit 110; the control unit 110 acquires data from this storage unit and performs predetermined processing. The storage unit may be anything that the first terminal 100 can read from and write to, and may be located outside the first terminal 100.

Next, the configuration of the second terminal 200 will be described with reference to FIGS. 4 and 5.

The second terminal 200 has the configuration of the first terminal 100 with an imaging unit 270 added. A storage medium 290 is attached to the second terminal 200; the storage medium 290 can be of the same kind as the storage medium 190.

The second terminal 200 includes a control unit 210, an internal storage unit 220, an operation unit 230, a display unit 240, a communication unit 250, an I/F unit 260, and the imaging unit 270.

The control unit 210 corresponds to the control unit 110. The control unit 210 includes an image transmission unit 210a, an image generation unit 210b, an image reception unit 210c, a display control unit 210d, and an imaging control unit 210e, and performs image transmission/reception processing B, described later. The control unit 210 is composed of, for example, a CPU 211 and a RAM 212, and the CPU 211 performs the image transmission/reception processing B in accordance with a program recorded in the flash memory 221. The rest of the description of the control unit 210 is the same as that of the control unit 110 and is therefore omitted.

The imaging unit 270 includes an imaging element, an imaging lens group, an imaging drive unit, an AFE (Analog Front End), and a DSP (Digital Signal Processor). These operate under the control of the control unit 210 (imaging control unit 210e). The imaging unit 270 photographs the area in front of the second terminal 200 (the direction normal to the display screen of the display unit 240, toward the user).

The imaging element is configured by an image sensor such as a CCD (Charge Coupled Device) image sensor or a CMOS (Complementary Metal Oxide Semiconductor) image sensor. Under the control of the control unit 210, the imaging element captures the image formed by light entering through the imaging lens group, photoelectrically converts it pixel by pixel, and supplies a signal based on the per-pixel charge generated by the photoelectric conversion to the AFE as an imaging signal.

The imaging lens group consists of lenses such as a focus lens and a zoom lens. The imaging drive unit is a mechanism that drives the imaging lens group, controls the amount of light incident on the imaging element, and controls the time during which light strikes the imaging element (the time during which its light-receiving elements receive light). The imaging lens group and the imaging drive unit operate under the control of the control unit 210, and the imaging element captures images.

Under the control of the control unit 210, the AFE performs predetermined processing (hereinafter, processing A) on the imaging signal supplied from the imaging element, and generates and outputs digital data (original image data) representing the image captured by the imaging element. Processing A includes OB (Optical Black) clamp processing, CDS (Correlated Double Sampling) processing, AGC (Automatic Gain Control) processing, A/D (analog-to-digital) conversion, and the like.

The DSP performs various kinds of processing (hereinafter, processing B) on the original image data supplied from the AFE and generates image data in a format such as YUV (luminance and color difference). Processing B includes processing that improves image quality, such as contour enhancement, auto white balance, and auto iris. In this way, image data representing the image captured by the imaging element, that is, the captured image data photographed by the imaging unit 270, is generated.
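To make the "YUV (luminance and color difference)" output concrete, the following Python sketch converts one RGB pixel to YUV. The source does not say which conversion the DSP uses; the ITU-R BT.601 luma weights and the classic analog scaling factors 0.492 and 0.877 are assumed here purely as a common example, and the function name `rgb_to_yuv` is hypothetical.

```python
from typing import Tuple


def rgb_to_yuv(r: float, g: float, b: float) -> Tuple[float, float, float]:
    """Convert one RGB pixel to YUV using BT.601 weights (assumed, not from the source)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    u = 0.492 * (b - y)                     # blue color difference
    v = 0.877 * (r - y)                     # red color difference
    return y, u, v
```

With these weights any neutral gray pixel has U and V of approximately zero, which is why a luminance and color-difference format separates brightness from color information.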

With the configuration described above, the imaging unit 270, under the control of the control unit 210 (imaging control unit 210e), captures an image of the person, background, and so on in the imaging direction and generates an imaging signal representing that captured image. Based on the generated imaging signal it generates original image data representing the captured image, and based on the original image data it generates captured image data, which it supplies to the control unit 210. The imaging unit 270 also has a lamp or the like provided on the housing of the second terminal 200, and this lamp lights up under the control of the control unit 210.

The processing performed by the DSP may instead be performed by the control unit 210 (image generation unit 210b). In that case, the control unit 210 receives the original image data from the imaging unit 270 (AFE) and generates image data (in YUV format or the like) by applying processing equivalent to processing B to the received original image data. In this way, the control unit 210 acquires image data representing the image captured by the imaging unit 270.

The internal storage unit 220, the operation unit 230, the display unit 240, the communication unit 250, and the I/F unit 260 correspond respectively to the internal storage unit 120, the operation unit 130, the display unit 140, the communication unit 150, and the I/F unit 160 of the first terminal 100. Since each is substantially the same as its counterpart, detailed description is omitted.

The server 300 includes a control unit 310, an internal storage unit 320, an operation unit 330, a display unit 340, and a communication unit 350. These are substantially the same as the corresponding parts of the first terminal 100: the internal storage unit 320, the operation unit 330, the display unit 340, and the communication unit 350 correspond respectively to the internal storage unit 120, the operation unit 130, the display unit 140, and the communication unit 150 of the first terminal 100. The operation unit 330 and the display unit 340 are used for managing the server 300, and the server 300 need not include them.

The operation unit 330 may be configured by, for example, an input device such as a keyboard. The control unit 310 includes an image transmission unit 310a, a display control unit 310b, and an image reception unit 310c, and performs image transmission/reception processing C, described later. The control unit 310 includes, for example, a CPU 311 and a RAM 312, and the CPU 311 performs, in accordance with a program recorded on the hard disk 321, the image transmission/reception processing C, which is the processing carried out by the image transmission unit 310a, the display control unit 310b, and the image reception unit 310c. The internal storage unit 320 is composed of, for example, the hard disk 321 and stores data used by the control unit 310 as appropriate. The rest of the description is the same as that for the first terminal 100, so detailed description is omitted.

Next, the operation of the reply image generation system 1 (the operations of the first terminal 100, the second terminal 200, and the server 300) will be described. The reply image generation system 1 performs image transmission/reception processing.

First, the first process of the image transmission/reception processing (the process of transmitting a first image, described later, from the first terminal 100 to the second terminal 200) will be described with reference to FIG. 8. In this process, the first terminal 100, the second terminal 200, and the server 300 perform the first process of image transmission/reception processing A, B, and C, respectively. The first terminal 100 starts the first process of image transmission/reception processing A when, for example, the first user operates the operation unit 130 to request that the first process begin. The second terminal 200 and the server 300 are assumed to be waiting until they receive image data.

表示制御部１１０ｂは、画像送受信処理Ａの第１処理を開始すると、記憶媒体１９０に記録された画像データが表す画像の一覧（例えば、画像のサムネイルの一覧）の画像を表示部１４０に表示する（ステップＳ１０１）。これによって、表示制御部１１０ｂは、第１ユーザに対して、送信する画像を指定するように促す。ここで、第１ユーザは、第１の端末１００のユーザである。 When the first process of the image transmission / reception process A is started, the display control unit 110b displays on the display unit 140 images of a list of images (for example, a list of thumbnails of images) represented by the image data recorded in the storage medium 190. (Step S101). Accordingly, the display control unit 110b prompts the first user to specify an image to be transmitted. Here, the first user is a user of the first terminal 100.

次に、表示制御部１１０ｂは、画像が選択されたかを判別する（ステップＳ１０２）。第１ユーザが表示部１４０の一覧の画像を見ながら操作部１３０を操作して送信したい画像を一覧の中から選択すると、表示制御部１１０ｂには操作部１３０からこの選択に応じた操作信号が供給される。表示制御部１１０ｂは、この操作信号の供給によって画像が選択されたと判別する（ステップＳ１０２；ＹＥＳ）。このとき、表示制御部１１０ｂは、前記の操作信号によって特定される、第１ユーザに選択された画像を表す画像データをこれから送信する画像データとして特定する（ステップＳ１０３）。 Next, the display control unit 110b determines whether an image is selected (step S102). When the first user operates the operation unit 130 while viewing the list image on the display unit 140 and selects an image to be transmitted from the list, the display control unit 110b receives an operation signal corresponding to the selection from the operation unit 130. Supplied. The display control unit 110b determines that an image has been selected by supplying the operation signal (step S102; YES). At this time, the display control unit 110b specifies image data representing the image selected by the first user, which is specified by the operation signal, as image data to be transmitted (step S103).

If, on the other hand, the operation unit 130 does not supply the operation signal to the display control unit 110b, the display control unit 110b determines that no image has been selected (step S102; NO) and performs step S102 again. In this way, the display control unit 110b waits until an image is selected.

It is assumed that the first user selects, as the image to be transmitted, an image (here, a digital photograph) in which a human face appears. Hereinafter, this image and the image data representing it are referred to as the first image and the first image data, respectively.

Next, the display control unit 110b displays, on the display unit 140, a list of the transmission destinations specified by the address data recorded in the storage medium 190 (step S104). The display control unit 110b thereby prompts the first user to specify the transmission destination of the first image.

Next, the display control unit 110b determines whether a transmission destination has been selected (step S105). When the first user, viewing the list on the display unit 140, operates the operation unit 130 and selects a transmission destination from the list, the operation unit 130 supplies an operation signal corresponding to the selection to the display control unit 110b. On receiving this operation signal, the display control unit 110b determines that a transmission destination has been selected (step S105; YES). The display control unit 110b then specifies the address data identifying the transmission destination selected by the first user, as identified by the operation signal, as the address data of the destination to transmit to (step S106). The transmission destination may instead be entered directly by the user through the operation unit 130. Here, it is assumed that the second terminal 200 is designated as the transmission destination.

If, on the other hand, the operation unit 130 does not supply the operation signal to the display control unit 110b, the display control unit 110b determines that no transmission destination has been selected (step S105; NO) and performs step S105 again. In this way, the display control unit 110b waits until a transmission destination is selected.

Once the display control unit 110b has specified the destination address data and the image data to be transmitted (this is done, for example, by holding the image data in the RAM 112), it displays, on the display unit 140, an image that lets the user choose whether to issue a transmission instruction (for example, an image containing the text "Do you want to send?", a "YES" button, and a "NO" button) (step S107), so that the first user can choose whether to transmit the first image. The display control unit 110b then determines whether a transmission instruction has been issued (step S108).

The first user looks at the image displayed on the display unit 140 and operates the operation unit 130 to choose whether to issue a transmission instruction. The operation unit 130 supplies an operation signal corresponding to this choice to the display control unit 110b.

When the first user indicates through the operation unit 130 that no transmission instruction is to be issued, an operation signal corresponding to this operation (for example, an operation signal indicating that the "NO" button was selected) is supplied to the display control unit 110b. On receiving this operation signal, the display control unit 110b determines that no transmission instruction has been issued (step S108; NO) and performs step S101 again. The first user thus selects the image and so on once more.

When the first user indicates through the operation unit 130 that a transmission instruction is to be issued, an operation signal corresponding to this operation (for example, an operation signal indicating that the "YES" button was selected) is supplied to the display control unit 110b. On receiving this operation signal, the display control unit 110b determines that a transmission instruction has been issued (step S108; YES).

When the display control unit 110b determines that a transmission instruction has been issued (step S108; YES), the image transmission unit 110a transmits the image data specified by the display control unit 110b to the server 300 via the communication unit 150 and the network 900, together with the address data specified by the display control unit 110b (hereinafter, the destination address data) and address data indicating its own address (step S109). The latter is address data indicating the address of the first terminal 100; it is assumed to be stored in advance in the internal storage unit 120 and is hereinafter referred to as the source address data.

The image receiving unit 310c of the server 300 receives, via the communication unit 350, the image data, the source address data, and the destination address data transmitted from the first terminal 100 (step S110). The image receiving unit 310c records the received image data in the internal storage unit 320 (step S111). At this time, the image receiving unit 310c assigns an image ID (information for identifying the image data, for example a serial number) to the image data and records the received image data, destination address data, and source address data in the internal storage unit 320, each in association with data indicating the image ID. The image data is associated with the image ID through data that specifies where the image data is recorded (its recording location in the internal storage unit 320). That is, data indicating the recording location of the image data is recorded in association with the image ID. The image data may instead be recorded in direct association with the image ID.

FIG. 9 shows an example of the data structure of the image ID data, the data indicating the recording location of the image data, the destination address data, and the source address data recorded in the internal storage unit 320. The internal storage unit 320 stores a data table containing these data (hereinafter, the image transmission/reception data table), and FIG. 9 shows the data structure of this table. The table contains mutually associated data indicating each of the following: an "image ID" that identifies the image data, an "image data recording location" that indicates where the image data is recorded, a "destination address" that identifies the terminal to which the image is to be transmitted (here, the second terminal 200 or the like), and a "source address" that identifies the terminal that transmitted the image (here, the first terminal 100 or the like).
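The table in FIG. 9 can be pictured as a small keyed record store. The following is only a minimal sketch under stated assumptions: the class and field names, the in-memory dictionary, and the example path and addresses are all hypothetical and are not taken from the specification, which says only that the four items are recorded in association with one another and that the image ID may be a serial number.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class ImageRecord:
    image_id: int          # "image ID" column
    storage_location: str  # "image data recording location" column
    destination: str       # "destination address" column
    source: str            # "source address" column


class ImageTransferTable:
    """Hypothetical in-memory stand-in for the table in internal storage unit 320."""

    def __init__(self):
        self._next_id = count(1)  # serial numbering is one option the text mentions
        self._records = {}

    def register(self, storage_location, destination, source):
        """Assign a new image ID and store the associated row."""
        record = ImageRecord(next(self._next_id), storage_location,
                             destination, source)
        self._records[record.image_id] = record
        return record

    def lookup(self, image_id):
        """Return the row for an image ID (raises KeyError if unknown)."""
        return self._records[image_id]


table = ImageTransferTable()
rec = table.register("/images/0001.jpg", "terminal-200", "terminal-100")
```

Keying the rows by image ID mirrors how the server later retrieves the first image data from the image ID that the second terminal sends back.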

When the image receiving unit 310c has recorded the first image data and the associated data in the internal storage unit 320, the image transmission unit 310a transmits, via the communication unit 350 and the network 900, the first image data, the source address data that the image receiving unit 310c received together with the first image data, and the data indicating the image ID (step S112). These are transmitted to the terminal (here, the second terminal 200) at the address indicated by the destination address data corresponding to the first image data, that is, the destination address data that the image receiving unit 310c received together with the first image data.

The image receiving unit 210c of the second terminal 200 receives, via the communication unit 250, the first image data, the source address data, and the data indicating the image ID transmitted from the server 300 (step S113). When the image receiving unit 210c receives these data, the display control unit 210d displays on the display unit 240, for example, an image containing a mail mark or the like indicating that the first image data has been received, thereby notifying the second user that the first image data has been received (step S114). The image receiving unit 210c records the source address data, the first image data, and the data indicating the image ID in the internal storage unit 220 in association with one another. The second user is the user of the second terminal 200.

Next, the second process of the image transmission/reception process (the process of returning a third image, described later, to the first terminal 100) will be described with reference to FIG. 10. In this process, the first terminal 100, the second terminal 200, and the server 300 perform the second process of image transmission/reception processes A through C, respectively. The second terminal 200 starts the second process of image transmission/reception process B when the second user operates the operation unit 230 to request the start of the second process (for example, a request to display on the display unit 240 one of the images represented by the received image data). The first terminal 100 and the server 300 are assumed to have started the second process of image transmission/reception processes A and C and to be on standby until they receive image data.

First, the display control unit 210d acquires the first image data (for example, the first image data designated by the user) from, for example, the internal storage unit 220, supplies the acquired first image data to the display unit 240, and displays the first image represented by the first image data on the display unit 240 (step S201).

When the display control unit 210d displays the first image on the display unit 240, the imaging control unit 210e controls the imaging unit 270 to capture the area in front of the display unit 240 (step S202). On starting capture, the imaging control unit 210e, for example, turns on the aforementioned lamp or the like provided in the imaging unit 270 to notify the second user that capture has started.

Capture is performed for a predetermined period (for example, 30 seconds). The imaging control unit 210e captures still images at a predetermined time interval (for example, every second) and sequentially records the resulting captured image data (image data representing the still images) in the internal storage unit 220.

The imaging control unit 210e may also supply the sequentially supplied captured image data to the display unit 240 and display the still images it represents, in sequence, in a predetermined display area of the display screen of the display unit 240 (for example, the upper right corner as viewed facing the screen). This lets the second user check his or her own motion, described below, on the screen. In addition, the image generation unit 210b may perform the hand region detection described below in real time, and when a hand region is detected, the imaging control unit 210e may turn on a predetermined lamp to give the second user feedback that the hand region has been detected.

During this capture, the second user, who is in front of the display unit 240, looks at the first image displayed on the display unit 240 and performs a predetermined hand gesture in front of the display unit 240. A gesture here is a hand motion or hand sign made by the user, for example "clapping", "stroking", "pointing", "waving", or "wrapping". The second user directs the gesture at the desired face in the first image and is assumed to perform only one of these motions. The imaging control unit 210e thus captures the hand gesture using the imaging unit 270.

When the imaging control unit 210e has captured images for the predetermined period, that is, when capture has finished, the image generation unit 210b generates a second image representing the hand region (step S203). The image generation unit 210b generates the second image from the captured images taken by the imaging control unit 210e (the captured image data recorded in the internal storage unit 220).

For example, the image generation unit 210b analyzes each captured image and detects the hand region (the second user's hand) by separating the captured image into the region of the second user's hand and the remaining region. Image recognition techniques of the kind used to operate home appliances with hand gestures can be used for this detection. The hand region is detected by, for example, template matching using hand template images. The template image data is assumed to be stored in advance in the internal storage unit 220. The template images correspond to the hand gestures the second user is expected to perform: an image of a hand stroking a head (the "stroking" gesture), an image of a hand pointing a finger (the "pointing" gesture), an image of hands clapping (the "clapping" gesture), an image of a hand waving (the "waving" gesture), an image of hands wrapping around a face (the "wrapping" gesture), and so on. If the hand region cannot be detected with one template image, the other template images are used in turn. If the hand region cannot be detected with any of the template images, no hand region is detected in that captured image.

When the image generation unit 210b detects a plurality of hand regions in a captured image, it selects, for example, the largest of them. This is because that hand is considered to be closest to the imaging unit 270 and is therefore likely to be the second user's hand.
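The selection rule above can be sketched as follows. This is an illustration only: the specification does not say how "largest" is measured, so the sketch assumes each detected hand region is summarized by a hypothetical bounding box `(x, y, width, height)` and compares bounding-box areas.

```python
def largest_region(regions):
    """Return the region with the greatest bounding-box area, or None if empty.

    regions: list of (x, y, width, height) tuples (an assumed representation).
    """
    if not regions:
        return None
    return max(regions, key=lambda r: r[2] * r[3])


# Three candidate detections in one captured image (made-up numbers).
candidates = [(10, 10, 40, 50), (200, 30, 90, 120), (5, 5, 20, 20)]
chosen = largest_region(candidates)
```

Comparing pixel counts of the segmented regions instead of bounding boxes would follow the same pattern with a different `key` function.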

On detecting the hand region, the image generation unit 210b separates the captured image into the hand region and the remaining region and generates an image in which only the hand region is extracted.

Having generated this image, the image generation unit 210b flips it (mirrors it left to right). Because the imaging unit 270 captures the area in front of the display unit 240, that is, captures the hand from the display unit 240 side, the hand as seen by the second user is the reverse of the hand as actually captured. Next, the image generation unit 210b fills the flipped image with black, converting it into a silhouette-like image (the hand region image).

The image generation unit 210b may instead use, as the hand region image, an image in which predetermined lines are drawn inside the silhouette to depict the fingers and so on. The hand region image also need not be filled with black.
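The mirroring and silhouette steps can be sketched as below. This is a minimal illustration, assuming the extracted hand region is available as a binary mask (a list of rows with 1 for hand pixels and 0 elsewhere); the mask representation, the function name, and the grey levels are assumptions made for the example.

```python
def to_silhouette(mask):
    """Mirror a binary hand mask left to right and paint it black on white.

    mask: list of rows; 1 where the pixel belongs to the hand, 0 elsewhere.
    Returns rows of grey levels: 0 (black) for the hand, 255 (white) elsewhere.
    """
    mirrored = [list(reversed(row)) for row in mask]
    return [[0 if px else 255 for px in row] for row in mirrored]


mask = [[1, 0, 0],
        [1, 1, 0]]
silhouette = to_silhouette(mask)
# silhouette == [[255, 255, 0], [255, 0, 0]]
```

The flip is applied before filling so that the silhouette matches the hand as the viewer in front of the display sees it rather than as the camera sees it.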

An example of an image generated by the above method will be described with reference to FIG. 11. As shown in FIG. 11, the hand region image 30 is a silhouette-like image, represented for example as a bitmap, in which the hand region is filled with black. The rectangle in FIG. 11 indicates the outer edge of the original captured image. The hand region image data representing this hand region image includes, as appropriate, position and direction data indicating the position of the hand region in the original captured image (the position of the center of the hand region), its direction (the direction of the long axis of the hand region), and so on. That is, the hand region image data makes it possible to determine the position and so on of the hand region image within the original captured image.

The series of processes described above is performed on each captured image in chronological order. In the template matching, the template image used when a hand region was first detected is used for template matching in the subsequent captured images. The set of hand region images constitutes the second image. That is, the above series of processes generates a second image that contains an image of a human hand and consists of a plurality of hand region images. The image data representing the second image (hereinafter, the second image data) is, for example, a set of hand region image data items, each representing one of the hand region images (arranged in chronological order) that constitute the second image.

Having generated the second image, that is, the second image data, the image generation unit 210b generates data (hand type data) specifying what gesture the hand in each still image constituting the second image is performing (the gesture type). This can be determined from, for example, the type of template image that was used when the hand region was detected in the template matching described above.
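The per-frame template selection can be sketched as follows: templates are tried in turn until one matches, the first successful template is then reused for later frames, and the template's type doubles as the gesture type. The matcher is deliberately left abstract; `matches(frame, gesture)` is a hypothetical callback standing in for whatever template matching is actually used, and the gesture labels are illustrative names.

```python
GESTURES = ["stroke", "point", "clap", "wave", "wrap"]


def classify_frames(frames, matches):
    """Return the gesture type found in each frame (None if no hand is found).

    frames:  captured frames in chronological order.
    matches: hypothetical callback, matches(frame, gesture) -> bool.
    """
    locked = None   # template type that first produced a detection
    result = []
    for frame in frames:
        # Before the first detection, try every template in turn;
        # afterwards, use only the template that succeeded first.
        candidates = [locked] if locked else GESTURES
        found = next((g for g in candidates if matches(frame, g)), None)
        if found and locked is None:
            locked = found
        result.append(found)
    return result


# Toy matcher: each "frame" is just the set of gestures visible in it.
frames = [set(), {"wave"}, {"clap", "wave"}, {"clap"}]
types = classify_frames(frames, lambda frame, g: g in frame)
# types == [None, "wave", "wave", None]
```

Locking onto the first successful template is consistent with the assumption in the text that the second user performs only one kind of gesture during a capture session.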

The image transmission unit 210a transmits the second image data generated by the image generation unit 210b to the server 300 via the communication unit 250 and the network 900, together with the hand type data and data specifying the image ID of the first image currently displayed on the display unit 240 (this data is recorded in, for example, the internal storage unit 220, from which the image transmission unit 210a acquires it) (step S204). At this time, the image transmission unit 210a also transmits, together with the second image data, the source address data recorded in the internal storage unit 220 in association with the data specifying the image ID, and the address data set for the second terminal 200 itself (the aforementioned destination address data).

The image receiving unit 310c of the server 300 receives, via the communication unit 350, the second image data, the hand type data, the data specifying the image ID, the source address data, and the destination address data transmitted by the image transmission unit 210a (step S205). When the image receiving unit 310c receives these data, the image composition unit 310b refers to the image transmission/reception data table stored in the internal storage unit 320, identifies the first image data corresponding to the data specifying the image ID received by the image receiving unit 310c, and acquires that first image data from the internal storage unit 320. It then combines the first image represented by the acquired first image data with the second image represented by the second image data received by the image receiving unit 310c to generate a third image (step S206).

The processing in step S206 will now be described in detail with reference to FIG. 12.

The image composition unit 310b performs face recognition on the acquired first image data (step S301). Face recognition is performed by, for example, template matching using a template image of a face.

The image composition unit 310b specifies a recognized face as the reference face (step S302). Specifically, when only one face is recognized, the image composition unit 310b specifies that face as the reference face (the face targeted in the image composition described later). When a plurality of faces are recognized, the image composition unit 310b specifies the largest face as the reference face, because the largest face is considered to be the main subject of the first image and the face the second user or the like pays attention to. For example, when a plurality of faces are recognized as in FIG. 13(a), the largest face is specified as the reference face; in FIG. 13(a), the face enclosed by the dotted line is the reference face. Such a face is considered highly likely to be the target of the second user's gesture (for example, the face of the person whose head is stroked). This method therefore improves the accuracy with which the reference face is specified, and it also allows the reference face to be specified simply.

When a plurality of faces are recognized, the image composition unit 310b may instead specify, for example, a face registered in advance (for example, the face of the second user or the first user, or of a friend, family member, or acquaintance) as the reference face. In this case, image data of a template image of a given person's face is recorded for each image ID, and for each image ID the image composition unit 310b performs template matching with the template image represented by the corresponding image data. Such a face is considered highly likely to be the target of the second user's gesture. This method therefore improves the accuracy with which the reference face is specified.

Alternatively, when a plurality of faces are recognized, the image composition unit 310b may, for example, overlay the hand region image on the first image at the position and in the direction given by the data, included in the hand region image data, that specifies the position, direction, and so on in the captured image (the captured image and the first image are assumed to have the same image size), and specify the face nearest the hand region image as the reference face. If the hand region image shows a pointing hand, it may instead specify as the reference face the face that lies on the extension of the tapering part of the hand region and is nearest to that part. If the hand region image shows wrapping hands, it may specify the face lying between the two hand regions as the reference face. A face specified in this way, by a method matched to the shape of the hand region in the hand region image (the gesture type), is considered highly likely to be the target of the second user's gesture (for example, the face of the person whose head is stroked, the face being pointed at, or the face being wrapped by the hands). These methods therefore improve the accuracy with which the reference face is specified. The shape of the hand region (the gesture type) is identified from the hand type data.
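The first of these alternatives, choosing the face nearest the overlaid hand region, can be sketched as below. The sketch assumes hypothetical face bounding boxes `(x, y, width, height)` and measures proximity as the Euclidean distance between the face center and the hand region center; the specification does not define the distance measure, so this is one plausible reading. As in the text, the captured image and the first image are taken to share the same size, so hand coordinates can be used directly.

```python
import math


def center(box):
    """Center point of an (x, y, width, height) bounding box."""
    x, y, w, h = box
    return (x + w / 2, y + h / 2)


def nearest_face(faces, hand_center):
    """Return the face box whose center is closest to the hand region center."""
    return min(faces, key=lambda f: math.dist(center(f), hand_center))


faces = [(0, 0, 100, 100), (300, 0, 100, 100)]   # made-up face boxes
reference = nearest_face(faces, hand_center=(320, 40))
```

The pointing and wrapping variants would replace the distance key with a test along the finger's extension line or a test for lying between two hand regions, respectively.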

When a plurality of faces are recognized, the image composition unit 310b may also, for example, specify a face that is facing forward as the reference face, or specify a face in the central region of the first image as the reference face. In either case, the face specified by these methods is considered highly likely to be the target of the second user's gesture. These methods therefore improve the accuracy with which the reference face is specified.

Next, the image composition unit 310b identifies, in the first image, any face region of about the same size as the face specified as the reference face (a region whose area differs from that of the reference face by no more than a threshold) as a face region on which the second image should not be superimposed (step S303). Like the reference face, such a face belongs to a person who is important to the second user or the like, and superimposing the second image over more than a predetermined area of it, as described later, would hide that person's face. For example, as shown in FIG. 13(b), a face 12 or 14 of about the same size as the reference face 11 or 13 (a face within a dotted line) is identified as such a face region (a face within a dotted line).
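The area comparison in step S303 can be sketched as follows, again assuming hypothetical face bounding boxes `(x, y, width, height)` and using bounding-box area as the face size. The threshold value and the box representation are assumptions for illustration; the specification states only that the area difference must not exceed a threshold.

```python
def area(box):
    """Area of an (x, y, width, height) bounding box."""
    _, _, w, h = box
    return w * h


def protected_faces(faces, ref_index, threshold):
    """Indices of faces, other than the reference face, whose area differs
    from the reference face's area by no more than the threshold."""
    ref_area = area(faces[ref_index])
    return [i for i, f in enumerate(faces)
            if i != ref_index and abs(area(f) - ref_area) <= threshold]


# Areas: 12000 (reference), 11210, 1200 (made-up numbers).
faces = [(0, 0, 100, 120), (200, 0, 95, 118), (400, 50, 30, 40)]
keep_clear = protected_faces(faces, ref_index=0, threshold=1000)
# keep_clear == [1]
```

An empty result corresponds to the case in which no other face is comparable in size to the reference face and processing simply continues.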

The face region here may also be identified using the methods used above to specify the reference face (in particular, the methods other than those that specify the reference face according to the gesture type).

Step S303 is performed when a plurality of faces are recognized; when only one face is recognized, the image composition unit 310b skips it. The image composition unit 310b also proceeds to the next process when none of the faces other than the reference face is about the same size as the face specified as the reference face.

Next, the image composition unit 310b superimposes (composites) the second image on the first image. Before doing so, it determines the position in the first image at which the second image is to be superimposed and the size of the second image (step S304).

This process is described in detail with reference to FIG. 14.

First, the image composition unit 310b calculates the length of the reference face image along its long side in the first image (for example, the long-side length of the rectangle enclosing the reference face image) (step S401). It also calculates the average long-side length of the hand region images contained in the second image (for example, the average long-side length of the rectangles enclosing the hand region images) (step S402). The long-side length is calculated in the same way even when the hand in a hand region image is clenched.

Next, the image composition unit 310b obtains a size coefficient (step S403). From the two values calculated above, it computes size coefficient = A × (long-side length of the reference face image) / (average long-side length of the rectangles enclosing the hand region images). A is a predetermined constant, set in advance according to what multiple of the long-side length of the reference face image the average long-side length of the hand region images should become.

The image composition unit 310b resizes each hand region image contained in the second image by multiplying its size (for example, its vertical and horizontal dimensions) by the size coefficient (step S404). As a result, the average long-side length of the hand region images becomes A times the long-side length of the reference face.
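Steps S403 and S404 amount to the following arithmetic. This is a minimal sketch; the function names and the sample value of A are assumptions, and real hand region images would be resampled rather than represented by their dimensions alone.

```python
def size_coefficient(face_long_side, hand_long_sides, a):
    """Size coefficient = A * (face long side) / (mean hand long side)."""
    mean_hand = sum(hand_long_sides) / len(hand_long_sides)
    return a * face_long_side / mean_hand

def resized_dimensions(dimensions, coefficient):
    """Scale each (width, height) pair by the size coefficient."""
    return [(round(w * coefficient), round(h * coefficient)) for w, h in dimensions]

# Example: a face 120 px long, two hand images 200 px and 280 px long, A = 0.5.
coeff = size_coefficient(120, [200, 280], 0.5)                   # 0.25
hands = resized_dimensions([(200, 100), (280, 140)], coeff)      # [(50, 25), (70, 35)]
```

After resizing, the mean long side of the hand images is (50 + 70) / 2 = 60 px, which is A times the 120 px face length.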

Next, the image composition unit 310b overlays the resized hand region images on one another with their centers aligned, obtains a collective region that contains all of them (for example, the union of the overlaid hand region images), and superimposes the collective region on the first image (step S405). The position on the first image at this point (the center position of the collective region) is determined, for example, according to the gesture type specified by the hand type data.
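One way to form the union of center-aligned hand regions is sketched below. Representing each hand region as a binary mask and the collective region as a set of pixel offsets from the shared center is an assumption of this example, not something the description prescribes.

```python
def centered_offsets(mask):
    """Convert a binary mask (rows of 0/1) into pixel offsets from the mask centre."""
    height, width = len(mask), len(mask[0])
    cx, cy = width // 2, height // 2
    return {
        (x - cx, y - cy)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value
    }

def collective_region(masks):
    """Union of all hand masks after aligning their centres."""
    region = set()
    for mask in masks:
        region |= centered_offsets(mask)
    return region
```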

For example, for the "stroking" gesture, the hand is normally located above the reference face, so the center of the collective region is placed at a first distance above the center of the reference face.

For the "pointing" gesture, the finger normally points at the face, so the center of the collective region is placed at a position shifted by a second distance from the center of the reference face in the direction opposite to the direction in which the collective region tapers.

For the "applause" gesture, it is enough that both hands do not overlap the reference face, so the center of the collective region is placed at a position shifted by a third distance in an arbitrary direction from the center of the reference face.

For the "waving" gesture, it is likewise enough that both hands do not overlap the reference face, so the center of the collective region is placed at a position shifted by a fourth distance in an arbitrary direction from the center of the reference face.

For the "enveloping" gesture, both hands are to enclose the reference face, so the positions shifted by a fifth distance to the left and to the right of the center of the reference face are used as the center positions of the two hands in the collective region.
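The gesture-dependent placement described above could be expressed as a small lookup. The gesture labels, the `direction` parameter, and the convention that image y coordinates grow downward are assumptions of this sketch; the first to fifth distances are passed in as `distance` because the description gives no values for them.

```python
def initial_centers(gesture, face_center, distance, direction=(0.0, 1.0)):
    """Return the initial centre position(s) of the collective region.

    direction is a unit vector: the taper direction for "point",
    or a freely chosen direction for "clap" and "wave".
    """
    fx, fy = face_center
    dx, dy = direction
    if gesture == "stroke":
        return [(fx, fy - distance)]                      # above the face
    if gesture == "point":
        return [(fx - dx * distance, fy - dy * distance)]  # opposite to the taper
    if gesture in ("clap", "wave"):
        return [(fx + dx * distance, fy + dy * distance)]  # any direction
    if gesture == "envelop":
        return [(fx - distance, fy), (fx + distance, fy)]  # one centre per hand
    raise ValueError(f"unknown gesture: {gesture}")
```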

Next, the image composition unit 310b determines whether the superimposed collective region overlaps the reference face and the face regions identified above as regions on which the second image should not be superimposed by a predetermined area or more (step S406). If it does not (step S406; NO), the process proceeds to step S408. If it does (step S406; YES), the collective region is moved by the minimum distance to a position where it no longer overlaps them by the predetermined area or more (step S407). In step S407, the image composition unit 310b shifts the collective region in the first image from its original position up, down, left, and right by 1 pixel, 2 pixels, 3 pixels, and so on. The first position at which the collective region no longer overlaps the reference face and the face regions on which the second image should not be superimposed by the predetermined area or more is taken as the minimum-distance position, and the collective region is moved there. The center position of the collective region in the first image at that point is then recorded. When the collective region consists of two separate regions, one for each hand, the two are moved independently as appropriate.
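The search in steps S406 and S407 can be sketched with rectangles standing in for the collective region and the faces. The `(x, y, w, h)` format, the function names, the `max_shift` cap, the order in which the four directions are tried, and the choice to test each face separately against the area limit are all assumptions of this example.

```python
def overlap_area(a, b):
    """Overlap in pixels between two (x, y, w, h) rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    w = min(ax + aw, bx + bw) - max(ax, bx)
    h = min(ay + ah, by + bh) - max(ay, by)
    return max(w, 0) * max(h, 0)

def acceptable(region, faces, area_limit):
    """True if the region overlaps every face by less than area_limit."""
    return all(overlap_area(region, face) < area_limit for face in faces)

def nearest_free_position(region, faces, area_limit, max_shift=10_000):
    """Shift the region up, down, left, right by 1, 2, 3, ... pixels and
    return the first position that is acceptable (None if none is found)."""
    if acceptable(region, faces, area_limit):
        return region
    x, y, w, h = region
    for k in range(1, max_shift + 1):
        for dx, dy in ((0, -k), (0, k), (-k, 0), (k, 0)):
            candidate = (x + dx, y + dy, w, h)
            if acceptable(candidate, faces, area_limit):
                return candidate
    return None
```

A collective region made of two separate hand regions would call `nearest_free_position` once per region.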

Next, the image composition unit 310b determines whether the collective region superimposed on the first image extends beyond the first image (step S408). If it does not (step S408; NO), the process of step S305 is performed. If it does (step S408; YES), the image composition unit 310b determines a reduction ratio at which the reduced collective region fits within the first image at the largest possible size (the left and right reduction ratios are kept the same). To do so, it reduces the collective region step by step, applying progressively stronger reduction, and takes the first ratio at which the reduced collective region no longer extends beyond the first image as the reduction ratio. The reduction is performed about a predetermined point on the side close to the reference face, so the center position of the collective region after reduction differs from that before reduction, and the image composition unit 310b also determines this center position. In this way, the image composition unit 310b determines both the reduction ratio and the center position of the collective region in the first image (step S409), and then performs the process of step S305.
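The stepwise reduction of steps S408 and S409 might look like the sketch below. The rectangle format, the 1% step size, and the function name are assumptions; the fixed point (`anchor`) stands for the predetermined point near the reference face and is assumed to lie inside the first image.

```python
def fit_by_shrinking(region, anchor, image_size, step_percent=1):
    """Shrink an (x, y, w, h) region about a fixed anchor point until it fits.

    Returns (scale, new_center) for the first scale at which the region lies
    inside the image, or None if no tested scale fits.
    """
    x, y, w, h = region
    ax, ay = anchor
    img_w, img_h = image_size
    for percent in range(100, 0, -step_percent):
        s = percent / 100
        # Scale the top-left corner about the anchor, and the size uniformly.
        nx, ny = ax + (x - ax) * s, ay + (y - ay) * s
        nw, nh = w * s, h * s
        if nx >= 0 and ny >= 0 and nx + nw <= img_w and ny + nh <= img_h:
            return s, (nx + nw / 2, ny + nh / 2)
    return None
```

Because the anchor rather than the region center stays fixed, the returned center generally differs from the original one, as the description notes.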

Examples of the relationship between the collective region and the reference face are described with reference to FIGS. 15 to 18. For convenience, each collective region is drawn schematically as a simple figure such as a rectangle. Also, when there is one face (one reference face),

FIG. 15 illustrates the "stroking" gesture. In (a), the collective region (black rectangle) overlaps the reference face, so the image composition unit 310b moves it (see the dotted rectangle in the lower diagram); because the moved collective region extends beyond the first image, it is also reduced. In (b), the collective region (black rectangle) likewise overlaps the reference face and the image composition unit 310b moves it (see the lower diagram). In (b) the reference face is small and the collective region is correspondingly small, so the moved collective region does not extend beyond the first image.

FIG. 16 illustrates the "pointing" gesture. In (a), the collective region (black polygon) overlaps the reference face, so the image composition unit 310b moves it (see the polygon in the lower diagram); because the moved collective region extends beyond the first image, it is also reduced. In (b), the collective region (black polygon) likewise overlaps the reference face and the image composition unit 310b moves it (see the lower diagram). In (b) the reference face is small and the collective region is correspondingly small, so the moved collective region does not extend beyond the first image.

FIG. 17 illustrates the "applause" or "waving" (waving with both hands) gesture. In (a), the collective region (black rectangle) overlaps the reference face, so the image composition unit 310b moves it (see the dotted rectangle in the lower diagram); because the moved collective region extends beyond the first image, it is also reduced. In (b), the collective region (black rectangle) likewise overlaps the reference face and the image composition unit 310b moves it (see the lower diagram). In (b) the reference face is small and the collective region is correspondingly small, so the moved collective region does not extend beyond the first image. In this case, both hands together form a single collective region.

FIG. 18 illustrates the "enveloping" gesture. In both (a) and (b), the collective region (black rectangles) overlaps the reference face, so the image composition unit 310b moves it (see the lower diagrams). In neither case does the moved collective region extend beyond the first image. For a gesture of this kind, the collective region includes two regions separated from each other, and the image composition unit 310b moves, reduces, and otherwise processes each region.

The movement, reduction, and similar operations on the images and regions described above are performed, for example, by coordinate transformation of each pixel.

Next, the image composition unit 310b composites the first image with the second image to generate the third image (step S305), using the first image data and the second image data. For example, it superimposes each hand region image represented by the second image data on the first image to generate a plurality of third images, so the number of third images generated equals the number of hand region images. The hand region images superimposed are those produced in step S404, that is, scaled (reduced or enlarged) by the size coefficient. Each hand region image is superimposed so that its center coincides with the center position of the collective region determined in steps S405, S407, and S409. If a reduction ratio has been determined, each hand region image is first reduced by that ratio and then superimposed. Because the collective region does not overlap the reference face or the face regions on which the second image should not be superimposed by the predetermined area or more, neither do the individual hand region images that make it up. In this way, the image composition unit 310b generates a plurality of third images in which the second image (each of the hand region images that make it up) is composited with the first image.
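The per-frame compositing of step S305 can be sketched as follows. Images are modeled here as nested lists of pixel values and hand region images as binary masks painted in a single silhouette value; these representations, the function name, and the `shadow` parameter are assumptions for illustration.

```python
def composite_frames(first_image, hand_masks, center, shadow=0):
    """Return one composited copy of first_image per hand mask.

    Each mask is pasted so that its centre lands on `center` (x, y);
    mask pixels falling outside the image are ignored.
    """
    cx, cy = center
    height, width = len(first_image), len(first_image[0])
    frames = []
    for mask in hand_masks:
        frame = [row[:] for row in first_image]  # leave the original untouched
        mask_h, mask_w = len(mask), len(mask[0])
        left, top = cx - mask_w // 2, cy - mask_h // 2
        for my, row in enumerate(mask):
            for mx, value in enumerate(row):
                px, py = left + mx, top + my
                if value and 0 <= px < width and 0 <= py < height:
                    frame[py][px] = shadow
        frames.append(frame)
    return frames
```

The number of returned frames equals the number of hand masks, matching the statement that there are as many third images as hand region images.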

When the image composition unit 310b has generated the third image, the image transmission unit 310a transmits the generated third image data to the source (the first terminal 100) indicated by the source address data recorded in the internal storage unit 320 (the data recorded in association with the data identifying the image ID transmitted from the second terminal 200) (step S207). At this time, the image transmission unit 310a may also transmit destination address data together with the third image data as appropriate, which allows the first user at the first terminal 100 to identify who sent the reply. The image composition unit 310b transmits these data via the communication unit 350 and the network 900.

Next, when the image reception unit 110c receives the third image data via the communication unit 150 (step S208), the display control unit 110b supplies the third image data to the display unit 140 and displays the third image represented by the data on the display unit 140 (step S209). Because there are several third images, they are displayed on the display unit 140 one after another in time series, in the manner of a slide show or a moving image. The movement of the second user's hand is thus reproduced, which makes the image displayed on the display unit 140 an interesting one.

In this way, a third image in which the image of the second user's gesture (the second image) is superimposed on the first image is supplied to the first terminal 100 and displayed there as a reply image to the first image.

FIG. 19 shows an outline of the second process of the image transmission and reception processing.

As shown in FIG. 19, when the first image (the image transmitted from the first terminal 100) is displayed on the display unit 240 of the second terminal 200, the second user makes a gesture in front of the display unit 240, for example one that strokes the face of a person in the first image. The imaging unit 270 captures this. The second terminal 200 generates the second image from the captured image and transmits it to the server 300. The server 300 composites the second image with the first image and transmits the resulting third image to the first terminal 100, which displays it on the display unit 140. The position, size, and so on of the image of the second user's hand are changed according to the size of the subject (face) in the first image. Accordingly, the hand 50 (silhouette) is reduced in (a) but not in (b).

With this configuration, when replying to a photograph (the first image) sent via the network 900, the imaging unit 270 of the second terminal 200 captures the hand gesture of the person replying (the second user), and a third image is generated in which the gesture is superimposed on the first image as a moving silhouette (the second image). Rich, animated content can thus be created easily, without an input device such as a keyboard or mouse, and a reply can be sent easily and enjoyably. This is especially true when the first terminal 100 or the like is a digital photo frame. Image recognition of hands and faces is used to generate the third image. Because the extracted hand movement is automatically adjusted to the position and size of the face before being superimposed, an appropriately placed silhouette is generated without the user having to specify a precise position or size.

Even as image display devices such as digital photo frames gain functionality, design and other requirements mean they are not equipped with rich input devices such as keyboards, so it has been difficult to reply with a message or animated content even when a photograph is received via the network 900. In this embodiment, the imaging unit 270 photographs the hand to generate the silhouette used in the reply. Because the position and size at which the silhouette is superimposed are determined automatically from face recognition information, rich, animated reply content can be generated with no special input device other than the imaging unit 270. Because the hand can move freely, far more variation is possible than with selection-based methods that reply with fixed images such as canned messages or stamps. Moreover, the imaging unit 270 only needs to capture images good enough for the outline of the hand to be recognized, so it can double as an imaging unit that uses a low-pixel-count image sensor, such as one for QR code recognition.

In addition, because the second image representing the hand region, rather than the captured image itself, is sent to the server, the amount of data sent from the second terminal 200 to the server 300 is kept small. Data can therefore be transmitted fast enough even when the second terminal 200 is on a slow, inexpensive network line with limited upstream bandwidth. If the first terminal 100 already holds the first image, sending only the second image and having the first terminal 100 generate the third image at playback time would also reduce the downstream network load from the server 300 to the first terminal 100.

In this embodiment, with the above configuration, the server 300 (image processing apparatus) includes a first image acquisition unit (here, the image reception unit 310a) that acquires (receives) the first image, and an image composition unit 310b that acquires the second image and composites it with the first image acquired by the image reception unit 310a. The image composition unit 310b detects, in the first image, a first region that satisfies a predetermined criterion (here, a face region, including the region of the reference face), superimposes the second image on a region in the vicinity of the detected first region, and generates a third image in which the first and second images are composited.

In this embodiment, with the above configuration, the reply image generation system 1 generates a reply image to the first image based on the first image that the first terminal 100 transmits to the second terminal 200. The system includes an image generation unit 210b that acquires a captured image of the area in front of the display surface of the second terminal 200 while the first image is displayed, extracts a predetermined image from the captured image, and generates a second image representing the extracted image, and an image composition unit 310b that composites the first image with the second image generated by the image generation unit 210b. The image composition unit 310b detects, in the first image, a first region that satisfies a predetermined criterion, superimposes the second image on a region in the vicinity of the detected first region, and generates a third image in which the first and second images are composited.

With this configuration, the second image is superimposed on a region in the vicinity of a first region that satisfies a predetermined criterion, so a given image can be processed into an interesting one.

Also with this configuration, the image composition unit 310b adjusts the size of the second image according to the size of the first region before superimposing it on a region in the vicinity of the first region. Even when the second image is too large or too small relative to the first region, it can therefore be superimposed at an appropriate size, so an interesting image can be generated reliably.

In the above description the first image shows a person's face, but it may be another kind of image; likewise, the second image represents a person's hand, but it may be another kind of image. The first region is a region containing a person's face, but it may be another kind of region.

In the above description, the second image includes a plurality of still images, and the image composition unit 310b generates a plurality of third images by superimposing each of those still images on the first image. This yields third images with movement, which makes the result more interesting.

The image composition unit 310b superimposes the second image on the first image so that it does not overlap the first region by a predetermined area or more (including the case of no overlap at all). The relationship between the first region and the second image is therefore clear in the third image, which makes the result more interesting.

The first image is a transmitted image that the first terminal 100 sent to the second terminal 200 via the network 900, and the third image is a reply image to that transmitted image, displayed by the first terminal 100. An interesting image can thus be generated as the reply image.

Although the third image has been described as consisting of a plurality of still images, it may instead be a so-called dynamic photo or the like, in which the hand region images are superimposed one after another on a single first image. The third image may also take a form such as a slide show playlist. The third image may thus be a single image, and its format is not limited.

The second image may also consist of a single still image. In that case, the above processing is performed on that one still image, and the third image is likewise a single still image.

Each component of the first terminal 100, the second terminal 200, and the server 300 may instead be provided in another device. For example, the server 300 may include the image generation unit 210b. In this case, the second terminal 200 transmits the captured image as is to the server 300, and the image generation unit of the server 300 generates the third image. Similarly, the image composition unit 310b of the server 300 may be provided in the second terminal 200. In this case, the third image is generated at the second terminal 200 and returned to the first terminal 100 via the server 300. Likewise, the first terminal 100 may include the image generation unit 210b and the image composition unit 310b, in which case the captured image is supplied to the first terminal 100, which generates the second and third images. Any of the units may thus be provided in any of the devices. Depending on the configuration, therefore, the image processing apparatus described above may be the first terminal 100 or the second terminal 200. The term "acquisition" as used above covers not only acquiring an image or the like from a storage unit or via a communication unit, but also acquiring it by generating it.

The program may cooperate with an OS (operating system) to cause the CPU 111 (CPU 211 or CPU 311) to perform the image transmission and reception processing A (B or C) described later. In this case, the OS is also recorded in the flash memory 121 (flash memory 221 or hard disk 321). The program may be recorded on a portable storage medium (for example, a CD-ROM (Compact Disk Read Only Memory) or DVD-ROM (Digital Versatile Disk Read Only Memory)), supplied to the first terminal 100 (second terminal 200 or server 300), and recorded in the flash memory 121 (flash memory 221 or hard disk 321). The program may also be supplied to the first terminal 100 (second terminal 200 or server 300) via the network 900 and recorded in the flash memory 121 (flash memory 221 or hard disk 321).

A storage device such as the flash memory 121 (flash memory 221 or hard disk 321), or a portable storage medium, on which the program is recorded constitutes a computer-readable storage medium storing the program. The program may operate a device (computer) in which at least some of the functions are implemented by dedicated circuits. In other words, it suffices that the first terminal 100 (second terminal 200 or server 300) as a whole performs the image transmission and reception processing A (B or C) described later, and that the program operates such a first terminal 100 (second terminal 200 or server 300).

In the present embodiment, the second image is superimposed on the first image, but simply superimposing it as it is does not yield an appropriate third image. For example, depending on the sizes and relative positions of the hand, the terminal, and the imaging unit, the hand may be captured too large, so that superimposing the second image unchanged hides most of the first image. Conversely, if a terminal such as a large-screen TV is fitted with a camera that covers the entire area in front of the screen, the hand becomes too small.

Therefore, the following rules govern the placement of the second image. When the second image is placed, it must not hide a predetermined face in the first image. When the first image contains a plurality of faces, the largest one is taken as the reference face, and the second image is placed so as not to hide any face of about the same size as the reference face. In addition, when the face is large, the silhouette hand is enlarged accordingly.
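
The placement rules above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the embodiment's actual implementation: the `Box` type, the function names, the similarity threshold `similar_ratio` (0.7), and the hand-to-face ratio `hand_to_face` (1.0) are hypothetical. The description states only that the largest face is the reference face, that faces of about the same size must not be hidden, and that the hand is enlarged when the face is large.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixels (hypothetical helper type)."""
    x: int
    y: int
    w: int
    h: int

    @property
    def area(self) -> int:
        return self.w * self.h


def protected_faces(faces: List[Box], similar_ratio: float = 0.7) -> Tuple[Box, List[Box]]:
    """Return the reference face and every face that must not be hidden.

    The reference face is the largest detected face. A face counts as
    "about the same size" when its area is at least similar_ratio times
    the reference area; this threshold is an assumed value.
    """
    if not faces:
        raise ValueError("no faces detected")
    reference = max(faces, key=lambda f: f.area)
    keep = [f for f in faces if f.area >= similar_ratio * reference.area]
    return reference, keep


def hand_scale(reference: Box, hand_height: int, hand_to_face: float = 1.0) -> float:
    """Scale factor that makes the hand height proportional to the face height."""
    if hand_height <= 0:
        raise ValueError("hand_height must be positive")
    return hand_to_face * reference.h / hand_height


# Two foreground faces and one small background face.
faces = [Box(40, 30, 100, 100), Box(200, 40, 90, 90), Box(330, 60, 30, 30)]
reference, keep = protected_faces(faces)
scale = hand_scale(reference, hand_height=400)
```

With these sample values, the small background face is left unprotected, while the captured hand is shrunk so that its height matches the reference face.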

Many kinds of second image could be added to the first image, but restricting attention to the situation of this embodiment, "replying to a received image", narrows the range considerably. Images given to others in albums and the like (including analog photographs), such as photographs of a trip taken together or of an event involving grandchildren who live far away, have in most cases shown people. It is therefore presumed that, in the overwhelming majority of cases, the image used for the reply is a photograph in which the important persons appear larger than the persons in the background. It is also presumed that in such cases the hand does not move so as to hide the important persons and instead usually makes a gesture directed at a person, such as stroking, pointing, waving, or enfolding. (A person may deliberately be hidden to make a joke photograph, but one is unlikely to hide the very person being replied to when thanking family, friends, or acquaintances for a photograph.) Accordingly, superimposing the second image on the first image under the above rules places a hand whose size is proportional to the face of the person targeted by the hand movement at a position that avoids the main subjects of the photograph (the group of persons whose faces appear large).

As shown in FIG. 17(a), the position may be shifted considerably, as in the "clapping" and "waving both hands" cases with a large face. However, as noted in connection with the above rules, gestures that hide the face of an important person are not favored in a reply situation, so such cases are expected to occur only rarely. Moreover, shifting the position in this way is also desirable as a safeguard for cases where the user inadvertently makes a gesture that largely hides the face.
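
The shift described here can be pictured as a search for the nearest hand position that leaves every protected face sufficiently visible. The sketch below is an illustration under assumptions, since the description does not specify how the shifted position is chosen. Rectangles are `(x, y, w, h)` tuples, and `max_hidden` (the fraction of a face that may be covered) and `step` (the search grid spacing) are hypothetical parameters.

```python
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, w, h) in pixels


def overlap_area(a: Rect, b: Rect) -> int:
    """Area of the intersection of two rectangles (0 if they are disjoint)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    w = min(ax + aw, bx + bw) - max(ax, bx)
    h = min(ay + ah, by + bh) - max(ay, by)
    return max(w, 0) * max(h, 0)


def acceptable(hand: Rect, faces: List[Rect], max_hidden: float) -> bool:
    """True if the hand covers at most max_hidden of every protected face."""
    return all(overlap_area(hand, f) <= max_hidden * f[2] * f[3] for f in faces)


def shift_hand(hand: Rect, faces: List[Rect], image_size: Tuple[int, int],
               max_hidden: float = 0.1, step: int = 10) -> Optional[Rect]:
    """Return the acceptable in-bounds position closest to the original one.

    The original position is kept if it is already acceptable. Otherwise
    positions on a coarse grid are tried, and None is returned when no
    grid position satisfies the constraint.
    """
    if acceptable(hand, faces, max_hidden):
        return hand
    x0, y0, w, h = hand
    img_w, img_h = image_size
    candidates = [
        (x, y, w, h)
        for x in range(0, img_w - w + 1, step)
        for y in range(0, img_h - h + 1, step)
        if acceptable((x, y, w, h), faces, max_hidden)
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda c: (c[0] - x0) ** 2 + (c[1] - y0) ** 2)


# A hand drawn almost entirely over a face is pushed sideways.
face = (150, 100, 100, 100)
moved = shift_hand((170, 110, 80, 80), [face], image_size=(400, 300))
```

A grid search is used only for brevity. Any strategy that keeps the overlap with each protected face below the chosen limit would serve the same purpose.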

DESCRIPTION OF SYMBOLS: 1 ... reply image generation system, 100 ... first terminal, 110 ... control unit, 110a ... image transmission unit, 110b ... display control unit, 110c ... image reception unit, 111 ... CPU, 112 ... RAM, 120 ... internal storage unit, 121 ... flash memory, 130 ... operation unit, 140 ... display unit, 150 ... communication unit, 160 ... I/F unit, 190 ... storage medium, 200 ... second terminal, 210 ... control unit, 210a ... image transmission unit, 210b ... image generation unit, 210c ... image reception unit, 210d ... display control unit, 210e ... photographing control unit, 211 ... CPU, 212 ... RAM, 220 ... internal storage unit, 221 ... flash memory, 230 ... operation unit, 240 ... display unit, 250 ... communication unit, 260 ... I/F unit, 270 ... photographing unit, 290 ... storage medium, 300 ... server, 310 ... control unit, 310a ... image transmission unit, 310b ... image composition unit, 310c ... image reception unit, 311 ... CPU, 312 ... RAM, 320 ... internal storage unit, 321 ... hard disk, 330 ... operation unit, 340 ... display unit, 350 ... communication unit, 900 ... network

Claims

An image processing apparatus comprising:
first image acquisition means for acquiring a first image; and
image composition means for acquiring a second image and combining the acquired second image with the first image acquired by the first image acquisition means,
wherein the image composition means detects a first region satisfying a predetermined criterion in the first image acquired by the first image acquisition means, superimposes the second image on a region near the detected first region, and generates a third image in which the first image and the second image are combined.

The image processing apparatus according to claim 1,
wherein the image composition means adjusts the size of the second image according to the size of the first region and superimposes the second image on a region near the first region.

The image processing apparatus according to claim 1,
wherein the first image is an image showing a person's face,
the second image is an image representing a person's hand, and
the first region is a region including the person's face.

The image processing apparatus according to claim 1,
wherein the second image includes a plurality of still images, and
the image composition means generates one or more third images in which the plurality of still images included in the second image are superimposed on the first image.

The image processing apparatus according to claim 1,
wherein the image composition means superimposes the second image on the first image so that the second image does not overlap the first region by a predetermined area or more.

The image processing apparatus according to claim 1,
wherein the first image is a transmitted image sent from a first terminal to a second terminal via a network, and
the third image is a reply image, responding to the transmitted image, that is displayed by the first terminal.

A reply image generation system that generates, based on a first image transmitted from a first terminal to a second terminal, a reply image to the first image, the system comprising:
image generation means for acquiring a captured image of the area in front of the display surface of the second terminal displaying the first image, extracting a predetermined image from the acquired captured image, and generating a second image representing the extracted predetermined image; and
image composition means for combining the first image with the second image generated by the image generation means,
wherein the image composition means detects a first region satisfying a predetermined criterion in the first image, superimposes the second image on a region near the detected first region, and generates a third image in which the first image and the second image are combined.

A program causing a computer to function as:
first image acquisition means for acquiring a first image; and
image composition means for acquiring a second image and combining the acquired second image with the first image acquired by the first image acquisition means,
wherein the image composition means detects a first region satisfying a predetermined criterion in the first image acquired by the first image acquisition means, superimposes the second image on a region near the detected first region, and generates a third image in which the first image and the second image are combined.