JP2009251679A - Imaging apparatus and image recognition method

Info

Publication number: JP2009251679A
Application number: JP2008095421A
Authority: JP
Inventor: Norifumi Shibayama (柴山憲文)
Original assignee: Sony Ericsson Mobile Communications Japan Inc
Current assignee: Sony Corp
Priority date: 2008-04-01
Filing date: 2008-04-01
Publication date: 2009-10-29

Abstract

PROBLEM TO BE SOLVED: To efficiently and quickly recognize an object image that newly appears in an image while tracking an object image already detected in a sequence of continuous images.

SOLUTION: The imaging apparatus has a function that, when searching the image for an object image other than the one already detected in the continuous images, searches only a partial region of the image. This reduces the per-image search workload when searching for object images other than the detected one, so that an object newly appearing in the image can be recognized efficiently and quickly.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to an imaging apparatus and an image recognition method, and more particularly to an imaging apparatus and an image recognition method capable of recognizing a predetermined object image in continuous images.

Conventionally, many imaging apparatuses such as digital cameras have an autofocus function for focusing on a subject. More recently, cameras have also been developed with a face-detection autofocus function that extracts facial features (face shape, face color, and features such as the eyes and nose) from the screen and automatically focuses on that face. With such a camera, when the face-detection autofocus function is turned on, simply pointing the camera at a person brings the face into focus automatically. At this time, for example, a rectangular frame is displayed around the face shown on the camera's liquid crystal screen, so the user can confirm that the face has been detected.

In such apparatuses, one challenge is how to speed up the recognition and tracking of a face image (hereinafter also simply called a face) in continuous images with as little computation as possible. Various techniques for speeding up face recognition and tracking in continuous images have therefore been proposed (see, for example, Patent Document 1).

Patent Document 1 proposes a technique that, when tracking an already recognized face, refers to the face detection result in past images (specifically, the face position information) and restricts the face search range in the next image. The technique of Patent Document 1 speeds up recognition by reducing the amount of recognition processing during face tracking.

Patent Document 1: JP 2004-318331 A

With the technique described in Patent Document 1, a face already detected in the image can be tracked at high speed, but the processing for recognizing another face that newly appears in the image is not sufficiently considered. In that situation, detecting the newly appearing face normally requires searching the entire image, which is inefficient.

The present invention has been made to solve the above problem. An object of the present invention is to provide an imaging apparatus and an image recognition method that can efficiently and quickly recognize an object image newly appearing in an image while tracking an already detected object image such as a face image.

To solve the above problem, the imaging apparatus of the present invention includes an imaging unit that has an imaging element and outputs an image signal, and an image processing unit that searches the image signal output by the imaging unit for an object image matching a specific condition, detects it, and performs predetermined image processing based on the detection result. The imaging apparatus also includes a control unit that, when the image processing unit searches the image for a second object image while searching for and tracking a first object image, performs control so that the search for the second object image is restricted to a partial region of the image obtained from the image signal.

Also to solve the above problem, the image recognition method of the present invention includes the following procedure. First, a first object image is searched for in an image signal obtained by imaging (first step). Next, the first object image found in the first step is tracked (second step). Then, while the first object image is being tracked in the second step, a second object image is searched for with the search restricted to a partial region of the image.

In the present invention, when an object image (second object image) other than the already detected object image (first object image) is searched for in the image, the search is performed in a partial region of the image. This reduces the per-image processing amount when detecting an object image different from the detected one, so that the other object image can be detected more quickly.

As described above, in the imaging apparatus and image recognition method of the present invention, only a partial region of the image is searched when searching for an object image different from the one already detected, so the per-image processing amount for detecting that other object image can be reduced. Therefore, according to the present invention, an object image newly appearing in the image can be recognized efficiently and quickly while the detected object image is tracked.

An example of an embodiment of the imaging apparatus of the present invention is described below with reference to FIGS. 1 to 4. In this embodiment, a mobile communication terminal having a function of recognizing faces in continuous images is described as an example of the imaging apparatus. The mobile communication terminal here is a terminal device, commonly called a mobile phone terminal, that communicates wirelessly with a radiotelephone base station. However, the present invention is not limited to this embodiment.

[Device configuration]
First, the configuration of the mobile communication terminal according to this embodiment is described. FIG. 1 is a block diagram of the mobile communication terminal. As shown in FIG. 1, the mobile communication terminal 1 includes an antenna 11, a control unit 12, a communication circuit 13 connected to the antenna 11, a display unit 14, an operation unit 15, a storage unit 16, and a clock unit 17. The mobile communication terminal 1 also includes a speaker 18, a microphone 19, an audio processing unit 20, a wireless LAN (Local Area Network) communication circuit 21, and a wireless LAN antenna 22.

The mobile communication terminal 1 further includes a camera 24. The camera 24 has an imaging element, a lens mechanism that directs image light onto the imaging element, and an imaging signal processing unit that processes and outputs the image signal captured by the imaging element.

The camera 24 captures images continuously, for example at a fixed frame period. Recording the image signals obtained continuously at this frame period as they are yields a so-called video camera that records a moving image signal. Recording the image signal of only one frame yields a so-called still camera that records a still image signal. The lens mechanism of the camera 24 includes, for example, an autofocus mechanism that can automatically focus on a target subject. The target subject can be, for example, a face recognized in the image from the image signal, as described later.

As shown in FIG. 1, the mobile communication terminal 1 also includes a control line 25 and a data line 26. The control line 25 is a signal line that carries signals for controlling the units connected to it. As shown in FIG. 1, several units in the mobile communication terminal 1 are connected to the control unit 12 through the control line 25, and each unit performs its processing under the control of the control unit 12. The data line 26 is a signal line for transferring data between the units connected to it. Although not shown in FIG. 1, the mobile communication terminal 1 includes a power supply unit that supplies power to each unit.

The control unit 12 consists of an arithmetic control device such as a CPU (Central Processing Unit) and controls each unit of the mobile communication terminal 1. The control unit 12 also functions as the image processing unit that searches for and recognizes faces in an image, and as the control unit that controls that image search and recognition processing.

Under the control of the control unit 12, the communication circuit 13 transmits transmission signals to and receives reception signals from a mobile phone base station (not shown) via the antenna 11. The communication circuit 13 also modulates and demodulates the radio waves exchanged with the mobile phone base station.

The display unit 14 consists of a liquid crystal display (LCD) or the like. The operation unit 15 consists of a jog dial, a keypad, and the like. The operation unit 15 is used to enter input operation signals, for example to enter telephone numbers or mail text and to set various modes.

The storage unit 16 consists of nonvolatile memory such as flash memory (semiconductor memory). It stores various data and computer programs, such as a phone book, schedules, mail messages, moving images, still images, music, application software, bookmarks, and web pages.

The clock unit 17 keeps the time and times the occurrence of specified events; the OS (Operating System) that manages the computer acquires date and time information from the clock unit 17. The audio processing unit 20 is a unit that mainly performs analog-to-digital conversion of audio data during voice calls.

The wireless LAN communication circuit 21 performs predetermined modulation and demodulation under the control of the control unit 12, and transmits and receives wireless signals to and from an external access point device (not shown) via the wireless LAN antenna 22.

[Image recognition processing]
Next, the processing by which the mobile communication terminal 1 of this embodiment recognizes faces in continuous images is described with reference to FIGS. 2 to 4. FIG. 2 is a flowchart showing the procedure of the image recognition processing of this embodiment. FIG. 3 is a flowchart showing the processing in step S7 of FIG. 2 in more detail. FIGS. 4(a) to 4(d) illustrate the face search and tracking in steps S3, S5, and S7 of FIG. 2. The face recognition procedure below is described starting from camera startup, that is, from a state in which no image has been captured immediately before.

First, the camera 24 is started (step S1 in FIG. 2). Next, it is determined whether one or more faces were detected in the immediately preceding image (hereinafter also called a frame) (step S2 in FIG. 2). As noted above, no preceding image exists immediately after the camera 24 starts, so at this stage the determination in step S2 is "No".

Next, when an image is captured, the entire area of the frame is searched for a face (step S3 in FIG. 2). FIG. 4(a) shows the face search within the frame in step S3. The hatched portion in FIG. 4(a) is the face search area. If, for example, the face of a person 51 (hereinafter also called the first face) is present in the frame 30 as shown in FIG. 4(a), the first face is detected in step S3.

If the first face is detected in step S3, its position within the frame 30 is detected and the position information of the first face is output (step S4 in FIG. 2). In step S4, the position information is output by, for example, displaying a rectangular frame around the detected face on the screen of the display unit 14. The process then returns to step S2. When no face is recognized in step S3, the process also returns to step S2 via step S4, but in that case no face position information is output in step S4.

Then, in step S2, it is again determined whether one or more faces were detected in the immediately preceding frame. If the determination in step S2 is No (no face was detected in the preceding image), steps S3, S4, and S2 are repeated in this order.

On the other hand, when a face was detected in the immediately preceding frame, as in FIG. 4(a), the determination in step S2 is Yes. In this case, based on the face position information output in step S4, the next frame is searched for the face within a predetermined area that includes the face position detected in the preceding frame (step S5 in FIG. 2).

In step S5, the first face is first searched for at the position where it was detected in the immediately preceding frame. If it is not detected there, the search range is widened and the first face is searched for again. If the first face is still not detected in the widened range, the range is widened further (for example, doubled) and the search is repeated. Step S5 repeats this process until the first face is detected. Ultimately, step S5 may end up searching the entire area of the frame 30.
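The widening search in step S5 can be sketched as a generator of search windows. This is only an illustrative sketch: the function name `expanding_windows`, the rectangle convention, and the choice to double the window's side lengths about the previous face centre are assumptions, since the text says only that the range is widened (for example, doubled).

```python
from typing import Iterator, Tuple

# (left, top, right, bottom) in pixels; right and bottom are exclusive.
Rect = Tuple[int, int, int, int]


def expanding_windows(last: Rect, frame_w: int, frame_h: int,
                      scale: float = 2.0) -> Iterator[Rect]:
    """Yield search windows centred on the previous face box.

    The first window is the previous box itself; each later window is
    `scale` times larger per side (an assumed reading of "doubled"),
    clipped to the frame. The last window yielded is the whole frame.
    """
    left, top, right, bottom = last
    w, h = float(right - left), float(bottom - top)
    if w <= 0 or h <= 0 or scale <= 1.0:
        raise ValueError("need a non-empty box and scale > 1")
    cx, cy = (left + right) / 2, (top + bottom) / 2
    while True:
        window = (max(0, int(cx - w / 2)), max(0, int(cy - h / 2)),
                  min(frame_w, int(cx + w / 2)), min(frame_h, int(cy + h / 2)))
        yield window
        if window == (0, 0, frame_w, frame_h):
            return  # the entire frame has been offered; nothing wider exists
        w, h = w * scale, h * scale


# Hypothetical 40x40 face box in a 640x480 frame.
for win in expanding_windows((300, 200, 340, 240), 640, 480):
    print(win)
```

A tracker built on this would run its face detector on each window in turn and stop at the first hit.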

Step S5 also measures the time taken by the search for the first face (hereinafter also called the search time). The search time may be measured with a timer built into the mobile communication terminal 1 (included in the clock unit 17), or estimated from the number of times the widen-and-search process was repeated in step S5.

As described above, step S5 in effect tracks the first face detected in the immediately preceding frame.

Next, the search time measured in step S5 is compared with a predetermined set time (for example, 100 msec) (step S6 in FIG. 2). The set time can be changed as appropriate according to the number of pixels per image (frame), the frame rate of the continuous images (frames per second), the processing capability of the apparatus, and so on.

When the search time in step S5 is equal to or longer than the set time, the determination in step S6 is Yes. In that case, as shown in FIG. 2, the process goes through step S4 and on to step S2. If the first face is not detected even after the entire frame has been searched in step S5, the determination in step S6 is Yes and no face position information is output in the following step S4.

On the other hand, when the determination in step S6 is No (the first face was detected in a relatively short time in step S5), the process moves to step S7, where the frame is divided into a plurality of areas and a face is searched for in one of the divided areas (a partial area of the image).
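The timing check in steps S5 and S6 could look like the following sketch. The helper names `timed_search` and `after_tracking`, and the use of `time.perf_counter` in place of the terminal's timer, are assumptions for illustration; the 100 ms default is the example value given in the text.

```python
import time
from typing import Callable, Optional, Tuple

Rect = Tuple[int, int, int, int]


def timed_search(track: Callable[[], Optional[Rect]]) -> Tuple[Optional[Rect], float]:
    """Run the step-S5 tracking search and return (face box or None, elapsed ms)."""
    start = time.perf_counter()
    face = track()
    return face, (time.perf_counter() - start) * 1000.0


def after_tracking(face: Optional[Rect], search_ms: float,
                   limit_ms: float = 100.0) -> str:
    """Step S6: choose the next step from the measured search time.

    A search that reached the set time, or that never found the face
    (modelled here as None), goes straight to S4; otherwise there is
    time left to look for new faces in S7.
    """
    if face is None or search_ms >= limit_ms:
        return "S4"
    return "S7"


# Hypothetical tracker that finds the face immediately.
face, elapsed = timed_search(lambda: (300, 200, 340, 240))
print(after_tracking(face, elapsed))
```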

The processing in step S7 is now described in more detail with reference to FIG. 3 and FIGS. 4(b) to 4(d). In this embodiment, as shown in FIGS. 4(b) to 4(d), the frame 30 is divided equally into left and right halves. The two areas, a left search area 32 and a right search area 33, are switched alternately from frame to frame while faces other than the first face that appear in the frame are searched for.

The example of FIGS. 4(b) to 4(d) describes the case in which the left search area 32 is searched first, and in which the face of a new person 52 appears in the left search area 32 at the timing of FIG. 4(c) (the frame of FIG. 4(c)).

First, when the first face is detected in a relatively short time in step S5 (No in step S6), the frame 30 is divided into left and right halves (step S7A in FIG. 3). Next, the left search area 32 is searched (step S7B in FIG. 3; the state of FIG. 4(b)). In the frame 30 of FIG. 4(b), no new face image is present in the left search area 32, so no new face is detected in this frame. The area 31 in FIG. 4(b) is the area searched when detecting the first face in step S5, and tracking of the first face is also performed in this area 31.

In this example, step S7 is assumed to start in the same frame in which the first face was detected in step S5. Depending on the set time used in step S6, there may not be enough time left after the first face is detected in step S5 to search for another face within the same frame. In that case, step S7 starts from the next frame.

Next, it is determined whether a new face was detected in step S7B (step S7C in FIG. 3). Since no new face is detected in the frame 30 of FIG. 4(b), the determination in step S7C is No and the process moves to step S7D. If a new face is detected in step S7B, the determination in step S7C is Yes and the process proceeds to step S4.

In step S7D, the frame changes and the search area is switched as well. In the frame 30 of FIG. 4(c), the search area is switched to the right search area 33, which is then searched for a face (the state of FIG. 4(c)).

In the frame 30 of FIG. 4(c), the face of the new person 52 appears in the left search area 32, but no new face is present in the right search area 33 being searched. Consequently, no new face is detected at the timing of FIG. 4(c) either, and the determination in the next step S7C is again No.

The process then moves to step S7D again; the frame changes and the search area is switched back to the left search area 32 (the state of FIG. 4(d)). In the frame 30 of FIG. 4(d), the face of the person 52 (hereinafter also called the second face) is present in the left search area 32 being searched, so the second face is newly detected at this timing. In this way, the face recognition processing of this embodiment can detect two faces in different frames (at different timings). The search time in step S7 is set in advance to a predetermined time, and steps S7A to S7D are performed within that time. This search time can be changed as appropriate according to the number of pixels per image (frame), the frame rate of the continuous images (frames per second), the processing capability of the apparatus, and so on.

Although the example of FIGS. 4(b) to 4(d) starts the search from the left search area 32, the search in step S7 may instead start from the right search area 33. In that case, the second face is detected at the timing of FIG. 4(c).
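The alternation between search areas in steps S7B to S7D, and the effect of the starting area on when the second face is found, can be illustrated with a toy simulation. Everything here is a simplified assumption: faces are reduced to the x coordinate of their centre, "detection" is just a range check, the frame is 640 pixels wide, and `first_new_face` is a hypothetical name. Only untracked faces are listed per frame, because the first face is handled separately by the tracking in step S5.

```python
from itertools import cycle
from typing import List, Optional, Sequence, Tuple

Span = Tuple[int, int]  # horizontal extent [x0, x1) of a search area


def first_new_face(frames: Sequence[Sequence[int]], areas: List[Span],
                   start: int = 0) -> Optional[Tuple[int, int]]:
    """Search one area per frame, rotating through `areas` from index `start`.

    Returns (frame index, face x) for the first new face found, or None.
    """
    order = areas[start:] + areas[:start]
    for index, (faces, (x0, x1)) in enumerate(zip(frames, cycle(order))):
        for x in faces:
            if x0 <= x < x1:  # step S7C: a new face lies in the searched area
                return index, x
        # step S7D: next frame, next area (handled by zip + cycle)
    return None


LEFT, RIGHT = (0, 320), (320, 640)
# Frames for FIG. 4(b), 4(c), 4(d): a new face at x=100 appears from 4(c) on.
frames = [[], [100], [100]]
print(first_new_face(frames, [LEFT, RIGHT], start=0))  # (2, 100): found in FIG. 4(d)
print(first_new_face(frames, [LEFT, RIGHT], start=1))  # (1, 100): found in FIG. 4(c)
```

With two areas, a new face waits at most one extra frame before its area comes up again, which is the trade-off made for halving the per-frame search region.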

Returning to FIG. 2, the steps after step S7 are described. When the second face is newly detected in step S7 as described above, the position information of the second face is output in step S4 and the process returns to step S2. The series of processes in steps S2 to S7 is performed by the control unit 12 of the mobile communication terminal 1 shown in FIG. 1.

As described above, in this embodiment, when faces other than an already detected face are searched for while that face is being tracked within the frame, only a partial area of the image is searched. Reducing the per-frame processing time in this way also lessens any adverse effect on the user experience. Because the per-frame processing time can be reduced, this embodiment is well suited to image recognition in apparatuses with a large number of captured pixels, where the processing amount per image is large, and in high-frame-rate apparatuses, where the search time per image must be short. It is also well suited to apparatuses with relatively low processing capability, which need more efficient recognition processing.

[Modification]
The embodiment above described detecting two faces that appear in different frames (at different timings) of the continuous images, but three faces appearing at different timings can be detected by the same recognition processing. An example is described with reference to FIGS. 2, 3, and 5. FIGS. 5(a) to 5(d) illustrate the face search and tracking in steps S3, S5, and S7 of FIG. 2 when three faces are detected.

In this example, as in FIGS. 4(a) to 4(d), the frame 40 is divided equally into left and right halves when searching for a face newly appearing in the frame. The example of FIGS. 5(a) to 5(d) describes the case in which the face of a new person 53 appears in the left search area 43 at the timing of FIG. 5(c), that is, while the right search area 44 is being searched.

FIG. 5(a) shows a state in which two faces (those of persons 51 and 52) have already been detected in the frame 40. This state is reached either by detecting the two faces simultaneously in step S3 of FIG. 2, or by detecting the face of the person 51 (first face) and the face of the person 52 (second face) at different timings through steps S1 to S7 of FIG. 2 described above.

Next, the first face and the second face are searched for in a predetermined area 41 including the position of the first face and a predetermined area 42 including the position of the second face, respectively (step S5 in FIG. 2). If the first and second faces are detected within a search time shorter than the predetermined set time, the frame 40 is divided into left and right halves (step S7A in FIG. 3) and the face search starts from the left search area 43 (step S7B in FIG. 3; the state of FIG. 5(b)). In the frame 40 of FIG. 5(b), no new face is present in the left search area 43 being searched, so no new face is detected. The determination in the next step S7C is therefore No.

The process then moves to step S7D; the frame changes and the search area is switched to the right search area 44 (the state of FIG. 5(c)). In the frame 40 of FIG. 5(c), the face of the new person 53 appears in the left search area 43, but no new face is present in the right search area 44 being searched. Consequently, no new face is detected in the frame of FIG. 5(c) either, and the determination in the next step S7C is again No.

The process then moves to step S7D again; the frame changes once more and the search area is switched back to the left search area 43. In this frame, the face of the person 53 (hereinafter also called the third face) is present in the left search area 43 being searched, so the third face is newly detected in this frame (the state of FIG. 5(d)). When three faces appear at different timings in the continuous images, all three can be recognized in this way.

In the embodiment described above, when searching for a face that newly appears in the continuous images in addition to the already detected faces (step S7 in FIG. 2 and steps S7A to S7D in FIG. 3), the frame is divided equally into left and right search areas, but the present invention is not limited to this example. The left and right search areas need not be the same size, the frame may be divided into upper and lower search areas, or the frame may be divided into three or more search areas.

In the above embodiment, an example was also described in which, when detecting a face that newly appears in the continuous images in addition to the already detected faces, the area searched (for example, the right search area 33 in FIG. 4(c)) includes the search (tracking) region of an already detected face (for example, the face of the person 51 in FIG. 4). However, the present invention is not limited to this: the region other than the search regions of the detected faces may be divided as appropriate, and a newly appearing face may be searched for in one of the resulting regions. The way the search area is divided can be changed as appropriate according to the number of pixels per frame, the frame rate, the processing capability of the image processing apparatus, and the like.
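The division forms mentioned above (left/right, upper/lower, three or more areas) can be sketched as a single grid-splitting helper. This is a hedged illustration only: the function name `split_frame`, its parameters, and the (x, y, w, h) tuple layout are assumptions, and unequal-size areas or areas that exclude the tracking regions are not covered.

```python
def split_frame(width, height, cols, rows):
    """Return (x, y, w, h) tuples that tile the frame in a cols x rows grid.

    Boundaries are computed with integer arithmetic so the areas always
    cover the whole frame, even when the size is not evenly divisible.
    """
    xs = [width * c // cols for c in range(cols + 1)]
    ys = [height * r // rows for r in range(rows + 1)]
    return [
        (xs[c], ys[r], xs[c + 1] - xs[c], ys[r + 1] - ys[r])
        for r in range(rows)
        for c in range(cols)
    ]


left_right = split_frame(640, 480, cols=2, rows=1)   # two areas, side by side
top_bottom = split_frame(640, 480, cols=1, rows=2)   # two areas, stacked
quadrants = split_frame(640, 480, cols=2, rows=2)    # four areas
thirds = split_frame(640, 480, cols=3, rows=1)       # three vertical strips
```

Because the strips in `thirds` cannot all be the same width for a 640-pixel frame, the last strip is one pixel wider than the others; the areas still sum to the full frame.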

Table 1 below shows a concrete relationship between the search area division form and the search time per frame in step S7 when the image recognition method of the present invention is applied to images of 640 × 480 = 307,200 pixels each. In the division forms listed in Table 1, "upper/lower halves" means that the frame is divided into two equal parts in the vertical direction, and "left/right halves" means that the frame is divided into two equal parts in the horizontal direction (the division form of FIGS. 4(b) to 4(d)). "Quarters (upper/lower and left/right)" in Table 1 means that the frame is divided into two equal parts in the vertical direction and further into two equal parts in the horizontal direction.

As shown in Table 1, the search time per frame in step S7 in FIG. 2 becomes shorter in proportion to the increase in the number of divisions. When the image size changes, the search time per frame in step S7 changes almost in proportion to the number of pixels in the frame. The search times shown in Table 1 are only an example and also vary with the processing capability of the imaging apparatus and other factors.
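The scaling described above can be made concrete with a small calculation. The sketch below assumes a purely linear cost model (search cost proportional to the number of pixels searched) and reports costs relative to searching one full 640 × 480 frame; it does not reproduce the measured times of Table 1, and the function name `relative_search_cost` is hypothetical.

```python
REFERENCE_PIXELS = 640 * 480  # 307,200 pixels, the frame size used for Table 1


def relative_search_cost(width, height, divisions):
    """Per-frame search cost relative to one undivided 640 x 480 frame.

    Assumption: cost is proportional to the pixels in the searched area,
    and the frame is split into `divisions` equal areas.
    """
    pixels_searched = (width * height) / divisions
    return pixels_searched / REFERENCE_PIXELS


for label, n in [("undivided", 1), ("halves", 2), ("quarters", 4)]:
    pixels = REFERENCE_PIXELS // n
    cost = relative_search_cost(640, 480, n)
    print(f"{label}: {pixels} pixels per area, relative cost {cost}")
```

Under this model, halving the frame halves the per-frame search cost, and a 320 × 240 frame searched whole costs the same as one quarter of a 640 × 480 frame.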

In the above embodiment, a mobile communication terminal was described as an example of the imaging apparatus, but the present invention is not limited to this. The present invention can be applied to any imaging apparatus capable of capturing images continuously. The image recognition method of the present invention can also be applied to any apparatus (for example, a personal computer) that has no function for capturing continuous images but performs processing to detect (extract) a predetermined object image from continuous images.

In the above embodiment, a face image was used as an example of the object image detected in an image, but the present invention is not limited to this and can be applied to an image of any object (for example, an automobile) that can be extracted from an image by some image pattern.

FIG. 1 is a block diagram of a mobile communication terminal according to an embodiment of the present invention.
FIG. 2 is a flowchart of image recognition processing according to an embodiment of the present invention.
FIG. 3 is a flowchart showing the processing procedure in step S7 in FIG. 2.
FIGS. 4(a) to 4(d) are views showing the face search and tracking processing in steps S3, S5, and S7 in FIG. 2.
FIGS. 5(a) to 5(d) are views showing the face search and tracking processing in steps S3, S5, and S7 in FIG. 2 in a modification.

Explanation of symbols

1 ... mobile communication terminal, 12 ... control unit, 17 ... clock unit, 24 ... camera, 30 ... frame, 31 ... predetermined area, 32 ... right search area, 33 ... left search area

Claims

An imaging device comprising:
an imaging unit that has an imaging element and outputs an image signal;
an image processing unit that searches for and detects an object image matching a specific condition in the image signal output by the imaging unit, and performs predetermined image processing based on the detection result; and
a control unit that, when the image processing unit searches for a second object image in the image while searching for and tracking a first object image as the object image, performs control so that the search for the second object image is limited to a partial region of the image obtained from the image signal.

The imaging device according to claim 1, wherein the image processing unit comprises: a first processing unit that, when searching for the first object image in a given image, searches for the first object image in a predetermined area including the position of the first object image detected in the immediately preceding image; a measurement unit that measures the search time of the first processing unit; and a second processing unit that searches for the second object image in a partial region of the image based on the measurement result of the search time.

The imaging device according to claim 2, wherein the second processing unit includes a switching unit that switches the partial area searched for each image to a different area.

The imaging device according to claim 1, wherein the object image searched by the image processing unit is a human face image.

An image recognition method comprising:
a first step of searching for a first object image in an image signal obtained by imaging;
a second step of tracking the first object image found in the first step; and
a third step of searching for a second object image, with the search limited to a partial area of the image, while the first object image is being tracked in the second step.