JP2010186274A

JP2010186274A - Sunglasses wearing detection apparatus

Info

Publication number: JP2010186274A
Application number: JP2009029171A
Authority: JP
Inventors: Hideyuki Aoki; 秀行青木
Original assignee: Secom Co Ltd
Current assignee: Secom Co Ltd
Priority date: 2009-02-10
Filing date: 2009-02-10
Publication date: 2010-08-26
Anticipated expiration: 2029-02-10
Also published as: JP5271742B2

Abstract

PROBLEM TO BE SOLVED: To provide an apparatus that determines whether a person captured in an input image, for example by a camera installed beside an ATM, is wearing sunglasses, and thereby detects a person who suspiciously keeps the sunglasses on. SOLUTION: Focusing on the shape characteristic of sunglasses, the apparatus determines whether sunglasses are worn from the distribution of black pixels that satisfy a predetermined condition within the input image. The determination uses an index indicating that the black pixels are distributed symmetrically on both sides of the face, and an index indicating that the lower portion of a lens and the bridge portion are vertically separated within the image. COPYRIGHT: (C)2010,JPO&INPIT

Description

The present invention relates to an apparatus that uses image processing to determine whether a person appearing in sequentially captured images is wearing sunglasses. In particular, it relates to an apparatus that, when installed alongside an ATM (automated teller machine), can detect a person who operates the ATM while wearing sunglasses as a suspicious person.

In recent years, so-called "transfer fraud" has increased in frequency and become a social problem. In a typical case, an elderly person is deceived into believing, for example, that a family member has caused a traffic accident and that settlement money must be paid. The victim transfers the money to an account designated by the offender, and a member of the criminal group then withdraws the cash from that account.
In this type of fraud, the face of the group member responsible for withdrawing the cash is often captured by a security camera or by a camera built into the ATM, which provides one clue leading to the offender's arrest.

However, it becomes known only after the withdrawal that the person withdrawing cash from the ATM is a member of the criminal group, and such a person generally wears a disguise such as sunglasses so that his or her facial features cannot be recognized. It is therefore desirable, when a person standing in front of an ATM is wearing sunglasses that hide the facial features, to ask the person to remove them or, depending on the case, to notify security personnel.

Patent Document 1 discloses a technique that acquires a face image of a person standing in front of an ATM and compares it with a standard pattern of the area around the eyes stored in advance. If the similarity does not reach a predetermined level, the person is regarded as wearing sunglasses and is determined to be suspicious if he or she attempts to operate the ATM in that state.

JP 09-091432 A (Japanese Unexamined Patent Application Publication No. H09-091432)

Because the technique of Patent Document 1 is fundamentally based on the brightness distribution around the eyes, the area around the eyes may appear dark depending on the lighting conditions near the ATM, even when a comparison with the standard pattern is made. In that case the image is judged dissimilar to the standard pattern, and an unwarranted suspicious-person determination may result.

Frequent unwarranted suspicious-person determinations cause users considerable discomfort and seriously undermine confidence in the financial institution that installed the determination apparatus. A determination method that is unaffected by the brightness distribution arising from lighting conditions or hairstyle is therefore needed.

Accordingly, an object of the present invention is to provide a sunglasses wearing detection device that uses image processing to detect, on the basis of the shape characteristic of sunglasses, whether a person in sequentially captured images is wearing sunglasses. When the device according to the present invention is used to image a person standing in front of an ATM, a suspicious person can be detected.

To solve the above problem, a sunglasses wearing detection device according to the present invention comprises an image input unit that acquires an input image and an image processing unit that detects, from the input image, a person wearing sunglasses. The image processing unit comprises: head extraction means for extracting, from the input image, a head region corresponding to the head of a person; black pixel extraction means for extracting black pixels, that is, pixels of substantially black color, contained in the head region; feature amount calculation means for calculating a feature amount of the head region based on the distribution of the black pixels; and determination means for determining, using the feature amount, that the person is wearing sunglasses when the black pixels are distributed substantially symmetrically between left and right, and that the person is not wearing sunglasses when they are not.

Sunglasses have a left-right symmetric shape. The device therefore extracts a head region containing the person's face, examines how the black pixels within it that are presumed to depict sunglasses are distributed between the left and right sides of the face, and determines that sunglasses are worn when that distribution is judged to be symmetric. This configuration is less susceptible to hair shadows and lighting conditions, so the wearing of sunglasses can be determined with high accuracy.

In the sunglasses wearing detection device according to the present invention, it is preferable to create a vertical histogram of the frequencies obtained by projecting the black pixels onto the vertical axis, and to calculate the feature amount at the vertical position at which the vertical histogram shows its maximum frequency.

With this configuration, attention is focused on the row in which the black pixels presumed to depict sunglasses are concentrated, so the determination is less affected by black pixels arising from causes other than sunglasses, that is, by noise.
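The projection described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the name `peak_row` and the representation of the black pixels as a list of rows of 0/1 flags are assumptions made for the example.

```python
def peak_row(black_mask):
    """Return (row_index, histogram) for the row holding the most black pixels.

    black_mask is a list of rows; each row is a list of 0/1 flags,
    where 1 marks a pixel judged to be a black pixel (assumed layout).
    """
    # Projecting onto the vertical axis means counting black pixels per row.
    histogram = [sum(row) for row in black_mask]
    # If several rows tie, the uppermost one is returned.
    best = max(range(len(histogram)), key=histogram.__getitem__)
    return best, histogram
```

An isolated dark pixel contributes only one count to its own row, so it does not move the peak away from the row that crosses both lenses.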

In the sunglasses wearing detection device according to the present invention, it is preferable to calculate the feature amount using the midline of the person's head as the reference for judging left-right symmetry.

Sunglasses generally have a narrow bridge (the part of the frame that connects the left and right lenses) with a lens on either side. Judging left-right symmetry about the midline, where the bridge lies within the head region, therefore allows the wearing of sunglasses to be determined with high accuracy.

In the sunglasses wearing detection device according to the present invention, it is preferable that the feature amount calculation means calculates the number of vertical pixels between the vertical position and the lowest of the boundary pixels, where the boundary pixels are those that lie below the vertical position, are a predetermined number of pixels away from the midline, and form a boundary between the black pixels and pixels of other colors, and that the determination means determines that sunglasses are not worn if the number of vertical pixels is equal to or less than a predetermined threshold.

Sunglasses generally have no lens at the bridge, so the vertical distance between the black pixels on the midline and the black pixels corresponding to the lower edge of a lens is large. When that distance is small, the black pixels were extracted for reasons other than sunglasses, so high accuracy is achieved by not determining that sunglasses are worn in that case.
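One possible reading of this check is sketched below. It is an interpretation under stated assumptions, not the embodiment itself: the function names, the 0/1 mask layout, and the choice to inspect exactly the two columns lying `offset` pixels to the left and right of the midline are all hypothetical.

```python
def lens_depth(black_mask, peak_row, midline_x, offset):
    """Number of rows between peak_row and the lowest black/non-black boundary
    found at or below it in the columns `offset` pixels either side of the midline."""
    height, width = len(black_mask), len(black_mask[0])
    lowest = peak_row
    for x in (midline_x - offset, midline_x + offset):
        if not 0 <= x < width:
            continue  # column falls outside the image
        for y in range(peak_row, height):
            is_black = black_mask[y][x] == 1
            below_is_black = y + 1 < height and black_mask[y + 1][x] == 1
            if is_black and not below_is_black:
                # A black pixel with a non-black pixel (or the image edge) beneath it.
                lowest = max(lowest, y)
    return lowest - peak_row


def passes_lens_depth_check(black_mask, peak_row, midline_x, offset, min_depth):
    """False means 'not sunglasses' (depth at or below the threshold)."""
    return lens_depth(black_mask, peak_row, midline_x, offset) > min_depth
```

With this reading, a thin dark band such as a shadow across the eyes yields a depth of 0 and is rejected, whereas lenses that extend several rows below the bridge pass the check.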

According to the present invention, whether sunglasses are worn can be detected accurately on the basis of the shape characteristic of sunglasses, without being affected by the brightness distribution arising from lighting conditions or hairstyle.

FIG. 1 is a sketch of an ATM according to an embodiment of the present invention.
FIG. 2 is a block diagram of a suspicious person detection device according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of an image for explaining the calculation of the feature amounts used to detect sunglasses.
FIG. 4 is a diagram explaining the ellipse template used to extract a person's head.
FIG. 5 is a diagram explaining the method of calculating similarity using the ellipse template.
FIG. 6 is a diagram explaining the processing region that excludes the hair and ears, which are not needed for detecting sunglasses.
FIG. 7 is a flowchart showing the overall operation of the suspicious person detection device according to an embodiment of the present invention.
FIG. 8 is a flowchart showing the details of the sunglasses detection processing.

As a preferred embodiment of the present invention, an example will be described in which the sunglasses wearing detection device according to the present invention is applied to a suspicious person detection device installed alongside an ATM of a financial institution, and a person who operates the ATM while wearing sunglasses is detected as a suspicious person by image processing.

The suspicious person detection device according to the present embodiment is installed alongside an ATM, and its imaging device captures images of the person operating the ATM. When that person is determined to be wearing sunglasses, the device issues a predetermined warning or notifies security personnel.

FIG. 1 shows an example of an ATM equipped with a suspicious person detection device according to an embodiment of the present invention. As shown in FIG. 1, the ATM 40 is provided with a sensor 30 that detects the approach of a person (hereinafter, "human sensor") and an imaging device 20. The main body of the suspicious person detection device according to the present embodiment (not shown in FIG. 1) is installed in a so-called backyard, a place that only employees of the financial institution can enter, so that users cannot touch it.

When the human sensor 30 determines that a person has approached the ATM 40 to operate it, it outputs a signal to that effect to the suspicious person detection device. The human sensor 30 may use any detection mechanism as long as it can detect the presence or approach of a person. For example, a ranging sensor using ultrasonic waves or microwaves can be used. Because the user does not necessarily stand directly in front of the ATM 40, it is desirable to install a plurality of sensors side by side, as shown in FIG. 1.

The imaging device 20 is a camera. It images the person standing in front of the ATM 40 at predetermined time intervals and sends the captured images sequentially to the suspicious person detection device. Hereinafter, each unit of time marked off by this predetermined interval is referred to as a "time". As the imaging device 20, a known device can be used, such as a built-in or external camera comprising an image sensor such as a CCD or CMOS element, optical components, an A/D converter, and the like. The resolution of the imaging device 20 can be chosen according to the required angle of view and other factors. For example, the NTSC, SDTV, or HDTV standard can be used. As described later, color information is used to distinguish skin color from black, so the visible wavelength band is selected for imaging.

FIG. 2 is a block diagram showing the configuration of the suspicious person detection device 1, in which the sunglasses wearing detection device 10 according to an embodiment of the present invention is implemented. As shown in FIG. 2, the sunglasses wearing detection device 10 has an image input unit 110, an image processing unit 130, a storage unit 140, and a notification unit 150, and is connected to the imaging device 20 and the human sensor 30.

The image input unit 110 is an interface for receiving the image signals sent sequentially from the imaging device 20. It receives images from the imaging device 20 over a wired or wireless connection such as SCSI, USB, LAN, or a dedicated cable, and passes the received images to the other units of the sunglasses wearing detection device 10. In this embodiment, color information is used to judge whether each pixel of the image depicts human skin, so the imaging device 20 and the image input unit 110 must be capable of handling color images.

The storage unit 140 can store various programs and data and can be configured using, for example, semiconductor memory such as RAM, ROM, or EPROM, a magnetic recording medium such as a hard disk, or an optical recording medium such as CD-ROM or DVD-R/W. The storage unit 140 is connected to the image input unit 110, the image processing unit 130, and the notification unit 150, and reads and writes programs and data in response to requests from each unit. The storage unit 140 stores a reference image 142 and a suspicious person image 144.

The reference image 142 serves as the basis for comparison when a change area is extracted. For example, an image captured by the imaging device 20 while the human sensor 30 is OFF, that is, while no person is present in front of the ATM 40, is stored in the storage unit 140. As described later, the reference image is updated as appropriate by the updating means 137.

The suspicious person image 144 is an image of a person that is stored when, as described later, the image processing unit 130 determines that the person appearing in an image captured by the imaging device 20 is wearing sunglasses.

The image processing unit 130 processes the images received sequentially from the image input unit 110 and determines whether a person appearing in an image is wearing sunglasses. As shown in FIG. 2, the image processing unit 130 includes change area extraction means 131, head extraction means 132, black pixel extraction means 133, feature amount calculation means 135, and determination means 136.

The change area extraction means 131 takes the reference image 142, which is stored in advance in the storage unit 140 and shows no person in front of the ATM 40, and the image captured by the imaging device 20 and received via the image input unit 110 (hereinafter, the "input image"). It converts both to grayscale by a known method, calculates the difference between them, and extracts the group of pixels whose difference is equal to or greater than a fixed threshold as a change area. The threshold is determined experimentally. Because noise may be extracted owing to persons other than the one standing in front of the ATM 40 or to incoming light, dilation and erosion processing, which is well known in image processing, is applied as appropriate to remove noise.
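The differencing and noise removal described here can be sketched in plain Python as follows. This is a minimal sketch under assumptions: the images are already grayscale lists of rows of 0 to 255 integers, the function names are hypothetical, and the noise removal is taken to be a 3x3 opening (erosion followed by dilation), which is one common way to apply dilation and erosion but is not specified by the source.

```python
def change_mask(reference_gray, input_gray, threshold):
    """1 where the absolute grayscale difference is at or above the threshold."""
    return [
        [1 if abs(a - b) >= threshold else 0 for a, b in zip(ref_row, in_row)]
        for ref_row, in_row in zip(reference_gray, input_gray)
    ]


def _filter3x3(mask, combine):
    """Apply min (erosion) or max (dilation) over each 3x3 neighbourhood.
    Neighbours outside the image are simply ignored."""
    height, width = len(mask), len(mask[0])
    return [
        [
            combine(
                mask[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            )
            for x in range(width)
        ]
        for y in range(height)
    ]


def erode(mask):
    return _filter3x3(mask, min)


def dilate(mask):
    return _filter3x3(mask, max)


def remove_noise(mask):
    # Opening: isolated pixels vanish under erosion and are not restored by dilation.
    return dilate(erode(mask))
```

In practice a library routine would normally replace the hand-written loops. The sketch only shows the order of operations.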

The head extraction means 132 extracts, from the input image acquired by the imaging device 20, the head of the person standing in front of the ATM 40 within the change area extracted by the change area extraction means 131.

First, the head extraction means 132 converts the input image to grayscale by a known method, applies an edge extraction filter, and obtains the edge strength and the edge angle (the edge direction relative to the horizontal). It then creates an edge strength image and an edge angle image and stores them temporarily in the storage unit 140. The grayscale version of the input image is also stored temporarily in the storage unit 140 for the subsequent processing steps. These temporarily stored images are deleted in turn when a new input image is acquired at the next time. A known filter such as a Sobel filter may be used as the edge extraction filter.
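A Sobel-based version of this step might look like the sketch below. Several points are assumptions rather than statements from the source: the edge angle is taken to be the gradient direction in degrees in the range 0 to 360, the strength is clipped to 255 to match the maximum edge strength mentioned later in the text, and border pixels are left at zero.

```python
import math

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel(gray):
    """Return (strength, angle) images for a grayscale image given as rows of ints."""
    height, width = len(gray), len(gray[0])
    strength = [[0] * width for _ in range(height)]
    angle = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = sum(SOBEL_X[j][i] * gray[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * gray[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            # Clip the gradient magnitude to the 0..255 range (assumption).
            strength[y][x] = min(255, int(math.hypot(gx, gy)))
            # Gradient direction relative to the horizontal axis, in degrees.
            angle[y][x] = math.degrees(math.atan2(gy, gx)) % 360.0
    return strength, angle
```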

Next, the head extraction means 132 performs sliding matching of a plurality of ellipse templates, created in advance and stored in the storage unit 140, over the edge strength image and the edge angle image. It obtains a similarity at each pixel position using the edge strength and the edge angle, and extracts positions where the similarity is high as candidates for the head of the person standing in front of the ATM 40.

The ellipse template used to extract head candidates will be described with reference to FIG. 4. As indicated by the hatched portion labeled 300 in FIG. 4(a), the ellipse template is an ellipse-shaped partial image with a predetermined width. A plurality of ellipse templates 300 are prepared to accommodate the different head sizes of persons standing in front of the ATM 40.

As shown in FIG. 4(b), each pixel 313 on the ellipse template 300 has an angle θ relative to the X axis (horizontal axis), defined by the direction vector 310 drawn radially from the center 312 of the ellipse, as illustrated by reference numerals 311a to 311d.

The similarity between the ellipse template 300 and the edge strength image is obtained as the sum of the edge strengths of the pixels that fall within the ellipse template 300 when the template is superimposed on the edge strength image. However, a pixel is excluded from the sum when the difference between the angle defined for it in the ellipse template 300 and its angle in the edge angle image is large. This will be described with reference to FIG. 5.

In FIG. 5, among the pixels of the edge strength image obtained by the head extraction means 132 whose strength is at or above a threshold sufficient to remove noise, those that overlap the ellipse template 300 (drawn with a dotted line for convenience) are shown schematically as line segments 322 and 324. For explanation, a pixel belonging to segment 322 is denoted 326, and a pixel belonging to segment 324 is denoted 328.

Let the edge direction of pixel 326, obtained from the edge angle image, be denoted 332, and let the direction vector from the center of the ellipse template to that pixel be denoted 330. The angle formed by 332 and 330 is δ1. If δ1 exceeds 180 degrees, the value obtained by subtracting it from 360 degrees is used instead. When δ1 is smaller than a predetermined edge angle threshold, the edge strength of pixel 326 is read from the edge strength image and added to the similarity.
Similarly, for pixel 328, the angle δ2 is obtained from the edge direction 334 and the corresponding direction vector from the center of the ellipse template. In this example δ2 is larger than the edge angle threshold, so the edge strength is not added to the similarity.
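The angle-gated sum described for pixels 326 and 328 can be sketched as follows. The data layout is an assumption made for illustration: the template is a list of `(dx, dy)` offsets from its center, the images are lists of rows, and `strength_min` and `angle_max` stand in for the noise threshold and the edge angle threshold, whose values the source leaves to the implementer.

```python
import math


def angle_difference(a_deg, b_deg):
    """Angle between two directions, folded into the range 0..180 degrees."""
    diff = abs(a_deg - b_deg) % 360.0
    return 360.0 - diff if diff > 180.0 else diff


def template_similarity(template_pixels, center, strength, angle,
                        strength_min, angle_max):
    """Sum the edge strength of template pixels whose edge angle agrees
    with the radial direction from the template center."""
    cx, cy = center
    total = 0
    for dx, dy in template_pixels:
        x, y = cx + dx, cy + dy
        if not (0 <= y < len(strength) and 0 <= x < len(strength[0])):
            continue  # template pixel lies outside the image
        if strength[y][x] < strength_min:
            continue  # too weak, treated as noise
        radial = math.degrees(math.atan2(dy, dx)) % 360.0
        if angle_difference(angle[y][x], radial) < angle_max:
            total += strength[y][x]
    return total
```

Sliding the template then amounts to evaluating `template_similarity` at each candidate center and keeping the positions with high values.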

As described above, by also referring to angle information for pixels that overlap the ellipse template 300 and have at least a certain strength, the influence of edge components unrelated to the elliptical shape can be removed. Since the head of the person standing in front of the ATM 40 is unlikely to lie outside the change area extracted by the change area extraction means 131, it is preferable to perform the sliding matching with the ellipse template 300 only while its center 312 lies within that change area.

If the matching with the ellipse template 300 yields only one suitable head candidate in the input image, that is, only one template position whose similarity is at or above a predetermined value, that position is determined to be the head of the person standing in front of the ATM 40. If a plurality of suitable head candidates are extracted, the selection process described next is performed to narrow them down to one.

For each head candidate, the head extraction means 132 calculates the following feature amounts: (i) upper degree of the center a1, (ii) edge coverage rate a2, (iii) skin color pixel ratio a3, (iv) lateral eccentricity of the center a4, and (v) change area boundary coincidence a5. It then extracts the person's head from the input image using a linear sum of these and (vi) the similarity a6 obtained with the ellipse template 300.

(i) The upper degree of the center a1 is a feature amount indicating how high the center point of the head candidate lies in the input image. A head candidate may be extracted erroneously on the person's torso, whereas the head is normally located in the upper part of the input image, and this feature exploits that fact. The value is 1.0 when the center point of the head candidate is at the top edge of the input image and 0 when it is at the bottom edge, varying linearly with the Y coordinate in between. It is determined from the Y coordinate of the candidate's center point.
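The linear mapping for a1 can be written in one line. The sketch assumes the usual image convention in which y = 0 is the top row and y grows downward. The function name is hypothetical.

```python
def upper_degree(center_y, image_height):
    """a1: 1.0 at the top row (y = 0), 0.0 at the bottom row (y = image_height - 1)."""
    return 1.0 - center_y / (image_height - 1)
```

For a 480-row image, a candidate centered at y = 0 scores 1.0 and one centered at y = 479 scores 0.0.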

(ii) The edge coverage rate a2 is the mean edge strength within the head candidate divided by 255, the maximum edge strength. This exploits the fact that edges of a certain strength are extracted from the parts of a face, whereas strong edges are not extracted from clothing and the like.

(iii) The skin color pixel ratio a3 is the ratio of the number of pixels judged to be skin-colored to the total number of pixels in the head candidate. Whether a pixel is skin-colored is judged by decomposing its color into H (hue), S (saturation), and V (value) and checking whether it falls within a predetermined range of the HSV color system in which the likelihood of skin color is sufficiently high. Pixels satisfying the skin color condition can be identified with known detection processing similar to that used in the face detection functions that have come into practical use in digital cameras in recent years, so details are omitted.
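Since the source omits the skin color condition, the following sketch only illustrates the shape of the computation. The HSV ranges used as defaults (`hue_max`, `sat_range`, `val_min`) are hypothetical placeholder values, not values from the patent, and would need to be tuned for the camera and lighting.

```python
import colorsys


def is_skin(r, g, b, hue_max=50.0, sat_range=(0.15, 0.70), val_min=0.35):
    """Rough skin test on an 8-bit RGB pixel using hypothetical HSV ranges."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0 <= hue_max and sat_range[0] <= s <= sat_range[1] and v >= val_min


def skin_ratio(pixels):
    """a3: fraction of (r, g, b) pixels in the head candidate judged skin-colored."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    return sum(is_skin(*p) for p in pixels) / len(pixels)
```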

(iv) The lateral eccentricity of the center a4 is 1.0 when the X coordinate of the head candidate's center is at the horizontal center of the input image and 0 when it is at the left or right edge, varying linearly with the X coordinate in between. This exploits the fact that ATM users usually operate the machine from directly in front of it, so the head is likely to be near the center of the input image.
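A short sketch of a4, assuming pixel columns are numbered from 0 to `image_width - 1` (the function name is hypothetical):

```python
def lateral_eccentricity(center_x, image_width):
    """a4: 1.0 at the horizontal center of the image, 0.0 at the left or right edge."""
    half = (image_width - 1) / 2.0
    return 1.0 - abs(center_x - half) / half
```

Despite its name, the value is largest for a candidate at the center of the image and falls off linearly toward either side.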

(v) The change area boundary coincidence a5 is the degree to which the ellipse template corresponding to the head candidate coincides with the curve connecting the uppermost points of the change area extracted by the change area extraction means 131. It is defined as the ratio of the number of pixels of the ellipse template 300 that coincide with the curve to the total number of pixels on the curve. This exploits the fact that the head of the person standing in front of the ATM 40 is normally at the top of the change area, so the uppermost curve tends to coincide with the upper side of the ellipse template.
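One way to compute this ratio is sketched below. It assumes the change area is a 0/1 mask, that the uppermost curve consists of the topmost changed pixel in each column, and that the template is given as absolute `(x, y)` coordinates. These representations and the function names are illustrative choices.

```python
def top_contour(mask):
    """Set of (x, y) positions of the topmost changed pixel in each column."""
    contour = set()
    for x in range(len(mask[0])):
        for y in range(len(mask)):
            if mask[y][x]:
                contour.add((x, y))
                break  # only the uppermost pixel of the column belongs to the curve
    return contour


def boundary_coincidence(mask, template_pixels):
    """a5: share of top-contour pixels that are also ellipse template pixels."""
    contour = top_contour(mask)
    if not contour:
        return 0.0
    return len(contour & set(template_pixels)) / len(contour)
```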

(vi) The method of calculating the similarity with the ellipse template 300 has already been described. It is further normalized to fall within the range 0 to 1. The normalized similarity a6 is obtained by dividing the calculated similarity by the product of the number of pixels in the ellipse template 300 and 255, the maximum edge strength.
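The normalization is a single division, shown here with a hypothetical function name:

```python
def normalized_similarity(similarity, template_pixel_count, max_strength=255):
    """a6: raw similarity divided by its largest possible value."""
    return similarity / (template_pixel_count * max_strength)
```

A template of 40 pixels in which every pixel contributes the maximum strength of 255 gives a6 = 1.0, and a raw similarity of 5100 for the same template gives 0.5.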

As shown in equation (1) below, a1 to a6 are multiplied by their respective weighting coefficients w1 to w6 and summed to give the head degree H, which expresses how head-like a candidate is. The head candidate with the largest H is determined to be the head of the person standing in front of the ATM 40.

Ｈ＝ｗ１×ａ１＋ｗ２×ａ２＋ｗ３×ａ３＋ｗ４×ａ４＋ｗ５×ａ５＋ｗ６×ａ６・・・（１） H = w1 * a1 + w2 * a2 + w3 * a3 + w4 * a4 + w5 * a5 + w6 * a6 (1)

重み係数ｗ１乃至ｗ６は、撮影条件などにより適宜実験的に決めるものとする。本実施の形態では、簡単にすべて１とした。 The weighting factors w1 to w6 are determined experimentally as appropriate depending on the shooting conditions. In the present embodiment, all of them are simply set to 1.
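Equation (1) and the selection of the best head candidate can be sketched as follows. This is only an illustrative sketch: the function names are hypothetical, and each candidate is assumed to be represented simply by its six feature values a1 to a6.

```python
from typing import Sequence


def head_degree(a: Sequence[float], w: Sequence[float] = (1.0,) * 6) -> float:
    """Equation (1): H = w1*a1 + ... + w6*a6 (all weights 1 as in the embodiment)."""
    if len(a) != len(w):
        raise ValueError("need exactly one weight per feature")
    return sum(wi * ai for wi, ai in zip(w, a))


def pick_head(candidates: Sequence[Sequence[float]],
              w: Sequence[float] = (1.0,) * 6) -> int:
    """Return the index of the candidate with the largest head degree H."""
    return max(range(len(candidates)), key=lambda i: head_degree(candidates[i], w))
```

With all weights set to 1, H is simply the sum of the six features, so a candidate that scores well on every feature wins.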

なお、サングラスの有無を検出するには、頭部全体を処理対象としても良いが、頭髪や耳や顎は不要である。そこで、図６に示すように、楕円テンプレート３００に対して、頭髪などを含まないような、符号３５２のような矩形領域を考え、入力画像中において矩形領域３５２に含まれる部分のみを以後の処理対象とするのが好適である。入力画像から矩形領域３５２に含まれる部分を切り出したものを頭部領域画像として、記憶部１４０に一時的に記憶しておく。 To detect the presence or absence of sunglasses, the entire head may be processed, but the hair, ears, and chin are not needed. Therefore, as shown in FIG. 6, a rectangular area such as the one indicated by reference numeral 352, which excludes the hair and the like, is defined relative to the ellipse template 300, and it is preferable to process only the portion of the input image contained in the rectangular area 352 from this point on. The portion of the input image cut out by the rectangular area 352 is temporarily stored in the storage unit 140 as the head region image.

黒画素抽出手段１３３は、頭部抽出手段１３２にて抽出した、図６の符号３５２に示す矩形領域画像に含まれる各画素が、サングラスを写している可能性の高い画素であるか否かを判断し、可能性の高い画素（以下、「黒画素」と称する）を抽出する。この黒画素の抽出処理について図３を用いて説明する。 The black pixel extraction unit 133 determines whether or not each pixel included in the rectangular area image indicated by the reference numeral 352 in FIG. 6 extracted by the head extraction unit 132 is a pixel that is likely to capture sunglasses. Judgment is made and a highly likely pixel (hereinafter referred to as “black pixel”) is extracted. This black pixel extraction processing will be described with reference to FIG.

図３（ａ）に示す符号２００は、頭部抽出手段１３２にて抽出された頭部領域画像を示している。
人間の顔つきは個人ごとに特徴があるものの、顔を構成する目や鼻などの各部位の位置関係はおおよそ人に依らず一定である。そこで、頭部領域画像２００が得られた場合に、おおよそ目が位置すると思われる部分を含むように、図３（ａ）の符号２０２に示すようなサングラス検索領域を設定し、この領域内で黒画素を抽出し、以下で述べるようにその分布からサングラスの存在を判定する処理を行う。 A reference numeral 200 shown in FIG. 3A indicates a head region image extracted by the head extraction means 132.
Although the human face has a characteristic for each individual, the positional relationship of each part such as eyes and nose constituting the face is approximately constant regardless of the person. Therefore, when the head region image 200 is obtained, a sunglasses search region as shown by reference numeral 202 in FIG. 3A is set so as to include a portion where the eyes are likely to be located. Black pixels are extracted, and processing for determining the presence of sunglasses from the distribution is performed as described below.

ここで、ＡＴＭ４０が設置されるのは多くの場合、金融機関併設のＡＴＭコーナーや無人ＡＴＭブースであり、加えて近年ではコンビニエンスストアの店内にも設置されることが多くなった。
従って、入力画像を取得するにあたり、照明条件はＡＴＭ４０が設置される場所により大きく変わり、それに従い、入力画像中のＡＴＭ４０を操作する人物の顔の明るさや色合いはばらつきが生じ、それによる判定精度の低下が無視できない。そこで、頭部領域画像の画素値を正規化した上で以下に述べる処理を行うものとする。 The ATM 40 is in most cases installed in an ATM corner attached to a financial institution or in an unattended ATM booth, and in recent years it has also increasingly been installed inside convenience stores.
Consequently, the lighting conditions under which the input image is acquired vary greatly with the installation site of the ATM 40, the brightness and color tone of the face of the person operating the ATM 40 vary accordingly, and the resulting loss of determination accuracy cannot be ignored. The processing described below is therefore performed after normalizing the pixel values of the head region image.

頭部領域画像に含まれる各画素の色成分に基づき、人の肌色の可能性が高い画素を抽出する。その方法は、前述のようにディジタルカメラの分野にて用いられている周知の方法を採用すればよい。 Based on the color component of each pixel included in the head region image, a pixel having a high possibility of human skin color is extracted. As the method, a known method used in the field of digital cameras as described above may be employed.

次に、抽出した肌色の可能性が高い画素内の画素のカラーバランス（ＲＧａｉｎ＝Ｒ／Ｇ、ＢＧａｉｎ＝Ｂ／Ｇ）と、公知の方法にて白黒化した場合の輝度値について、平均値を算出し、それぞれをＲＧａｉｎＡｖｅ、ＢＧａｉｎＡｖｅ、ＧｒａｙＡｖｅとする。そして、頭部領域画像２００内の全画素について、カラーバランスと輝度値を正規化する。正規化は、次式（２）乃至（４）による。なお、ＲＴとＢＴは、それぞれ、理想的に肌色が抽出された際のＲ成分とＢ成分である。 Next, the average value is calculated for the color balance (RGain = R / G, BGain = B / G) of the pixels in the extracted pixel with a high possibility of skin color, and the luminance value when black and white is obtained by a known method. And RGainAve, BGainAve, and GrayAve, respectively. Then, the color balance and the luminance value are normalized for all the pixels in the head region image 200. Normalization is performed according to the following equations (2) to (4). Note that RT and BT are an R component and a B component, respectively, when the skin color is ideally extracted.

Ｒ＝Ｒ×ＲＴ／ＲＧａｉｎＡｖｅ×ＧｒａｙＴ／ＧｒａｙＡｖｅ・・・（２） R = R * RT / RGainAve * GrayT / GrayAve (2)

Ｇ＝Ｇ×ＧｒａｙＴ／ＧｒａｙＡｖｅ・・・（３） G = G × GrayT / GrayAve (3)

Ｂ＝Ｂ×ＢＴ／ＢＧａｉｎＡｖｅ×ＧｒａｙＴ／ＧｒａｙＡｖｅ・・・（４） B = B × BT / BGainAve × GrayT / GrayAve (4)
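The normalization of equations (2) to (4) can be sketched in Python as below. This is a hedged sketch rather than the patent's code: the function name is hypothetical, the left-hand side of equation (4) is read as the B component, the products and quotients are evaluated left to right as written, and GrayT is assumed to be the target luminance that plays the same role for brightness as RT and BT do for color.

```python
from typing import Tuple


def normalize_pixel(r: float, g: float, b: float,
                    rgain_ave: float, bgain_ave: float, gray_ave: float,
                    rt: float, bt: float, gray_t: float) -> Tuple[float, float, float]:
    """Apply equations (2)-(4) to one RGB pixel of the head region image."""
    lum = gray_t / gray_ave           # common luminance correction GrayT / GrayAve
    r_n = r * rt / rgain_ave * lum    # equation (2)
    g_n = g * lum                     # equation (3)
    b_n = b * bt / bgain_ave * lum    # equation (4), read as the B component
    return r_n, g_n, b_n
```

When the averages measured on the skin-colored pixels already equal the target values, the correction factors are all 1 and the pixel is returned unchanged.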

そして、正規化後の各画素についてＳ（彩度）、Ｖ（明度）、ＲＧａｉｎおよびＢＧａｉｎを用い、次式（５）または次式（６）のいずれかを満たした場合は、サングラスの可能性が大きい画素である黒画素として抽出する。ただし、Ｓ（彩度）とＶ（明度）の値は、０〜２５５の範囲に正規化されているものとする。 Then, for each normalized pixel, S (saturation), V (lightness), RGain, and BGain are evaluated, and a pixel that satisfies either expression (5) or expression (6) below is extracted as a black pixel, that is, a pixel likely to show sunglasses. The values of S (saturation) and V (lightness) are assumed to be normalized to the range 0 to 255.

（Ｓ＜３０ｏｒ７０＜Ｓ）ａｎｄＶ＜７０・・・（５） (S <30 or 70 <S) and V <70 (5)

（ＲＧａｉｎ＜０．８ｏｒＢＧａｉｎ＞１．２）ａｎｄ７０≦Ｖ＜１２０・・・（６） (RGain <0.8 or BGain> 1.2) and 70 ≦ V <120 (6)

式（５）は、画素が黒いこと、すなわち、公知の方法にてグレースケール化した場合、輝度が低く、かつ、Ｓ（彩度）の範囲として肌色を表さない条件を示している。
通常、サングラスは黒いため、Ｖ（明度）は低くなるが、その一方で人間の肌も照明条件や髪型、帽子の鍔の影によっては暗くなり、明度が低くなることも多い。よって、単純に明度のみでは、画素がサングラスを写しているのか、影がかかった肌を写しているのかの区別は困難である。そこで式（５）ではＳ（彩度）の値も参照することにより、判定精度を高めている。 Expression (5) indicates that the pixel is black, that is, a condition where the luminance is low and the skin color is not expressed as a range of S (saturation) when grayscale is formed by a known method.
Since sunglasses are usually black, V (brightness) is low. On the other hand, human skin is also darkened due to lighting conditions, hairstyles, and shadows of hats, and brightness is often low. Therefore, it is difficult to distinguish whether a pixel shows sunglasses or a shadowed skin only by lightness. Therefore, in the expression (5), the determination accuracy is improved by referring to the value of S (saturation).

式（６）は、サングラスの表面で鏡面反射が発生し、明度が高くなった場合に対応するものである。鏡面反射が発生した場合、鏡面反射が発生していない部分よりも明度は高くなるものの、例えば直接光源を撮影するような、撮像装置２０のダイナミックレンジを越えるほどの明るさにはならない。
また鏡面反射の色は、サングラスが写している他の物体、例えばＡＴＭコーナーの壁やＡＴＭ本体の塗装によって生じる色であり、あらかじめ範囲を定めることが難しい。そこでＲＧａｉｎとＢＧａｉｎの値を参照し、明らかに肌色ではない画素のみを黒画素として抽出する。 Equation (6) corresponds to the case where specular reflection occurs on the surface of the sunglasses and the brightness increases. When the specular reflection occurs, the brightness is higher than that of the portion where the specular reflection does not occur, but the brightness does not become so bright as to exceed the dynamic range of the imaging device 20, for example, when directly photographing a light source.
Further, the color of the specular reflection is a color generated by the painting of other objects reflected by the sunglasses, for example, the wall of the ATM corner or the ATM body, and it is difficult to determine the range in advance. Therefore, by referring to the values of RGain and BGain, only pixels that are clearly not skin color are extracted as black pixels.

ただし、頭部抽出手段１３２が求めたエッジ強度画像中、エッジ強度が高い画素に対応する頭部領域画像２００中の画素は黒画素とはしない。これは目や眉毛は黒く、単に色成分を調べると黒画素との区別が困難であるためである。そこで、サングラスのレンズにはテクスチャは無いものの、目や眉毛にはテクスチャがあることに着目し、輝度エッジ強度が高い部分は黒画素にはしないこととしたものである。 However, in the edge intensity image obtained by the head extracting unit 132, the pixel in the head region image 200 corresponding to the pixel having a high edge intensity is not a black pixel. This is because eyes and eyebrows are black, and it is difficult to distinguish them from black pixels simply by examining the color components. Therefore, although there is no texture in the lens of the sunglasses, attention is paid to the texture in the eyes and eyebrows, and the portion with high luminance edge strength is not set as a black pixel.
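The black pixel test of expressions (5) and (6), together with the exclusion of strong-edge pixels, can be sketched as follows. The function name and the `edge_threshold` parameter are hypothetical; the text only says that pixels with "high" edge strength are excluded, so the default of 128 is an assumption for illustration.

```python
def is_black_pixel(s: float, v: float, rgain: float, bgain: float,
                   edge: float = 0.0, edge_threshold: float = 128.0) -> bool:
    """Return True if a normalized pixel is a black pixel (a sunglasses candidate).

    s and v are saturation and lightness on a 0-255 scale.
    """
    if edge >= edge_threshold:
        return False  # textured areas such as eyes and eyebrows are excluded
    dark = (s < 30 or 70 < s) and v < 70                         # expression (5)
    reflective = (rgain < 0.8 or bgain > 1.2) and 70 <= v < 120  # expression (6)
    return dark or reflective
```

A dark pixel whose saturation lies in the 30 to 70 band is rejected as probable shadowed skin, while a moderately bright pixel is accepted only when its color balance is clearly not skin-colored.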

特徴量算出手段１３５は、ＡＴＭ４０の前に立つ人物がサングラスを着用しているかどうかの可能性を表すサングラス特徴量を、黒画素抽出手段１３３にて抽出した黒画素の分布状況に基づいて算出する。サングラス特徴量は、（ｉ）対称度、（ｉｉ）エッジ一致度、（ｉｉｉ）形状度に分けられる。 The feature quantity calculating means 135 calculates a sunglasses feature quantity that indicates whether or not a person standing in front of the ATM 40 is wearing sunglasses based on the distribution state of black pixels extracted by the black pixel extracting means 133. . The sunglasses feature amount is divided into (i) symmetry, (ii) edge coincidence, and (iii) shape.

（ｉ）対称度は、黒画素が頭部領域画像２００において左右対称、即ち、図３（ａ）のようにＸ軸とＹ軸を定義した場合に、頭部領域画像２００を左右に２分するＸ座標一定の線を対称軸として、その左右で概略同等に分布していることを表す特徴量である。この対称度の求め方を以下に説明する。 (I) The degree of symmetry is such that the black pixel is bilaterally symmetric in the head region image 200, that is, when the X axis and the Y axis are defined as shown in FIG. This is a feature amount indicating that the X-coordinate constant line is distributed approximately equally on the left and right sides of the line. A method for obtaining this degree of symmetry will be described below.

まず黒画素を垂直軸（Ｙ軸）に投影して、度数に関するヒストグラムを作成する。当然ながら、度数が多い部分ほど頭部領域画像２００において、実際にサングラスのレンズが存在する可能性が高いことになる。ヒストグラムの例を図３（ｂ）の符号２１１に示し、以下、本明細書では「縦ヒストグラム」と称する。また、縦ヒストグラムの最大度数を示したＹ座標をＨｍａｘＹとする。なお、符号２１３に示す斜線領域は、黒画素抽出手段１３３にて抽出された黒画素が含まれる領域を示している。 First, black pixels are projected on the vertical axis (Y axis), and a histogram relating to the frequency is created. Of course, the higher the frequency, the higher the possibility that a lens of sunglasses actually exists in the head region image 200. An example of the histogram is indicated by reference numeral 211 in FIG. 3B, and is hereinafter referred to as “vertical histogram” in the present specification. Further, the Y coordinate indicating the maximum frequency of the vertical histogram is defined as HmaxY. A hatched area indicated by reference numeral 213 indicates an area including the black pixels extracted by the black pixel extracting means 133.

次にＹ座標がＨｍａｘＹの行において、その度数を調べ、度数に関する閾値ｔｈ１より小さい場合には、後述の判定手段１３６でサングラスは検出されなかったと判定する。これは、サングラスらしい画素があまり含まれていない場合であるので、以降の処理を行う必要がない場合である。閾値ｔｈ１は、撮像装置２０の解像度や画角を考慮して適切な値を設定する。 Next, in the row where the Y coordinate is HmaxY, the frequency is examined. If the frequency is smaller than the threshold value th1 relating to the frequency, it is determined that the sunglasses are not detected by the determination unit 136 described later. This is a case where pixels like sunglasses are not included so much, and therefore, it is not necessary to perform the subsequent processing. The threshold th1 is set to an appropriate value in consideration of the resolution and angle of view of the imaging device 20.

また、ＨｍａｘＹが、座標に関する閾値ｔｈ２より大きい場合、つまり、ＨｍａｘＹが頭部領域画像２００の下の方に位置することになった場合も後述の判定手段１３６でサングラスは検出されなかったと判定する。これは、サングラスは、人の顔の構造上、上部にある目の付近に位置することを利用したものである。閾値ｔｈ２は、撮像装置２０の解像度や画角を考慮して決定する。 Further, when HmaxY is larger than the threshold value th 2 regarding coordinates, that is, when HmaxY is positioned below the head region image 200, it is determined that the sunglasses are not detected by the determination unit 136 described later. This utilizes the fact that sunglasses are positioned in the vicinity of the upper eye in the structure of a human face. The threshold th2 is determined in consideration of the resolution and angle of view of the imaging device 20.
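The vertical histogram and the two early rejections by th1 and th2 can be sketched as follows. This is an illustrative sketch with hypothetical function names; the black pixel mask is assumed to be a list of rows of 0/1 values with Y increasing downward, and `None` stands for the "sunglasses not detected" outcome.

```python
from typing import List, Optional


def vertical_histogram(mask: List[List[int]]) -> List[int]:
    """Count the black pixels in each row (projection onto the Y axis)."""
    return [sum(1 for p in row if p) for row in mask]


def find_hmax_y(mask: List[List[int]], th1: int, th2: int) -> Optional[int]:
    """Return HmaxY, or None when the early rejection rules apply."""
    hist = vertical_histogram(mask)
    if not hist:
        return None
    hmax_y = max(range(len(hist)), key=hist.__getitem__)  # first row with the peak count
    if hist[hmax_y] < th1:
        return None  # too few sunglasses-like pixels in the peak row
    if hmax_y > th2:
        return None  # peak row lies too low in the head region image
    return hmax_y
```

If several rows share the maximum count, this sketch picks the uppermost one; the text does not say how ties are resolved.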

次に、Ｙ座標がＨｍａｘＹとなる行に含まれる黒画素の数ＳＰｉｘＡＬＬを計数し、頭部領域画像２００の横幅の画素数に対する割合をＳＲａｔｉｏとする。このＳＲａｔｉｏが大きいほどその行はサングラスを写している行の可能性が高いが、Ｙ座標がＨｍａｘＹである行は、一般的なサングラスの形状を考慮すると、ブリッジ等フレーム付近の行であると考えられる。 Next, the number of black pixels SPixALL in the row whose Y coordinate is HmaxY is counted, and its ratio to the width of the head region image 200 in pixels is taken as SRatio. The larger SRatio is, the more likely the row shows sunglasses. Given the shape of typical sunglasses, the row at HmaxY is considered to be a row near the frame, such as the bridge.

その一方で、例えば前髪を大きく垂らす髪型の場合には、髪も黒いのでサングラスでなくてもＳＲａｔｉｏが大きくなる場合がある。しかし、図３（ｅ）の符号２４０に示す頭部領域画像のように、片目を髪が覆うことがあっても両目を覆うと前が見えず、ＡＴＭを操作するのが難しくなるため、両目を覆うことは少ないと考えられる。よって、サングラスの形状は左右で対称であることを利用した対称度に着目できるのである。対称度は次式（７）にて求められる。 On the other hand, with a hairstyle in which long bangs hang down, for example, SRatio can be large even without sunglasses because the hair is also black. However, although hair may cover one eye, as in the head region image indicated by reference numeral 240 in FIG. 3(e), hair covering both eyes would block the view and make the ATM hard to operate, so this is considered rare. For this reason the degree of symmetry, which exploits the left-right symmetry of sunglasses, is a useful feature. The degree of symmetry is obtained by equation (7) below.

対称度＝Ｍｉｎ（ＳＰｉｘＬ/ＳＰｉｘＡＬＬ，ＳＰｉｘＲ/ＳＰｉｘＡＬＬ）×２・・・・（７） Symmetry = Min (SPixL / SPixALL, SPixR / SPixALL) × 2 (7)

ここで、ＳＰｉｘＲは、頭部領域画像２００において、基準位置から人の顔の右半面に相当する部分に含まれる黒画素の数であり、ＳＰｉｘＬは、同様に左半面に相当する部分に含まれる黒画素の数である。また、Ｍｉｎ（、）は、かっこ内の数値のうちの最小値を示す。 Here, SPixR is the number of black pixels included in the portion corresponding to the right half of the human face from the reference position in the head region image 200, and SPixL is also included in the portion corresponding to the left half. This is the number of black pixels. Min (,) indicates the minimum value among the numerical values in parentheses.

対称度は、Ｙ座標がＨｍａｘＹの行において、顔の半面にのみ黒画素が存在する場合に０となり、黒画素が左右対称に分布している場合に１となる。１になるほど対称性が高いことを表す。 In the row whose Y coordinate is HmaxY, the degree of symmetry is 0 when black pixels exist on only one half of the face and 1 when they are distributed symmetrically on the left and right. The closer the value is to 1, the higher the symmetry.

人の顔の右半面と左半面を区別する基準位置の決め方は、人の顔に関する構造情報に基づいたものを適宜採用することができるが、本実施の形態では簡単に頭部領域画像２００の左右の中点を基準位置とする。後述のように、人の頭部の正中線を求めて、それを基準にしても良い。 As a method of determining the reference position for distinguishing the right half surface and the left half surface of the human face, a method based on the structural information related to the human face can be appropriately adopted. However, in the present embodiment, the head region image 200 is simply determined. The left and right midpoints are used as reference positions. As will be described later, the midline of the human head may be obtained and used as a reference.
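Equation (7) can be sketched as follows for the row at HmaxY, using the horizontal midpoint of the head region image as the reference position as in this embodiment. The function name is hypothetical, and returning 0 for a row without black pixels is an assumption made only to avoid division by zero.

```python
from typing import Sequence


def symmetry(row: Sequence[int]) -> float:
    """Equation (7): Min(SPixL / SPixALL, SPixR / SPixALL) x 2 for one mask row."""
    mid = len(row) // 2  # for an odd width the middle column counts toward the second half
    first_half = sum(1 for p in row[:mid] if p)
    second_half = sum(1 for p in row[mid:] if p)
    total = first_half + second_half  # SPixALL
    if total == 0:
        return 0.0  # assumption: no black pixels means no symmetry evidence
    return min(first_half / total, second_half / total) * 2
```

Because the minimum is taken, it does not matter for this feature which image half corresponds to the left or right side of the face.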

（ｉｉ）エッジ一致度は、上部一致度と下部一致度に基づいて定まり、サングラスの着用と、帽子や髪の影が目の付近にかかっていたり、彫りの深い顔つきの人物では目の付近が暗くなったりすることとの区別をするために導入したものである。
すなわち、画素の明るさや色合いだけでは、一般に黒いサングラスと黒い髪や影がかかった肌との区別は難しい。一方で、帽子や髪の影、彫りの深い顔つきの目の付近では、黒い画素が固まって抽出されてもその周辺部分に着目すると、強いエッジが抽出されることは稀で、抽出されても黒画素の位置とは一致しないことが多い。当然、サングラスを着用していれば、レンズの縁で強いエッジが抽出され、サングラス候補領域の境界とは一致しやすいことを利用したものである。 (ii) The edge coincidence is determined from the upper coincidence and the lower coincidence. It is introduced to distinguish the wearing of sunglasses from cases in which the shadow of a hat or hair falls near the eyes, or in which the area around the eyes is dark because the person has deep-set features.
That is, black sunglasses are generally hard to distinguish from black hair or shadowed skin by pixel brightness and color alone. In the shadow of a hat or hair, or around the eyes of a deep-set face, black pixels may be extracted as a cluster, but strong edges are rarely extracted around that cluster, and even when they are, they often do not coincide with the positions of the black pixels. When sunglasses are worn, by contrast, strong edges are extracted at the rims of the lenses and tend to coincide with the boundary of the sunglasses candidate region. The edge coincidence exploits this difference.

上部一致度の求め方を、図３（ｂ）を用いて説明する。特徴量算出手段１３５は、頭部領域画像２００のＹ座標がＨｍａｘＹとなる行について、黒画素となっている画素を特定し、その各画素のＸ座標について上方向（Ｙ座標が小さくなる方向）に黒画素の有無を調べる。そして黒画素が途切れる画素（以下、「上部境界画素」と称する）を特定し、計数する。特定した上部境界画素ＥＵＰｉｘを並べて曲線で表したものを図３（ｃ）の符号２２０で示す。サングラスのレンズは２枚あるので、図３（ｃ）でも曲線２２０は２つある。 A method for obtaining the upper coincidence will be described with reference to FIG. 3(b). The feature amount calculation unit 135 identifies the black pixels in the row of the head region image 200 whose Y coordinate is HmaxY and, at the X coordinate of each such pixel, checks for black pixels in the upward direction (the direction of decreasing Y). It then identifies and counts the pixels at which the run of black pixels ends (hereinafter referred to as "upper boundary pixels"). The identified upper boundary pixels EUPix, arranged as a curve, are indicated by reference numeral 220 in FIG. 3(c). Since sunglasses have two lenses, there are two curves 220 in FIG. 3(c).

次に、特徴量算出手段１３５は、頭部抽出手段１３２にて抽出したエッジ強度画像を参照し、一定以上のエッジ強度の画素であり、上部境界画素ＥＵＰｉｘと頭部領域画像２００中で一致する画素ＳＵＰｉｘを計数する。そして、次式（８）にて上部一致度を求める。 Next, the feature quantity calculating unit 135 refers to the edge intensity image extracted by the head extracting unit 132, is a pixel having a certain edge intensity or more, and matches the upper boundary pixel EUPix in the head region image 200. Pixel SUPix is counted. Then, the upper coincidence is obtained by the following equation (8).

上部一致度＝ＳＵＰｉｘの個数／ＥＵＰｉｘの個数・・・（８） Upper part coincidence = number of SUPix / number of EUPix (8)

次に下部一致度の求め方を図３（ｂ）を用いて説明する。特徴量算出手段１３５は、頭部領域画像２００のＹ座標がＨｍａｘＹとなる行について、黒画素となっている画素を特定し、その各画素のＸ座標について下方向（Ｙ座標が大きくなる方向）に黒画素の有無を調べていき、黒画素が途切れる画素（以下、「下部境界画素」と称する）を特定し、計数する。特定した下部境界画素ＥＬＰｉｘを並べて曲線で表したものを図３（ｄ）の符号２３０で示す。サングラスのレンズは２枚あるので、図３（ｄ）でも曲線２３０は２つある。 Next, how to obtain the lower coincidence will be described with reference to FIG. The feature amount calculation unit 135 specifies a pixel that is a black pixel in a row in which the Y coordinate of the head region image 200 is HmaxY, and the downward direction (the direction in which the Y coordinate increases) of the X coordinate of each pixel. Then, the presence or absence of black pixels is examined, and the pixels where the black pixels are interrupted (hereinafter referred to as “lower boundary pixels”) are identified and counted. The identified lower boundary pixel ELPix arranged in a curve is indicated by reference numeral 230 in FIG. Since there are two lenses of sunglasses, there are two curves 230 in FIG.

図３（ｃ）の曲線２２０、図３（ｄ）の曲線２３０とも、例示としては２つに分かれて描画したが、サングラスの形状は典型的には図３（ａ）の符号２０１に示すようなものではあるものの、必ずしもレンズが概楕円形で、ブリッジが細いとは限らない。レンズもブリッジも他のフレーム部分も同じプラスチック素材で、太いデザインである場合には曲線２２０と曲線２３０は１つとなることもある。 Although the curve 220 in FIG. 3C and the curve 230 in FIG. 3D are drawn in two parts as an example, the shape of the sunglasses is typically as shown by reference numeral 201 in FIG. However, the lens is not necessarily elliptical and the bridge is not always thin. If the lens, bridge, and other frame parts are made of the same plastic material and have a thick design, the curves 220 and 230 may be one.

次に、特徴量算出手段１３５は、頭部抽出手段１３２にて抽出したエッジ強度画像を参照し、一定以上のエッジ強度の画素であり、下部境界画素ＥＬＰｉｘと頭部領域画像２００中で一致する画素ＳＬＰｉｘを計数する。そして、次式（９）にて下部一致度を求める。 Next, the feature amount calculation unit 135 refers to the edge strength image extracted by the head extraction unit 132 and counts the pixels SLPix that have at least a certain edge strength and coincide with the lower boundary pixels ELPix in the head region image 200. The lower coincidence is then obtained by equation (9) below.

下部一致度＝ＳＬＰｉｘの個数／ＥＬＰｉｘの個数・・・（９） Lower coincidence = number of SLPix / number of ELPix (9)

そして、エッジ一致度を次式（１０）で求める。 Then, the edge coincidence is obtained by the following equation (10).

エッジ一致度＝Ｍａｘ（上部一致度、下部一致度）・・・（１０） Edge coincidence = Max (upper coincidence, lower coincidence) (10)

ここで、Ｍａｘ（、）は、括弧内の値のうち最大のものを求めることを表す。エッジ一致度は、黒画素とそうではない画素との境界と強いエッジ画素とが一致するほど１に近い値となり、全く一致しない場合に０となる。 Here, Max (,) represents obtaining the maximum value among the values in parentheses. The edge coincidence becomes a value closer to 1 as the boundary between a black pixel and a pixel other than that matches with a strong edge pixel, and becomes 0 when there is no coincidence.
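Equations (8) to (10) can be sketched as follows. This is a sketch under stated assumptions, not the patent's implementation: the function names and the `edge_th` parameter are hypothetical, the boundary pixel is taken to be the last black pixel of each vertical run (the text could also be read as the first non-black pixel), and the mask and edge strength image are assumed to be lists of rows of equal size.

```python
from typing import List, Tuple

Point = Tuple[int, int]  # (x, y)


def boundary_pixels(mask: List[List[int]], hmax_y: int, step: int) -> List[Point]:
    """Trace each black column of row hmax_y upward (step=-1) or downward (step=+1)."""
    height = len(mask)
    result = []
    for x, p in enumerate(mask[hmax_y]):
        if not p:
            continue
        y = hmax_y
        while 0 <= y + step < height and mask[y + step][x]:
            y += step
        result.append((x, y))  # last black pixel before the run ends
    return result


def match_ratio(boundary: List[Point], edge: List[List[float]], edge_th: float) -> float:
    """Equations (8)/(9): fraction of boundary pixels that lie on strong edges."""
    if not boundary:
        return 0.0
    hits = sum(1 for x, y in boundary if edge[y][x] >= edge_th)
    return hits / len(boundary)


def edge_coincidence(mask: List[List[int]], edge: List[List[float]],
                     hmax_y: int, edge_th: float) -> float:
    """Equation (10): Max(upper coincidence, lower coincidence)."""
    upper = match_ratio(boundary_pixels(mask, hmax_y, -1), edge, edge_th)
    lower = match_ratio(boundary_pixels(mask, hmax_y, +1), edge, edge_th)
    return max(upper, lower)
```

Taking the maximum means that a clear edge along either the top or the bottom of the lenses is enough to give a high score.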

（ｉｉｉ）形状度は、サングラスは人の顔にかけるものである、という性質上、レンズ下部に比べ、ブリッジ付近は凹んでおり、正面から見るとレンズ下部よりもブリッジは上部に位置するものであることを捉えたものである。 (Iii) The degree of shape is that sunglasses wears on the human face, so the area near the bridge is recessed compared to the bottom of the lens, and the bridge is located above the bottom of the lens when viewed from the front. It captures something.

形状度の求め方を図３（ｂ）および（ｄ）を用いて説明する。まず下部一致度を求める際に特定した下部境界画素ＥＬＰｉｘのうち、顔の左半面に属するものの中から頭部候補画像中で最も下にあるもの、すなわちＹ座標が最も大きい画素ＰＬを特定する。同様に顔の右半面に属する下部境界画素ＥＬＰｉｘのうち、最も下にあるもの、すなわちＹ座標が最も大きい画素ＰＲを特定する。それぞれの座標を（ＸＬ、ＹＬ）、（ＸＲ、ＹＲ）とする。 A method for obtaining the shape will be described with reference to FIGS. First, among the lower boundary pixels ELPix specified when obtaining the lower matching degree, the lowermost pixel in the head candidate image among those belonging to the left half of the face, that is, the pixel PL having the largest Y coordinate is specified. Similarly, the lower boundary pixel ELPix belonging to the right half of the face is identified as the lowest pixel, that is, the pixel PR having the largest Y coordinate. The respective coordinates are (XL, YL) and (XR, YR).

但し、画素ＰＬと画素ＰＲは、人の顔の右半面と左半面を区別する基準位置である頭部領域画像２００の中点より、一定以上左右に離れた位置、例えば、頭部領域画像２００の幅の１／４だけ離れた点を中心に、頭部領域画像２００の幅の１／８の幅の領域に含まれる下部境界画素ＥＬＰｉｘから選ぶものとする。これはサングラスの形状として、上述のようにブリッジ付近は凹んでおり、黒画素の領域は下向きに凹となっていることを捉えるためのものである。 However, the pixel PL and the pixel PR are positions that are more than a certain distance from the midpoint of the head region image 200 that is a reference position for distinguishing the right and left half faces of a human face, for example, the head region image 200. The lower boundary pixel ELPix included in a region having a width of 1/8 of the width of the head region image 200 with a point separated by 1/4 of the width of the head region image 200 as a center. This is to capture that the shape of the sunglasses is concave in the vicinity of the bridge as described above and the black pixel region is concave downward.

次に、下部境界画素ＥＬＰｉｘのうち、Ｘ座標についてＸＲ＜ｘ＜ＸＬを満たす画素のＹ座標を順次調べ、その最小値、即ち、頭部領域画像２００中で最も上部にある画素ＰＣを特定する。その座標を（ＸＣ、ＹＣ）とする。但し、サングラスの形状によっては図３（ｄ）のように曲線２３０が２つに分かれ、画素ＰＣを下部境界画素ＥＬＰｉｘから特定できない場合もある。機械的に画素ＰＣを特定すると図３（ｄ）のようにＹＣ＝０、つまり頭部領域画像２００の最上端となってしまうので、このような場合にはＹＣの最小値としてＨｍａｘＹと定義し、Ｘ座標を頭部領域画像２００の中点とした画素ＰＣ’を画素ＰＣの代わりとする。 Next, among the lower boundary pixels ELPix, the Y coordinate of the pixels satisfying XR <x <XL is sequentially examined with respect to the X coordinate, and the minimum value, that is, the pixel PC at the uppermost position in the head region image 200 is specified. . Let the coordinates be (XC, YC). However, depending on the shape of the sunglasses, the curve 230 may be divided into two as shown in FIG. 3D, and the pixel PC may not be identified from the lower boundary pixel ELPix. If the pixel PC is mechanically specified, YC = 0 as shown in FIG. 3D, that is, the uppermost end of the head region image 200. In such a case, HmaxY is defined as the minimum value of YC. The pixel PC ′ having the X coordinate as the midpoint of the head region image 200 is used instead of the pixel PC.

形状度は、次式（１１）を満たす場合に１、満たさない場合に０と定義する。ここで、ｔｈ３はサングラスの形状に関する閾値であり、適宜実験的に決めることとする。 The degree of shape is defined as 1 when the following expression (11) is satisfied, and 0 when it is not satisfied. Here, th3 is a threshold value related to the shape of the sunglasses and is appropriately determined experimentally.

ＭＩＮ（ＹＬ、ＹＲ） ― ＹＣ＞ｔｈ３・・・（１１） MIN (YL, YR) −YC> th3 (11)

この形状度は、例えば鍔付きの帽子を深く被ることで、鍔の帽子の影により図３（ａ）の符号２０２で示したサングラス検索領域付近が暗くなった場合との判別に用いるためのものである。つまり、帽子を深く被った場合には、サングラスを着用した場合とは異なり、鼻梁を写している部分である中央付近も黒画素が存在する。よって、上記のような画素ＰＣと、画素ＰＲまたはＰＬとが垂直方向に大きく離れない場合には、サングラス着用ではないと判断することとしたものである。 The shape degree is used to distinguish sunglasses from the case in which, for example, a brimmed hat is pulled down low and the shadow of the brim darkens the vicinity of the sunglasses search area indicated by reference numeral 202 in FIG. 3(a). When a hat is worn low, unlike when sunglasses are worn, black pixels are also present near the center, where the bridge of the nose appears. Therefore, when the pixel PC is not far from the pixel PR or PL in the vertical direction, it is determined that sunglasses are not being worn.
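One possible reading of the shape degree and expression (11) is sketched below. The function name is hypothetical, and several points are assumptions: the lower boundary pixels are given as (x, y) pairs with one pixel per column, the person's right half appears on the image left (consistent with XR < x < XL), the fallback YC = HmaxY is applied whenever some column between PR and PL has no lower boundary pixel, and 0 is returned when PL or PR cannot be found.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, int]  # (x, y)


def shape_degree(lower: List[Point], width: int, hmax_y: int, th3: float) -> int:
    """Return 1 if MIN(YL, YR) - YC > th3 (expression (11)), otherwise 0."""
    mid = width / 2.0
    half_win = width / 16.0  # the search window is width/8 wide in total

    def lowest_near(center: float) -> Optional[Point]:
        pts = [p for p in lower if abs(p[0] - center) <= half_win]
        return max(pts, key=lambda p: p[1]) if pts else None

    pr = lowest_near(mid - width / 4.0)  # PR: window a quarter width from the midpoint
    pl = lowest_near(mid + width / 4.0)  # PL: window a quarter width from the midpoint
    if pr is None or pl is None:
        return 0  # assumption: no lens bottom found on one side
    y_at = {x: y for x, y in lower}
    xs = range(pr[0] + 1, pl[0])
    if len(xs) > 0 and all(x in y_at for x in xs):
        yc = min(y_at[x] for x in xs)  # PC: uppermost lower-boundary pixel in between
    else:
        yc = hmax_y  # PC': the curve is split, so HmaxY is used instead
    return 1 if min(pl[1], pr[1]) - yc > th3 else 0
```

For a hat shadow the lower boundary is roughly flat, so MIN(YL, YR) - YC stays near zero and the shape degree is 0, whereas a bridge that sits higher than the lens bottoms yields 1.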

判定手段１３６は、特徴量算出手段１３５にて求めた、各サングラス特徴量に基づき、ＡＴＭ４０の前に立つ人物がサングラスを着用しているか否かを判定する。また、サングラス特徴量を算出する途中で、ＡＴＭ４０の前に立つ人物が明らかにサングラス着用の条件を満たしていない場合に、サングラス非着用と判定する。
その判定には次式（１２）で定義するサングラスらしさを求めて、それが所定のサングラス閾値ｔｈ４以上の場合には、ＡＴＭ４０の前に立つ人物がサングラスを着用していると判定する。サングラス閾値ｔｈ４は、実験により求めるものとする。 The determination unit 136 determines whether the person standing in front of the ATM 40 is wearing sunglasses based on each sunglasses feature amount obtained by the feature amount calculation unit 135. In addition, when the person standing in front of the ATM 40 clearly does not satisfy the conditions for wearing sunglasses during the calculation of the sunglasses feature value, it is determined that the sunglasses are not worn.
For the determination, the likelihood of sunglasses defined by the following equation (12) is obtained, and if it is equal to or greater than the predetermined sunglasses threshold th4, it is determined that the person standing in front of the ATM 40 is wearing sunglasses. The sunglasses threshold value th4 is obtained by experiment.

サングラスらしさ＝（対称度）×（エッジ一致度）×（形状度）・・・（１２） Sunglasses-likeness = (symmetry) x (edge coincidence) x (shape) (12)

なお、前述のように形状度の値は０または１のいずれかをとるため、（１１）式を満たさない場合には、サングラスらしさは０となるため、判定手段１３６は、必ずサングラス非着用の結果を出力する。 As described above, the shape degree takes a value of either 0 or 1. When expression (11) is not satisfied, the sunglasses-likeness is therefore 0, and the determination unit 136 always outputs the result that sunglasses are not worn.
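Equation (12) and the threshold decision can be sketched as follows. The function names are hypothetical, and th4 is assumed to be positive, which is what makes a shape degree of 0 always produce a "not wearing" result.

```python
def sunglasses_likeness(symmetry: float, edge_coincidence: float, shape: int) -> float:
    """Equation (12): product of the three sunglasses features."""
    return symmetry * edge_coincidence * shape


def is_wearing_sunglasses(symmetry: float, edge_coincidence: float,
                          shape: int, th4: float) -> bool:
    """Wearing sunglasses if the likeness is at least th4 (th4 assumed > 0)."""
    return sunglasses_likeness(symmetry, edge_coincidence, shape) >= th4
```

Because the features are multiplied rather than added, a low value of any single feature pulls the overall score down, so all three conditions must hold at once.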

更新手段１３７は、記憶部１４０に記憶されている基準画像１４２を更新する。
更新には、各時点で撮像装置２０により取得された入力画像の各画素の画素値と、既に記憶済みの基準画像１４２の対応する画素にそれぞれ所定の重み係数を乗算して、加算して更新する。単純に入力画像を丸ごと基準画像１４２に置き換えても良い。
但し、基準画像１４２の更新は人感センサ３０がＯＦＦ、即ち、ＡＴＭ４０の前には人がいない状態で行うものとする。 The update unit 137 updates the reference image 142 stored in the storage unit 140.
For the update, the pixel value of each pixel of the input image acquired by the imaging device 20 at each time point and the value of the corresponding pixel of the already stored reference image 142 are each multiplied by a predetermined weight coefficient, and the products are added. Alternatively, the reference image 142 may simply be replaced with the entire input image.
However, the reference image 142 is updated only while the human sensor 30 is OFF, that is, while no one is in front of the ATM 40.
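The weighted update of the reference image can be sketched as a per-pixel blend. The function name and the parameter `alpha` are hypothetical; the text only says that both images are multiplied by predetermined weights and added, so the choice of weights (1 - alpha) and alpha, which sum to 1, is an assumption. Images are assumed to be grayscale lists of rows of equal size.

```python
from typing import List

Image = List[List[float]]


def update_reference(reference: Image, frame: Image, alpha: float = 0.1) -> Image:
    """Blend the new frame into the reference image pixel by pixel.

    alpha = 1.0 replaces the reference image with the input image outright.
    """
    return [
        [(1.0 - alpha) * r + alpha * f for r, f in zip(ref_row, frame_row)]
        for ref_row, frame_row in zip(reference, frame)
    ]
```

A small alpha lets the reference image follow gradual changes such as daylight variation while damping frame-to-frame noise. The caller is expected to invoke this only while the human sensor is OFF.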

通報部１５０は、判定手段１３６の結果に基づき、ＡＴＭ４０の前に立つ人物がサングラスを着用している場合に、不審者であるとして、その旨を外部に出力する。ＬＥＤなどの警報ランプや警報ブザーなどによって、サングラスを外すよう促したり、予め記憶されたメッセージを、スピーカーなどを介して音声として流したり、ディスプレイ上に表示したりしてもよい。また、記億部１４０に記憶された不審者画像１４４は、ディスプレイやプリンタなどの出力装置（図示せず）を介して出力することもできる。
さらには、通報部１５０に通信Ｉ／Ｆの機能を持たせ、警備システムに結線することで、金融機関または警備会社に通報して迅速な対応をとることが可能となる。 When the person standing in front of ATM 40 is wearing sunglasses based on the result of determination means 136, reporting section 150 outputs that fact to the outside as a suspicious person. An alarm lamp such as an LED, an alarm buzzer, or the like may prompt the user to remove the sunglasses, or a prestored message may be played as sound through a speaker or displayed on a display. Further, the suspicious person image 144 stored in the storage unit 140 can be output via an output device (not shown) such as a display or a printer.
Furthermore, by providing the reporting unit 150 with a communication I / F function and connecting it to a security system, it is possible to report to a financial institution or a security company and take a quick response.

次にフロー図を参照し、本実施の形態にかかる不審人物検出装置１の動作を説明する。図７は、そのメインフロー図である。まず不審人物検出装置１の電源を投入後、各種初期設定などが終了しているとする。 Next, the operation of the suspicious person detection device 1 according to the present embodiment will be described with reference to a flowchart. FIG. 7 is a main flow diagram thereof. First, assume that various initial settings have been completed after the suspicious person detection device 1 is powered on.

ステップＳ１００にて、各時刻において撮像装置２０が、ＡＴＭ４０の前の領域を撮影し、取得した入力画像を画像入力部１１０に送信する。 In step S 100, the imaging device 20 captures an area before the ATM 40 at each time, and transmits the acquired input image to the image input unit 110.

サングラス着用検出装置１０は、人感センサ３０が反応しているか否かを調べ（ステップＳ１２０）、反応していない場合（ＯＦＦの場合）は、ＡＴＭ４０の前は無人であるとして、フラグをＯＦＦにセットする（ステップＳ１４０）。
このフラグは、ＡＴＭ４０の前に人が立ったことが検出された場合に、当該人物についてサングラス検出処理を既に行っているか否かを、次以降の時刻にて参照するためのものであり、記憶部１４０に記憶されているものである。 The sunglasses wearing detection device 10 checks whether the human sensor 30 is responding (step S120). If it is not responding (OFF), the device regards the area in front of the ATM 40 as unoccupied and sets the flag to OFF (step S140).
This flag is stored in the storage unit 140. Once a person has been detected standing in front of the ATM 40, the flag is referenced at subsequent times to check whether the sunglasses detection processing has already been performed for that person.

人感センサ３０が反応していない場合、ステップＳ１６０にて、記憶部１４０の基準画像１４２を更新するものとする。日照変動などに対応するためである。 If the human sensor 30 is not responding, the reference image 142 in the storage unit 140 is updated in step S160. This is to cope with sunshine fluctuations.

ステップＳ１２０にて、人感センサ３０が反応している場合、つまりＡＴＭ４０の前に人が立っていると判断される場合にはステップＳ１８０に進み、前述のフラグの状態を調べる。フラグがＯＮにセットされている場合は、既にサングラス検出処理は済んでおり、後述のように警告アナウンスや通常の来客に対する操作案内がされているとして、ステップＳ１００に戻る（ステップＳ１８０のＹの分岐）。 If the human sensor 30 is reacting in step S120, that is, if it is determined that a person is standing in front of the ATM 40, the process proceeds to step S180 to check the state of the flag. If the flag is set to ON, the sunglasses detection process has already been completed, and it is assumed that a warning announcement or operation guidance for a normal visitor has been given as described later, and the process returns to step S100 (Y branch of step S180). ).

ステップＳ１８０にて、フラグがＯＦＦの場合には、ＡＴＭ４０の前には人がいるが、サングラス検出処理は未だ行われていない場合なので、サングラス検出処理を行うべくステップＳ２００に進む。 If the flag is OFF in step S180, there is a person in front of ATM 40, but the sunglasses detection process has not yet been performed, so the process proceeds to step S200 to perform the sunglasses detection process.

ステップＳ２００にて、変化領域抽出手段１３１は、記憶部１４０に記憶されている基準画像１４２と、画像入力部１１０から送られてきた入力画像を白黒化して画素ごとの差分処理（背景差分処理）を行い、入力画像中に生じた変化領域を抽出する。 In step S200, the change area extraction unit 131 converts the reference image 142 stored in the storage unit 140 and the input image sent from the image input unit 110 into black and white, and performs difference processing for each pixel (background difference processing). , And a change area generated in the input image is extracted.

ステップＳ２２０にて、頭部抽出手段１３２は、変化領域抽出手段１３１が抽出した変化領域から、ＡＴＭ４０の前に立つ人物の頭部に相当する部分を抽出する。 In step S220, the head extracting unit 132 extracts a portion corresponding to the head of a person standing in front of the ATM 40 from the changed region extracted by the changed region extracting unit 131.

ステップＳ２４０にて、黒画素抽出手段１３３が抽出した黒画素の分布に基づき、特徴量算出手段１３５が算出した各特徴量からサングラスらしさを求め、判定手段１３６が、それに基づき、ＡＴＭ４０の前に立つ人物がサングラスを着用しているか否かを判定する。この処理の詳細は後述する。 In step S240, based on the black pixel distribution extracted by the black pixel extraction unit 133, the likelihood of sunglasses is obtained from each feature amount calculated by the feature amount calculation unit 135, and the determination unit 136 stands in front of the ATM 40 based thereon. It is determined whether the person is wearing sunglasses. Details of this processing will be described later.

ステップＳ２６０にて、フラグをＯＮにセットする。これにより、一旦サングラスの着用の検出処理を行った後は、ステップＳ１８０でＹの分岐をたどるため、ＡＴＭ４０の前に立つ人物が立ち去るまで処理負荷の高いサングラス検出処理を繰り返さないで済む。 In step S260, the flag is set to ON. Thus, once the detection process of wearing sunglasses is performed, the branch of Y is followed in step S180, so that it is not necessary to repeat the sunglasses detection process with a high processing load until the person standing in front of the ATM 40 leaves.

ステップＳ２８０では、ステップＳ２４０での処理結果を参照し、サングラスが検出されなかった場合、つまりＡＴＭ４０の前に立つ人物がサングラスを着用してないと判断された場合は、Ｎの分岐をたどり、「いらっしゃいませ」などの通常のアナウンスを流し（ステップＳ３００）、ステップＳ１００に戻る。 In step S280, referring to the processing result in step S240, if sunglasses are not detected, that is, if it is determined that the person standing in front of ATM 40 is not wearing sunglasses, the N branch is followed. A normal announcement such as “Welcome” is played (step S300), and the process returns to step S100.

If the processing result of step S240 shows that sunglasses were detected (Y branch in step S280), a warning announcement such as “Please remove your sunglasses before use” is played to the person standing in front of the ATM 40. When the reporting unit 150 has a communication I/F function and is wired to the ATM 40, the sunglasses wearing detection device 10 may output a signal that suspends use of the ATM 40. Furthermore, the input image at the time sunglasses were detected may be stored as the suspicious person image 144 in the storage unit 140 and transmitted to the outside via the reporting unit 150, either in response to an external transmission request or automatically.
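The flag handling around steps S240 to S300 can be sketched roughly as below. The function name `handle_frame`, the string return values, and the behaviour when the flag is already ON (returning no action) are assumptions for illustration; the text only states that the detection is not repeated while the same person remains in front of the ATM.

```python
def handle_frame(flag_on, detect_sunglasses):
    """Run the sunglasses check at most once per person.

    flag_on: True if the current person has already been checked.
    detect_sunglasses: a callable standing in for step S240; it returns
        True when sunglasses are detected.
    Returns (new_flag, action), where action is "warning", "greeting",
    or None when the check is skipped.
    """
    if flag_on:
        # Corresponds to the Y branch of step S180: skip the costly check.
        return True, None
    wearing = detect_sunglasses()                    # step S240
    flag_on = True                                   # step S260
    action = "warning" if wearing else "greeting"    # step S280 / S300
    return flag_on, action


# Usage: the detector is invoked only for the first frame of a person.
calls = []

def fake_detector():
    calls.append(1)
    return True

flag, first_action = handle_frame(False, fake_detector)
flag, second_action = handle_frame(flag, fake_detector)
```

Resetting the flag when the person leaves is outside this sketch.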

Next, the details of the sunglasses detection process in step S240 are described with reference to FIG. 8.

In step S500, the black pixel extraction unit 133 extracts, from the head region image 200 extracted by the head extraction unit 132, black pixels that are likely to depict sunglasses.

In step S510, the feature amount calculation unit 135 creates a vertical histogram from the distribution of the extracted black pixels.

In step S520, the feature amount calculation unit 135 identifies the Y coordinate of the row with the maximum frequency in the vertical histogram created in step S510 and sets it as HmaxY.
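Steps S510 and S520 can be illustrated with a short sketch. It assumes the black pixels are given as a row-major mask of 0/1 values and that the row index serves as the Y coordinate; the function names are hypothetical.

```python
def vertical_histogram(black_mask):
    """Count the black pixels in each row (projection onto the vertical axis)."""
    return [sum(1 for px in row if px) for row in black_mask]


def find_hmax_y(hist):
    """Return the row index with the highest count (the first one on ties)."""
    return max(range(len(hist)), key=lambda y: hist[y])


# Usage: the second row (y = 1) holds the most black pixels.
mask = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 1, 0, 0],
]
hist = vertical_histogram(mask)
hmax_y = find_hmax_y(hist)
```

How ties between rows are resolved is not stated in the text; taking the first such row is a choice made for this example.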

In step S530, the feature amount calculation unit 135 compares the maximum frequency of the vertical histogram with the threshold th1. If the maximum is smaller than th1, the unit reports this to the determination unit 136, the person standing in front of the ATM 40 is judged not to be wearing sunglasses (N branch), and the process of step S240 ends (step S610). If the maximum frequency is equal to or greater than th1 (Y branch), the process proceeds to step S540.
Here, instead of thresholding the maximum frequency of the vertical histogram, SRatio, which is the ratio of the number of black pixels to the horizontal width of the head region image 200, may be thresholded.

In step S540, the feature amount calculation unit 135 compares HmaxY identified in step S520 with the coordinate threshold th2. If HmaxY is equal to or greater than th2, the unit reports this to the determination unit 136, the person standing in front of the ATM 40 is judged not to be wearing sunglasses (N branch), and the process of step S240 ends (step S610). If HmaxY is smaller than th2 (Y branch), the process proceeds to step S550.
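The two early rejection tests of steps S530 and S540 might be written as in the sketch below. The function name `passes_prechecks` is hypothetical, the histogram is assumed to be non-empty, and the values used for `th1` and `th2` are arbitrary; the text does not give concrete threshold values in this passage.

```python
def passes_prechecks(hist, th1, th2):
    """Return True when the vertical histogram survives steps S530 and S540.

    hist: black-pixel counts per row, indexed by Y coordinate (non-empty).
    th1:  minimum acceptable peak count (step S530).
    th2:  Y-coordinate limit; a peak at or beyond it is rejected (step S540).
    """
    peak = max(hist)
    hmax_y = hist.index(peak)   # first row holding the maximum count
    if peak < th1:              # S530, N branch: too few black pixels
        return False
    if hmax_y >= th2:           # S540, N branch: peak row is out of range
        return False
    return True


# Usage with arbitrary example thresholds.
accepted = passes_prechecks([0, 5, 1], th1=3, th2=2)
too_faint = passes_prechecks([0, 2, 1], th1=3, th2=2)
wrong_row = passes_prechecks([0, 1, 5], th1=3, th2=2)
```

Only candidates that pass both tests reach the more expensive feature calculations from step S550 onward.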

In step S550, the feature amount calculation unit 135 calculates the degree of symmetry from the distribution of the black pixels extracted by the black pixel extraction unit 133. The calculation method is as given in the description of the feature amount calculation unit 135.

In step S560, the feature amount calculation unit 135 calculates the degree of edge coincidence from the edge intensity image, which the head extraction unit 132 created and which is temporarily stored in the storage unit 140, and the distribution of the black pixels extracted by the black pixel extraction unit 133. The calculation method is as given in the description of the feature amount calculation unit 135.

In step S570, the feature amount calculation unit 135 calculates the degree of shape from the distribution of the black pixels extracted by the black pixel extraction unit 133. The calculation method is as given in the description of the feature amount calculation unit 135.

In step S580, the feature amount calculation unit 135 calculates the sunglasses likelihood from the degree of symmetry, the degree of edge coincidence, and the degree of shape using Equation (12).

In step S590, the determination unit 136 compares the sunglasses likelihood calculated by the feature amount calculation unit 135 with the threshold th4. If the likelihood is th4 or less (N branch), the person standing in front of the ATM 40 is judged not to be wearing sunglasses (step S610), and the process of step S240 ends. If the likelihood is greater than th4 (Y branch), the person is judged to be wearing sunglasses (step S620), and the process of step S240 ends.
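Steps S580 and S590 can be sketched as follows. The product form is taken from the statement elsewhere in this description that the sunglasses likelihood is obtained by multiplying the three degrees; whether Equation (12) contains any further terms is not visible in this passage, so the plain product, the function names, and the example values are assumptions.

```python
def sunglasses_likelihood(symmetry, edge_coincidence, shape):
    """Combine the three feature amounts by multiplication (assumed form
    of Equation (12))."""
    return symmetry * edge_coincidence * shape


def is_wearing_sunglasses(symmetry, edge_coincidence, shape, th4):
    """Step S590: sunglasses are reported only when the likelihood is
    strictly greater than th4; a value equal to th4 counts as not wearing."""
    return sunglasses_likelihood(symmetry, edge_coincidence, shape) > th4


# Usage with arbitrary example values.
borderline = is_wearing_sunglasses(0.5, 0.5, 1.0, th4=0.25)
clear_case = is_wearing_sunglasses(1.0, 0.5, 1.0, th4=0.25)
```

With a product, a single low feature amount pulls the overall likelihood down, so all three cues have to agree before sunglasses are reported.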

In the present embodiment, the reference position that separates the right and left halves of the face is simply taken as the midpoint of the width of the head region image 200, but other methods may be used.
For example, the major axis of the ellipse template 200 used for head extraction may be superimposed on the head region image 200 and used to determine the reference position.
The bridge of the nose can also be used as the reference. In that case, a template of edges representing the standard shape of a human nose is prepared and stored in the storage unit 140, its size is varied, and the position that best matches the edge intensity image extracted by the head extraction unit 132 is taken as the nose position, from which the bridge of the nose is obtained.
Furthermore, the mouth and nose may be extracted by a known method, and the straight line connecting the mouth point and the nose tip point may be used as the midline to determine the reference position. The midline may instead be the perpendicular bisector of the straight line connecting the left and right mouth corner points.

Alternatively, the feature amount calculation unit 135 may, taking all rows of the sunglasses search area 202 shown in FIG. 3(a), build a histogram of the number of black pixels contained in each X-coordinate column (referred to in this specification as a “horizontal histogram” and illustrated by reference numeral 212 in FIG. 3(b)), and set the reference position to the X coordinate at which the frequency reaches a local minimum near the approximate center of the horizontal histogram.
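The horizontal-histogram variant can be sketched as below. Searching a fixed window around the middle column and taking the lowest count in it is an assumption standing in for "a local minimum near the approximate center"; the function names and the `window` parameter are hypothetical, and the mask is assumed to be non-empty and rectangular.

```python
def horizontal_histogram(black_mask):
    """Count the black pixels in each column of the search area."""
    width = len(black_mask[0])
    return [sum(1 for row in black_mask if row[x]) for x in range(width)]


def central_minimum_x(hist, window=2):
    """Return the X coordinate with the lowest count within `window` columns
    of the middle column, preferring the column closest to the middle."""
    center = len(hist) // 2
    lo = max(0, center - window)
    hi = min(len(hist), center + window + 1)
    return min(range(lo, hi), key=lambda x: (hist[x], abs(x - center)))


# Usage: two dark lens blobs separated by a sparse bridge column at x = 2.
mask = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 1, 0],
]
hist = horizontal_histogram(mask)
reference_x = central_minimum_x(hist)
```

The resulting X coordinate would then play the role of the reference position that separates the left and right halves of the face.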

In the present embodiment, the sunglasses likelihood is obtained by multiplying the degree of symmetry, the degree of edge coincidence, and the degree of shape, but not all of them need to be used. For example, focusing on the fact that sunglasses are bilaterally symmetric, the detection result may be output using only the degree of symmetry.
Furthermore, the evaluation need not be limited to the row whose Y coordinate is HmaxY: the horizontal histogram described above may be referenced, and symmetry may be judged to be high when its distribution curve is bimodal, as indicated by reference numeral 212 in FIG. 3(b).
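One simple way to test a horizontal histogram for the bimodal shape mentioned above is sketched here. The criterion used (a peak on each side of the middle column that is at least `min_peak_ratio` times the count at the middle) is an assumption chosen for illustration, as are the function name and the default ratio; the text does not specify how bimodality is evaluated.

```python
def looks_bimodal(hist, min_peak_ratio=2.0):
    """Return True when both halves of the histogram contain a peak that
    clearly exceeds the count at the middle column.

    hist: non-empty list of black-pixel counts per column.
    min_peak_ratio: hypothetical factor by which each peak must exceed
        the central valley (a valley of 0 is treated as 1).
    """
    center = len(hist) // 2
    left_peak = max(hist[:center], default=0)
    right_peak = max(hist[center + 1:], default=0)
    valley = max(hist[center], 1)
    return min(left_peak, right_peak) >= min_peak_ratio * valley


# Usage: two lens-like peaks versus a single central peak.
two_peaks = looks_bimodal([2, 3, 1, 3, 2])
one_peak = looks_bimodal([1, 1, 3, 1, 1])
```

Requiring both peaks to pass the test rejects histograms in which only one side of the face is dark.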

In the present embodiment, to extract the head, the reference image 142 is prepared in the storage unit 140 and the change region extraction unit 131 performs difference processing against the input image obtained by the imaging device 20, but the invention is not limited to this. The same effect can be obtained by preparing a large amount of data representing human heads in advance, training a classifier (detector) on it, and applying the classifier to the input image to extract the head directly without difference processing.

By installing the present invention at an ATM, a suspicious person attempting to withdraw cash while wearing sunglasses can be detected before operating the machine, which helps with the early apprehension of bank transfer fraud groups and with crime prevention.

DESCRIPTION OF SYMBOLS
1 Suspicious person detection apparatus
10 Sunglasses wearing detection device
132 Head extraction unit
133 Black pixel extraction unit
135 Feature amount calculation unit

Claims

A sunglasses wearing detection device comprising:
an image input unit for acquiring an input image; and
an image processing unit for detecting a person wearing sunglasses from the input image,
wherein the image processing unit comprises:
head extraction means for extracting a head region corresponding to the head of a person from the input image;
black pixel extraction means for extracting black pixels that are substantially black in the head region;
feature amount calculating means for calculating a feature amount of the head region based on the distribution of the black pixels; and
determination means for determining, using the feature amount, that the person is wearing sunglasses if the black pixels are distributed substantially symmetrically, and that the person is not wearing sunglasses if the black pixels are not distributed substantially symmetrically.

The sunglasses wearing detection device according to claim 1, wherein the feature amount calculating means calculates a vertical histogram of the frequencies obtained by projecting the black pixels onto the vertical axis, and calculates the feature amount expressing whether or not the black pixels are substantially left-right symmetric at the vertical position showing the maximum frequency of the vertical histogram.

The feature amount calculation unit calculates a median line from the head region, and calculates a feature amount indicating whether or not the black pixel is substantially symmetric with respect to the median line. Sunglasses wear detection device described in one.

The sunglasses wearing detection device according to claim 3, wherein the feature amount calculating means calculates the number of vertical pixels between the vertical position and the lowermost boundary pixel among the boundary pixels that lie below the vertical position, are separated from the median line by a predetermined number of pixels, and form a boundary between the black pixels and pixels of other colors, and
the determination means determines that sunglasses are not worn if the number of vertical pixels is equal to or less than a predetermined threshold value.