JP2006350642A - Image processing device and program

Info

Publication number: JP2006350642A
Application number: JP2005175302A
Authority: JP
Inventors: Keisuke Shimada; 敬輔島田
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2005-06-15
Filing date: 2005-06-15
Publication date: 2006-12-28

Abstract

PROBLEM TO BE SOLVED: To provide a technique that determines the top-bottom direction of an image by focusing on information in the image other than characters and the brightness of the four corners.
SOLUTION: People are mostly on the ground, a floor or the like, where they normally occupy different positions along the horizontal direction. In an image containing a plurality of persons, the persons therefore tend to be arranged side by side, so that the face images (skin regions) are spread horizontally rather than vertically. Hence, the skin regions in the image are extracted, their variance is computed in two orthogonal directions, and the top-bottom direction is determined from the result.
COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates to a technique for determining the top-bottom (vertical) direction of an image obtained by photographing a subject.

In a photographing apparatus such as a digital camera, the shape of the image obtained by photographing (hereinafter the "frame shape") is usually fixed in advance. The user therefore normally chooses how to hold the apparatus according to the subject. A hold in which the short side of the frame is parallel or nearly parallel to the top-bottom direction is called the "horizontal hold", and a hold in which the short side is parallel or nearly parallel to the horizontal direction is called the "vertical hold". The user normally chooses one of the two.

A photographing apparatus normally stores an image as captured. Consequently, when a stored image is simply displayed on a personal computer (hereinafter "PC") or the like, an image captured in the vertical hold is displayed with its horizontal direction running vertically. That is, the subject appears rotated relative to what the photographer (user) actually saw.

A subject image displayed in a rotated state is usually hard to view, so it is desirable to rotate it back to its original state. Having the user instruct the rotation is undesirable, however, because the instruction is laborious and it takes longer to display the image in an appropriate state. For this reason, image processing that determines the top-bottom direction of an image has conventionally been performed. Conventional image processing apparatuses that perform such processing include those described in Patent Documents 1 to 3.

The conventional image processing apparatus described in Patent Document 1 determines the top-bottom direction by recognizing characters in the image. The conventional apparatuses described in Patent Documents 2 and 3 rely on the fact that the upper part of an image taken in an ordinary environment tends to be bright, and determine the upper side (top-bottom direction) by comparing the brightness of the four corners of the image. Photographing apparatuses that make the determination possible include those equipped with a sensor for detecting top and bottom (a gravity sensor) and those provided with separate shutter buttons for the vertical hold and the horizontal hold (Patent Document 4).

A method based on character recognition cannot determine the top-bottom direction unless characters appear in the image, and landscape photographs and portraits usually contain none. In a method that relies on the brightness of the four corners of the image (photograph), the brightness of whatever subjects lie at those corners strongly affects the result, which places constraints (conditions) on accurate determination. Because of those constraints, the top-bottom direction cannot always be determined accurately. To determine it accurately, it is therefore considered necessary to focus on something other than characters and the brightness of the four corners.

With the photographing apparatus described in Patent Document 4, the top-bottom direction can always be determined accurately by adding to the image either the detection result of the sensor or information indicating which shutter button was used. To do so, however, any device that handles the image must support the special information that the photographing apparatus stores together with it. This means that when a stored image is displayed by another device, that device must be provided with a mechanism for handling the special information. It is therefore considered desirable to determine the top-bottom direction from information in the image itself.

Patent Document 1: JP-A-8-336038
Patent Document 2: JP-A-2000-241846
Patent Document 3: JP-A-2000-184271
Patent Document 4: JP-A-2004-343283

An object of the present invention is to provide a technique that determines the top-bottom direction by focusing on information in the image other than characters and the brightness of the four corners.

The image processing apparatus of the present invention performs image processing that determines the top-bottom direction of an image obtained by photographing a subject. It comprises image acquisition means for acquiring an image, skin region recognition means for recognizing the skin regions of person images photographed as subjects in the acquired image, and direction determination means for determining the top-bottom direction based on the skin regions recognized by the skin region recognition means.

It is desirable that the direction determination means calculate, for each direction parallel to an edge of the image, a variance value over the skin regions recognized by the skin region recognition means, and determine from the results that one of those directions is the top-bottom direction. It is also desirable that the variance calculation take into account the size of each recognized skin region.

The program of the present invention provides functions for realizing the means included in the above image processing apparatus.

The present invention recognizes the skin regions of person images photographed as subjects in an image and determines the top-bottom direction of the image from the recognition result. A person usually has at least the face exposed. People are in most cases on the ground, a floor, or another place where many people can be present at once, and in such places they normally occupy different positions along the horizontal direction. Consequently, in an image of several persons, the distribution of the skin regions that make up the person images often shows a tendency that depends on the top-bottom direction. By focusing on the skin regions in the image, the top-bottom direction of the image can therefore be determined with high accuracy.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
FIG. 1 is a diagram showing the configuration of the image processing apparatus according to the present embodiment.
This image processing apparatus is realized by loading the image processing program according to the present embodiment onto a PC. As shown in FIG. 1, it comprises a CPU 101 that controls the entire apparatus; a memory 102 that includes, for example, RAM used by the CPU 101 as work memory and ROM storing the BIOS; an auxiliary storage device 103, for example a hard disk device; a communication interface (I/F) 104 for communicating via a communication network (not shown); an input interface (I/F) 105 for connecting input devices such as a keyboard and a pointing device such as a mouse; an output interface (I/F) 106 for connecting a display device; a medium driving device 107 that can access a portable recording medium MD; and a bus 108 that interconnects the units 101 to 107.

The image processing program is stored in, for example, the auxiliary storage device 103, the recording medium MD mounted in the medium driving device 107, or an external device reachable via the communication I/F 104. The image processing apparatus according to the present embodiment is realized when the program is read into the memory 102 and started. An image obtained by a photographing apparatus such as a digital camera is acquired through the medium driving device 107 in which the recording medium MD is mounted, or through the communication I/F 104.

FIG. 3 is a diagram explaining the distribution of face regions when an image contains a plurality of person images. FIG. 4 is a diagram explaining examples of the distribution of face regions in images taken with a plurality of people as subjects; FIGS. 4(a) to 4(c) show three examples, Examples 1 to 3. In FIGS. 3 and 4, each face region is drawn as a circle and the range over which the face regions exist is drawn with a broken line. The method of determining the top-bottom direction is described concretely below with reference to FIGS. 3 and 4.

A person normally secures a place to be and stays there. People are in most cases on the ground, a floor or the like (hereinafter collectively called the "ground surface" for convenience), and on such a surface they normally occupy different positions along the horizontal direction. Consequently, as shown in FIGS. 3 and 4, in an image (photograph) of a plurality of persons, the persons appear lined up side by side, and the face regions tend to be spread more widely in the horizontal direction than in the vertical (top-bottom) direction. This tendency becomes stronger as the number of people increases (FIG. 4). The determination of the top-bottom direction exploits this tendency.

Specifically, the skin-colored regions in the image are extracted and their variance is examined in two orthogonal directions. When the variance in one direction is relatively high and the variance in the other is relatively low, the direction with the lower variance is determined to be the top-bottom direction (FIG. 3). Because a photographing apparatus is normally held in either the vertical hold or the horizontal hold, the two orthogonal directions adopted are the direction parallel to the short side (hereinafter the "short-side direction") and the direction parallel to the long side (hereinafter the "long-side direction") (FIG. 3).
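As an illustration only, the rule of this paragraph (the lower-variance direction is taken as the top-bottom direction) might be sketched as follows. The function name `top_bottom_direction`, the return values `"long"` and `"short"`, and the parameter `ratio_threshold` with its default of 2.0 are hypothetical; the paragraph itself only speaks of "relatively high" and "relatively low" variance.

```python
from typing import Optional


def top_bottom_direction(var_long: float, var_short: float,
                         ratio_threshold: float = 2.0) -> Optional[str]:
    """Pick the side direction with the clearly lower variance.

    var_long / var_short are the skin-region variances along the long-side
    and short-side directions. Returns "long" or "short" for the direction
    judged to be top-bottom, or None when neither variance dominates.
    ratio_threshold is an illustrative value, not taken from the source.
    """
    eps = 1e-12  # keeps the comparison meaningful when one variance is zero
    if var_long > ratio_threshold * max(var_short, eps):
        return "short"   # regions spread along the long side
    if var_short > ratio_threshold * max(var_long, eps):
        return "long"    # regions spread along the short side
    return None          # undetermined


if __name__ == "__main__":
    print(top_bottom_direction(100.0, 4.0))
    print(top_bottom_direction(10.0, 8.0))
```

Comparing by multiplication rather than by an explicit ratio avoids a division by zero when all regions share the same coordinate in one direction.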

Because people often look up or look down, a determination that recognizes the shape of the face and takes its orientation into account is fairly likely to lose accuracy. In the present embodiment, skin-colored regions are therefore extracted and the determination is based on their size and position. By not considering the shape or orientation of the face, the top-bottom direction can be determined with high accuracy regardless of how each person's face is oriented. The determination can also be made with less computation.

FIG. 2 is a diagram showing the functional configuration of the image processing apparatus according to the present embodiment. As shown in FIG. 2, the apparatus comprises an image input unit 201, a face recognition unit 202, a face position variance calculation unit 203, and a vertical/horizontal discrimination unit 204.

The image input unit 201 inputs an image taken by a photographing apparatus or the like. The input image is data giving a pixel value for each pixel. In the configuration shown in FIG. 1, the unit is realized by, for example, the CPU 101, the memory 102, the auxiliary storage device 103, the communication I/F 104, and the medium driving device 107.

The face recognition unit 202 extracts, for example, the skin-colored regions from the image input by the image input unit 201 and generates, for each extracted region, information indicating its position (here, its centroid) and its size. The face position variance calculation unit (hereinafter the "variance calculation unit") 203 receives this information for each skin-colored region from the face recognition unit 202 and calculates the variance of the skin-colored regions in the short-side direction and in the long-side direction. Let G be the centroid (mean position) of the skin-colored regions over the whole screen (image) and σ² their variance. Using the size (number of pixels) f_i of each skin-colored region i and its centroid {x_i, y_i}, these are expressed as follows.

$$G = \{G_x,\ G_y\} = \frac{\{\sum_i f_i x_i,\ \sum_i f_i y_i\}}{\sum_i f_i} \qquad (1)$$

$$\sigma^2 = \{\sigma^2_x,\ \sigma^2_y\} = \frac{\{\sum_i f_i (x_i - G_x)^2,\ \sum_i f_i (y_i - G_y)^2\}}{\sum_i f_i} \qquad (2)$$

Here G_x and σ²_x are the centroid position and the variance in the long-side direction, and G_y and σ²_y are the centroid position and the variance in the short-side direction.

Extraction of skin-colored regions may pick up not only face regions but also regions showing other parts of the body, such as hands and feet. Of the parts of the body, the face is the most likely to be visible (exposed), followed by the hands. Comparing the two, the face region usually has the larger area. In the present embodiment, the variance of the skin-colored regions is therefore calculated with each region weighted by its size. This reduces the influence of small skin-colored regions such as hand regions on the calculated variance, so that the variance gives more weight to the face regions.
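The size-weighted centroid and variance of equations (1) and (2) can be sketched as below. This is a minimal illustration, not the embodiment's code: the function name `weighted_stats` and the tuple layout `(f_i, x_i, y_i)` are assumptions made for the example.

```python
from typing import List, Tuple

# One skin-colored region: (size f_i in pixels, centroid x_i, centroid y_i).
Region = Tuple[int, float, float]


def weighted_stats(regions: List[Region]):
    """Return ((G_x, G_y), (var_x, var_y)) weighted by region size f_i."""
    total = sum(f for f, _, _ in regions)            # sum of f_i
    if total == 0:
        raise ValueError("no skin-colored pixels")
    gx = sum(f * x for f, x, _ in regions) / total   # equation (1), x part
    gy = sum(f * y for f, _, y in regions) / total   # equation (1), y part
    var_x = sum(f * (x - gx) ** 2 for f, x, _ in regions) / total  # eq. (2)
    var_y = sum(f * (y - gy) ** 2 for f, _, y in regions) / total  # eq. (2)
    return (gx, gy), (var_x, var_y)


if __name__ == "__main__":
    # A large region and a small one: the small one shifts the result little.
    print(weighted_stats([(900, 100.0, 50.0), (30, 400.0, 300.0)]))
```

Because each term is multiplied by f_i, a small region such as a hand contributes to the variance only in proportion to its pixel count.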

The vertical/horizontal discrimination unit 204 receives the variances σ²_x and σ²_y calculated for each direction by the variance calculation unit 203 and discriminates the top-bottom direction. For example, if σ²_y / σ²_x is larger than a predetermined value TH (> 1), the short-side direction is taken as the top-bottom (vertical) direction; if σ²_x / σ²_y is larger than TH, the long-side direction is taken as the top-bottom (vertical) direction; and if neither relation holds, the direction is treated as indeterminable.

The image input by the image input unit 201 is automatically rotated according to the result from the vertical/horizontal discrimination unit 204 and then displayed. The image can thus be displayed in an appropriate state without the user instructing a rotation, so the user can view a properly oriented image more quickly.

In the configuration shown in FIG. 1, the face recognition unit 202, the variance calculation unit 203, and the vertical/horizontal discrimination unit 204 are each realized by, for example, the CPU 101, the memory 102, and the auxiliary storage device 103. The image is rotated by the CPU 101 using the memory 102, and the rotated image is displayed by outputting it to the display device via the output I/F 106.

The units 201 to 204 shown in FIG. 2 are realized when the CPU 101 executes the top-bottom direction determination process shown in FIG. 5. That process is described in detail below with reference to FIG. 5. The process itself is realized when the CPU 101 reads, for example, the image processing program from the auxiliary storage device 103 into the memory 102 and executes it.

First, in step 501, an image is input from the location designated by the user via the communication I/F 104 or the medium driving device 107. In step 502, one pixel of the input image is taken as the pixel of interest, and it is determined whether its pixel value falls within the range regarded as skin color. If it does, the determination is YES; step 503 sets 1 as a separate (binarized) value for that pixel, and the process proceeds to step 505. Otherwise the determination is NO; step 504 sets 0 as the separate value for that pixel, and the process proceeds to step 505.

In step 505, it is determined whether all pixels have been examined. If any pixel has not yet been checked against the skin-color range, the determination is NO and the process returns to step 502, where the pixel of interest is changed to one of the remaining pixels and its value is checked. Otherwise the determination is YES and the process proceeds to step 506.

By repeating the processing loop formed by steps 502 to 505 until the determination in step 505 is YES, a binarized image is generated from the image input in step 501 according to whether each pixel falls within the range regarded as skin color. In the binarized image, a pixel regarded as skin color has the value 1 and any other pixel has the value 0. From step 506 onward, processing is performed on this binarized image.
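The loop of steps 502 to 505 amounts to a per-pixel threshold test. The sketch below assumes the image is a nested list of `(r, g, b)` tuples with 8-bit channels. The source does not specify the skin-color range, so the rule inside `is_skin` is a hypothetical RGB heuristic chosen only for illustration, and the function names are likewise assumptions.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (r, g, b), each 0..255


def is_skin(pixel: Pixel) -> bool:
    """Hypothetical skin-color test; the actual range is not given in the source."""
    r, g, b = pixel
    return (r > 95 and g > 40 and b > 20
            and r > g and r > b
            and r - min(g, b) > 15)


def binarize(image: List[List[Pixel]]) -> List[List[int]]:
    """Steps 502-505: 1 for pixels regarded as skin color, 0 otherwise."""
    return [[1 if is_skin(p) else 0 for p in row] for row in image]


if __name__ == "__main__":
    sample = [[(200, 150, 120), (0, 0, 255)],
              [(255, 255, 255), (180, 120, 90)]]
    print(binarize(sample))
```

In practice the range would be tuned to the camera and lighting, and a different color space could be substituted without changing the rest of the process.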

The binarized image does not yet show how the pixels with value 1 are organized into skin-colored regions. Step 506 therefore performs grouping and labeling to extract the individual skin-colored regions. As a result, pixels regarded as belonging to the same skin-colored region are assigned the same label and collected into the same group.

In step 507, which follows step 506, the size (number of pixels) f_i and the centroid {x_i, y_i} are calculated for each group (extracted skin-colored region), and then the mean position G of the skin-colored regions is calculated by equation (1). In the next step, 508, the variances σ²_x and σ²_y are calculated by equation (2) for the long-side direction and the short-side direction respectively. The process then proceeds to step 509.
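The per-group size and centroid of step 507 can be accumulated in a single scan of the labeled image. In this sketch, `labels` is a nested list in which 0 means "no region" and positive integers are region labels; the function name `region_stats`, the dictionary layout, and the convention that x is the column index and y the row index are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def region_stats(labels: List[List[int]]) -> Dict[int, Tuple[int, float, float]]:
    """Return {label: (f_i, x_i, y_i)}: pixel count and centroid per region."""
    acc = defaultdict(lambda: [0, 0, 0])   # label -> [count, sum of x, sum of y]
    for y, row in enumerate(labels):
        for x, lab in enumerate(row):
            if lab == 0:
                continue                   # background pixel
            entry = acc[lab]
            entry[0] += 1
            entry[1] += x
            entry[2] += y
    return {lab: (n, sx / n, sy / n) for lab, (n, sx, sy) in acc.items()}


if __name__ == "__main__":
    print(region_stats([[1, 1, 0, 2],
                        [1, 1, 0, 2]]))
```

The resulting `(f_i, x_i, y_i)` triples are exactly the quantities that equations (1) and (2) take as input.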

In step 509, it is determined whether σ²_y / σ²_x is larger than the predetermined value TH. If it is, the determination is YES and the process proceeds to step 510, where the short-side direction (labeled "vertical direction" in the figure) is set as the top-bottom direction, and the series of processes ends. Otherwise the determination is NO and the process proceeds to step 511.

In step 511, it is determined whether σ²_x / σ²_y is larger than the predetermined value TH. If it is, the determination is YES and the process proceeds to step 512, where the long-side direction (labeled "horizontal direction" in the figure) is set as the top-bottom direction, and the series of processes ends. Otherwise the determination is NO and the process proceeds to step 513, where the top-bottom direction is set as unknown, and the series of processes ends.

The top-bottom direction determination process described above is executed for each image. When a plurality of images are displayed at once, the top-bottom direction is thus determined for each image and the result is reflected in how that image is displayed. The image input unit 201 shown in FIG. 2 is realized by executing step 501; likewise, the face recognition unit 202 is realized by steps 502 to 506, the variance calculation unit 203 by steps 507 and 508, and the vertical/horizontal discrimination unit 204 by steps 509 to 513.

FIG. 6 is a flowchart of the grouping and labeling process executed as step 506. The labeling process is described in detail below with reference to FIG. 6. Starting from the pixel at the upper left corner of the image and visiting all pixels in raster-scan order, the process assigns the same label (number) to connected (adjacent) pixels whose pixel value is 1.

First, step 601 performs initialization: 1 is assigned to a variable N that manages the next label to be assigned, and the labels of all pixels are reset, that is, set to 0. In step 602, it is determined whether the pixel value of the pixel of interest A is 0. If it is 0, the determination is YES and the process proceeds to step 608; otherwise the determination is NO and the process proceeds to step 603.

The method of labeling pixels is now described concretely with reference to FIG. 7, in which each cell represents a pixel. The pixel of interest is denoted A, and of its eight neighbors, the four to the left, upper left, above, and upper right are denoted B to E respectively. Depending on the position of the pixel of interest A, some of these four pixels may not exist.

For the labels of the four pixels B to E neighboring the pixel of interest A, three cases are possible: none of them has a label (case 1); one or more of them have the same label (case 2); or several of them have different labels (case 3). The pixel of interest A is labeled by assigning a label not used so far in case 1, the same label as the neighboring pixels in case 2, and the smallest of the neighboring labels in case 3. Steps 603 to 607 carry out this labeling.

First, in step 603, it is determined whether any of the neighboring pixels B to E has a label. If none of them has one, the determination is NO; step 604 assigns the value of the variable N to the pixel of interest A as its label and then increments N, and the process proceeds to step 608. Otherwise the determination is YES and the process proceeds to step 605.

In step 605, it is determined whether the labels assigned to the neighboring pixels B to E are all the same. If several different labels are present, the determination is NO and the process proceeds to step 606, where the smallest of those labels is assigned to the pixel of interest A and any of the neighboring pixels B to E that does not carry the smallest label is newly assigned it. The process then proceeds to step 608. If instead the labels are all the same, the determination is YES; step 607 assigns that label to the pixel of interest A, and the process proceeds to step 608.

In step 608, it is determined whether the pixel of interest A is the last pixel of the image (the lower right corner). If it is, the determination is YES and the series of processes ends. Otherwise the determination is NO; step 609 changes the pixel of interest A to the next pixel in raster-scan order, and the process returns to step 602. In this way, labeling is performed on all pixels whose pixel value is 1.

The grouping and labeling process executes the processing described above. As a result, the pixels that make up the same skin-colored region are assigned the same label and grouped together. The size of each skin-colored region is then obtained by counting the number of pixels carrying each distinct label, and its centroid {x_i, y_i} is calculated from the coordinates of those pixels.
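A raster-scan labeling over the neighbors B to E (left, upper left, above, upper right) can be sketched as follows. Note the substitution: step 606 in the text relabels only the neighboring pixels, whereas this sketch records label equivalences in a small union-find table and resolves them in a second pass, a standard two-pass variant and not the procedure of FIG. 6. The function name `label_regions` and the data layout are assumptions.

```python
from typing import List


def label_regions(binary: List[List[int]]) -> List[List[int]]:
    """8-connected component labeling of a 0/1 image (two-pass variant)."""
    h = len(binary)
    w = len(binary[0]) if h else 0
    labels = [[0] * w for _ in range(h)]
    parent = [0]  # parent[k] links label k toward its representative; 0 unused

    def find(k: int) -> int:
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    # First pass: raster scan, looking at neighbors B, C, D, E of pixel A.
    for y in range(h):
        for x in range(w):
            if binary[y][x] == 0:
                continue
            roots = []
            for dx, dy in ((-1, 0), (-1, -1), (0, -1), (1, -1)):
                nx, ny = x + dx, y + dy
                if 0 <= nx < w and 0 <= ny < h and labels[ny][nx] != 0:
                    roots.append(find(labels[ny][nx]))
            if not roots:                      # case 1: new label
                parent.append(len(parent))
                labels[y][x] = len(parent) - 1
            else:                              # cases 2 and 3: smallest label
                smallest = min(roots)
                labels[y][x] = smallest
                for r in roots:
                    parent[r] = smallest       # record the equivalence

    # Second pass: replace every label by its representative.
    for y in range(h):
        for x in range(w):
            if labels[y][x] != 0:
                labels[y][x] = find(labels[y][x])
    return labels


if __name__ == "__main__":
    for row in label_regions([[1, 0, 1], [1, 0, 1], [1, 1, 1]]):
        print(row)
```

After merging, the surviving label numbers are not necessarily consecutive; only their equality within a region matters for counting pixels and computing centroids.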

In the present embodiment, the variance value σ² is calculated in order to evaluate the distribution of the skin color regions, but a different index may be employed. Specifically, the range (width) over which the skin color regions exist, their shape, and the like may be used. Alternatively, only skin color regions showing a specific part of the body (for example, the face) may be extracted and used to determine the top-and-bottom direction. When only such specific skin color regions are extracted, weighting by region size need not be performed. The image whose top-and-bottom direction is determined need not be an image obtained by photographing (which here includes digitization by a scanner or the like). For example, it may be an image generated by computer graphics.
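The variance-based evaluation can be sketched as follows. This is an assumption-laden illustration rather than the embodiment's exact formula: it takes the barycentric position of each skin color region, weights it by the region's pixel count, computes a weighted variance along each image axis, and treats the axis with the smaller variance as the top-and-bottom direction, on the premise that faces tend to line up horizontally. The names `weighted_variance` and `top_bottom_axis`, and the tie-breaking toward "y", are hypothetical choices.

```python
def weighted_variance(values, weights):
    """Variance of values around their weighted mean, using the given weights."""
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    return sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total


def top_bottom_axis(regions):
    """regions: list of (size, (cx, cy)). Return 'x' or 'y' as the vertical axis."""
    sizes = [size for size, _ in regions]
    var_x = weighted_variance([c[0] for _, c in regions], sizes)
    var_y = weighted_variance([c[1] for _, c in regions], sizes)
    # Regions spread more along the horizontal; the other axis is top-and-bottom.
    return "y" if var_x >= var_y else "x"


# Three face-like regions lined up along the x axis of the image.
regions = [(100, (10, 50)), (120, (60, 52)), (90, (110, 49))]
print(top_bottom_axis(regions))
```

The sketch only selects an axis; deciding which end of that axis is the top would need further information.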

A diagram showing the configuration of the image processing apparatus according to the present embodiment.
A diagram showing the functional configuration of the image processing apparatus according to the present embodiment.
A diagram explaining the distribution of face regions when a plurality of person images are present in an image.
A diagram explaining an example of the distribution of face regions in an image photographed with a plurality of people as subjects.
A flowchart of the top-and-bottom direction determination process.
A flowchart of the grouping and labeling process.
A diagram explaining the method of labeling pixels.

Explanation of symbols

101 CPU
102 Memory
103 Auxiliary storage device
104 Communication interface
105 Input interface
106 Output interface
107 Medium drive device
108 Bus

Claims

An image processing apparatus for determining the top-and-bottom direction in an image obtained by photographing a subject, comprising:
image acquisition means for acquiring the image;
skin area recognition means for recognizing a skin area of a person image photographed as a subject in the image acquired by the image acquisition means; and
direction determination means for determining the top-and-bottom direction based on the recognition result of the skin area by the skin area recognition means.

The image processing apparatus according to claim 1, wherein the direction determination means calculates a variance value of the skin areas recognized by the skin area recognition means for each direction parallel to an edge of the image, and determines, from the calculation results, one of those directions to be the top-and-bottom direction.

The image processing apparatus according to claim 2, wherein the direction determination means calculates the variance value taking into account the size of each skin area recognized by the skin area recognition means.

A program for causing an image processing apparatus that determines the top-and-bottom direction in an image obtained by photographing a subject to realize:
a function of acquiring the image;
a function of recognizing a skin area of a person image photographed as a subject in the image acquired by the acquiring function; and
a function of determining the top-and-bottom direction based on the recognition result of the skin area by the recognizing function.