WO2005055144A1 - Person face jaw detection method, jaw detection system, and jaw detection program - Google Patents

Person face jaw detection method, jaw detection system, and jaw detection program Download PDF

Info

Publication number
WO2005055144A1
WO2005055144A1 PCT/JP2004/018451
Authority
WO
WIPO (PCT)
Prior art keywords
chin
face
edge
detection
image
Prior art date
Application number
PCT/JP2004/018451
Other languages
French (fr)
Japanese (ja)
Inventor
Toshinori Nagahashi
Takashi Hyuga
Original Assignee
Seiko Epson Corporation
Priority date
Filing date
Publication date
Application filed by Seiko Epson Corporation filed Critical Seiko Epson Corporation
Publication of WO2005055144A1 publication Critical patent/WO2005055144A1/en

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/12Edge-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20004Adaptive image processing
    • G06T2207/20012Locally adaptive
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20112Image segmentation details
    • G06T2207/20132Image cropping
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face

Definitions

  • TECHNICAL FIELD A chin detection method, a chin detection system, and a chin detection program for a human face
  • the present invention relates to pattern recognition and object recognition technology, and more particularly to a chin detection method, a chin detection system, and a chin detection program for accurately detecting the chin position of a person's face from a face image in which the face is captured.
  • in one conventional technique, the presence or absence of a flesh-color area is determined, a mosaic size is automatically determined for the flesh-color area, and the area is mosaicked;
  • the presence or absence of a human face is then determined by calculating the distance between the mosaicked area and a human-face dictionary. By extracting human faces in this way, erroneous extraction due to the influence of the background and the like is reduced, and a human face in the image is found automatically and efficiently.
  • for a face photograph (face image) of a person, which is indispensable for a passport or an ID card, the size of the photograph and the orientation, size, and position of the person's face are specified in detail.
  • for example, the conditions include that there be no background, that no accessories such as hats be worn, that the face of the person in the picture be facing front and located at the center of the photo, and that the position of the chin of the face in the image lie within a certain range from the lower frame of the photo. In principle, photos (face images) that deviate from the standard are not accepted.
  • a face image of the required person can be obtained directly as digital image data by a digital still camera using an electronic imaging device such as a CCD or CMOS sensor, or from an analog photograph in which the face was photographed in advance.
  • such an analog photograph is converted to digital image data using an electro-optical image reading device such as a scanner, and this digital image data is then processed using an image processing system consisting of a general-purpose computer such as a PC and general-purpose software.
  • the processing operations can be performed directly by a human using general-purpose input/output devices such as a mouse, a keyboard, and a monitor.
  • however, when the number of images is huge, it is necessary to perform the processing automatically using conventional technology such as that described above.
  • the outline of the chin may be unclear; depending on facial features, a relatively strong edge may be detected between the lips and the bottom of the chin, and a strong edge is also detected at the border between the collar of the clothes and the neck. Furthermore, depending on age and body type, a stronger edge often appears at neck wrinkles than at the chin contour, and these may be erroneously detected as the chin contour.
  • the present invention has been devised in order to effectively solve such a problem.
  • the purpose of the present invention is to provide a new chin detection method, a chin detection system, and a chin detection program capable of accurately, quickly, and robustly detecting the lower bottom of the chin even from a face image whose chin outline is difficult to detect as described above.

Disclosure of the Invention
  • the chin detection method for a human face of Invention 1 is
  • a method for detecting the lower bottom of the chin of a person's face from an image including the face, comprising: detecting a face image in a range that includes both eyes and lips of the face and does not include the chin, and setting a chin detection window at the bottom of that face image.
  • the intensity distribution of the edges in the chin detection window is determined, and pixels having an edge intensity equal to or greater than a threshold are detected from that edge intensity distribution.
  • an approximation curve is then obtained so as to best fit the distribution of the detected pixels, and the lowest bottom of the approximation curve is set as the lower bottom of the chin of the person's face.
  • that is, a portion having a very high possibility of including the chin of the human face is selected, a chin detection window is set in that portion, and the intensity distribution of the edge in the chin detection window is determined.
  • the contour including the lower bottom of the chin generally changes sharply in contrast relative to the surrounding area, so its edge strength is high. Therefore, by obtaining the intensity distribution of the edge in the chin detection window, candidate regions for the contour including the lower bottom of the chin of the face contained in the window can be selected easily and reliably.
  • next, pixels having an edge intensity equal to or higher than a threshold value are detected from this distribution. Since the contour including the lower bottom of the chin generally has high edge strength, selecting pixels whose edge strength is at or above a certain threshold and excluding the others leaves only pixels likely to correspond to that contour.
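As a hedged illustration of this thresholding step, the sketch below selects the pixels of a small edge-strength map that meet a threshold; the window contents and the threshold value are synthetic, not taken from the specification.

```python
import numpy as np

# Synthetic edge-strength map for a tiny chin detection window.
# The middle row plays the role of a strong chin-contour edge.
edge_strength = np.array([
    [0.1, 0.2, 0.1],
    [0.9, 0.8, 0.7],   # strong edge row (e.g. the chin contour)
    [0.3, 0.2, 0.4],
])
threshold = 0.5  # illustrative value, not the patent's

# Keep only pixels whose edge strength is at or above the threshold;
# weaker pixels (background, faint shadows) are excluded.
candidates = edge_strength >= threshold
rows, cols = np.nonzero(candidates)
```

Only the middle row survives, which is the behavior described above: pixels unlikely to belong to the chin contour are dropped before any curve fitting.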
  • a method for detecting the lower bottom of the chin of a person's face from an image including the face, comprising detecting a face image in a range that includes both eyes and lips of the face and does not include the chin.
  • after setting a chin detection window large enough to include the chin of the person's face at the bottom of the face image, the intensity distribution of the first-derivative edge in the chin detection window is obtained, and a threshold is obtained from that edge intensity distribution.
  • pixels having an edge strength equal to or greater than the threshold are detected, and the pixels to be used are then narrowed down from among them using the sign inversion of the second-derivative edge.
  • an approximate curve best fitting the distribution of the narrowed-down pixels is obtained by the least-squares method, and the lowest bottom of the approximate curve is set as the lower bottom of the chin of the person's face.
  • relative to the method of Invention 1, this invention specifies more concretely the method of calculating the edge intensity distribution (first-derivative type), the method of selecting pixels (second-derivative type), and the method of calculating the approximate curve (least-squares method).
  • the chin detection window has a horizontally long rectangular shape, with a width wider than the face width of the human face and a height smaller than that width.
  • the lower bottom of the chin of the person's face to be detected can be reliably captured in the chin detection window, and thus the lower bottom of the chin can be detected more accurately.
  • the first-derivative edge intensity distribution is obtained using a Sobel edge detection operator.
  • the most typical method for detecting a sudden change in shading in an image is to compute a derivative of the shading. Since differentiation of a digital image is approximated by differences, the first-order differentiation of the original image in the chin detection window effectively detects the edge portions of the image where the shading changes rapidly.
  • the present invention uses, as this first-derivative edge detection operator (filter), the well-known Sobel edge detection operator, which has excellent detection performance, whereby edge portions in the chin detection window can be reliably detected.
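The specification names the Sobel operator for this first-derivative step. The sketch below applies the standard 3x3 Sobel masks to a synthetic window containing a sharp horizontal brightness step; the image data and the plain-loop implementation are illustrative assumptions, not the patent's code.

```python
import numpy as np

# Standard 3x3 Sobel masks for horizontal (x) and vertical (y) gradients.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)

def sobel_magnitude(img):
    """Gradient magnitude of a 2-D grayscale array (valid region only)."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for y in range(h - 2):
        for x in range(w - 2):
            patch = img[y:y + 3, x:x + 3]
            gx = np.sum(SOBEL_X * patch)   # horizontal derivative
            gy = np.sum(SOBEL_Y * patch)   # vertical derivative
            out[y, x] = np.hypot(gx, gy)
    return out

# Synthetic window: dark above, bright below, i.e. a sharp horizontal
# step of the kind a chin contour produces.
img = np.zeros((5, 5))
img[3:, :] = 1.0
mag = sobel_magnitude(img)
```

The top row of `mag` (a flat region) stays at zero, while rows straddling the step get a strong response, which is exactly what the thresholding step relies on.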
  • the second-derivative edge is obtained using a Laplacian edge detection operator.
  • the least-squares method using a quadratic function is used for the approximate curve.
  • since the present invention uses the least-squares method with a quadratic function, the contour of the chin of the human face in the chin detection window can be obtained at high speed.
  • the "least squares method" is, as generally understood, a method of finding the coefficients that minimize the sum of squared errors between a set of samples and the function to be fitted. For example, for data expected to behave as a quadratic, a quadratic function may be used; for data expected to behave exponentially, the fit can be computed by taking logarithms.
  • the calculation of the approximate curve by the least-squares method can be realized easily by using, as they are, the routines already built into many scientific calculators and spreadsheet programs.
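As a sketch of this step, NumPy's `polyfit` performs exactly this kind of least-squares fit of a quadratic; the sample points below are synthetic and chosen to lie on an exact parabola so the recovered coefficients are easy to check.

```python
import numpy as np

# Synthetic sample points lying exactly on y = x^2 + 3.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x ** 2 + 3.0

# Least-squares fit of a quadratic a*x^2 + b*x + c: polyfit finds the
# coefficients minimizing the sum of squared residuals.
a, b, c = np.polyfit(x, y, 2)
```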
  • the human face chin detection system of Invention 7 is a system for detecting the lower bottom of the chin of a person's face from an image containing the face, comprising image reading means for reading an image containing the person's face;
  • a face detection unit configured to detect, from an image read by the image reading unit, a surrounding area including both eyes and lips of the human face and not including a chin, and to set a face detection frame in the detected range;
  • chin detection window setting means for setting, below the face detection frame, a chin detection window having a size including the chin of the person's face;
  • pixel selecting means for selecting pixels having an edge strength equal to or greater than a threshold from the obtained edge strength distribution, and curve approximating means for obtaining an approximate curve that best fits the distribution of the pixels selected by the pixel selecting means; and
  • a chin detecting means for detecting the lowest bottom of the approximation curve obtained by the curve approximating means as the lower bottom of the chin of the person's face.
  • in the human face chin detection system of Invention 8, the pixel selecting means obtains a threshold from the distribution of first-derivative edge intensities calculated by the edge calculating means, detects pixels having an edge intensity not less than the threshold, and selects the pixels to be used from among them by utilizing the sign inversion of the second-derivative edge.
  • a program that causes a computer to execute: an image reading step of reading an image including a human face, in order to detect the lower bottom of the chin of the person's face from that image; a face detection step of detecting, from the image read in the image reading step, a range that includes both eyes and lips of the face and does not include the chin, and setting a face detection frame in the detected range; a chin detection window setting step of setting, below the face detection frame, a chin detection window having a size including the chin; an edge calculation step of obtaining the edge intensity distribution within the window; a pixel selection step of selecting pixels having an edge intensity equal to or greater than a threshold from the edge intensity distribution obtained in the edge calculation step;
  • a curve approximation step of finding an approximate curve that best fits the distribution of the selected pixels; and a chin detection step of detecting the lowest bottom of the approximate curve as the lower bottom of the chin of the person's face.
  • the human face chin detection program of Invention 10 obtains a threshold from the distribution of first-derivative edge strength calculated in the edge calculation step, detects pixels having an edge strength not less than the threshold, and selects the pixels to be used from among those pixels by using the sign inversion of the second-derivative edge.
  • FIG. 1 is a block diagram showing an embodiment of a jaw detection system according to the present invention.
  • FIG. 2 is a configuration diagram showing hardware constituting the chin detection system.
  • FIG. 3 is a flowchart showing an embodiment of a jaw detection method according to the present invention.
  • FIG. 4 is a graph showing the relationship between the luminance and the pixel position in the face image.
  • FIG. 5 is a graph showing the relationship between the edge intensity in the face image and the pixel position.
  • FIG. 6 is a diagram showing an example of a face image to be a chin detection target.
  • FIG. 7 is a diagram illustrating a state in which a face detection frame is set in a face image.
  • FIG. 8 is a diagram showing a state in which a chin detection window is set below the face detection frame.
  • FIG. 9 is a diagram showing a state in which the lower bottom of the chin is detected and its position is corrected.
  • FIG. 10 is a diagram showing a chin detection window displaying only pixels having edge strengths equal to or greater than a threshold value.
  • FIG. 11 is a diagram showing a chin detection window that displays only selected pixels as a result of sign inversion.
  • FIG. 12 is a diagram showing the Sobel edge detection filters.
  • FIG. 1 shows an embodiment of a human face chin detection system 100 according to the present invention.
  • as shown in FIG. 1, the chin detection system 100 comprises:
  • image reading means 10 for reading a face image G containing a person's face; face detection means 12 for detecting the human face from the face image G read by the image reading means 10 and setting a face detection frame F for the human face;
  • chin detection window setting means 14 for setting a chin detection window W having a size including the chin of the person's face below the face detection frame F; and edge calculation means 16 for obtaining the intensity distribution of edges in the chin detection window W.
  • the image reading means 10 reads an identification face photograph of the kind attached to a public identification card such as a passport or driver's license, or to a private identification card such as an employee ID card, student ID card, or membership card;
  • that is, a face image G with no background, containing only a large face facing front. It provides a function of acquiring this image as digital image data consisting of R (red), G (green), and B (blue) pixel data by using an imaging sensor such as a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) sensor.
  • specific examples include a digital still camera or digital video camera using a CCD or CMOS sensor, a vidicon camera, an image scanner, a drum scanner, and the like.
  • the face image G read optically by the imaging sensor is subjected to A/D conversion,
  • and a function is provided for sequentially transmitting the resulting digital image data to the face detection means 12.
  • the image reading means 10 also has a data storage function, so that the read face image data can be stored as appropriate in a storage device such as a hard disk drive (HDD) or on a storage medium such as a DVD-ROM. When the face image is supplied as digital image data via a network or a storage medium, the image reading means 10 becomes unnecessary, or instead functions as communication means, an interface (I/F), or the like.
  • the face detection means 12 detects a human face from the face image G read by the image reading means 10 and sets a face detection frame F in the relevant part.
  • the face detection frame F has a size (area) including both eyes and lips around the nose of the human face and not including the chin of the human face.
  • the algorithm for detecting a human face by the face detection means 12 is not particularly limited, but, for example, a conventional method as shown in the following literature or the like can be used as it is.
  • for example, a face image of a region including both eyes and lips of a human face and not including the chin is created, a neural network is trained using this image, and human faces are then detected using the trained neural network.
  • a region from both eyes to the lips is detected as a face image region.
  • the size of the face detection frame F is not invariable, and is appropriately increased or decreased according to the size of the target face image.
  • the chin detection window setting means 14 sets a chin detection window W having a size including the chin of the person's face below the face detection frame F set by the face detection means 12.
  • a target area for accurately detecting the contour including the lower bottom of the chin of the human face by the following means is selected from the face image G using the chin detection window W.
  • the edge calculation means 16 provides a function of obtaining the intensity distribution of the edges of the image in the chin detection window W. For example, as described later, it calculates the intensity distribution of the first-derivative edge using the Sobel edge detection operator.
  • the pixel selection means 18 provides a function of selecting pixels having an edge strength equal to or greater than a threshold from the edge strength distribution obtained by the edge calculation means 16. As described later, a second-derivative (Laplacian) filter is then used to narrow down the candidate pixels obtained with the Sobel edge detection operator by detecting the sign inversion of the edge.
  • the curve approximation means 20 provides a function of obtaining an approximate curve that best fits the distribution of the pixels selected by the pixel selection means 18. Specifically, as described later, the least-squares method is used.
  • the chin detection means 22 provides a function of detecting the lowest bottom of the approximation curve obtained by the curve approximation means 20 as the lower bottom of the chin of the person's face. A noticeable marker M or the like may be applied to the lower bottom of the chin to indicate it explicitly.
  • the means 10 to 22 constituting the chin detection system 100 are actually realized by a computer system, such as a personal computer (PC), consisting of hardware such as a CPU and RAM together with a dedicated computer program (software), as shown in FIG. 2. That is, as shown in FIG. 2, the hardware for realizing this chin detection system 100 comprises a CPU (Central Processing Unit) 40, a central processing unit that performs various control and arithmetic processing,
  • a main storage device 41 consisting of RAM (Random Access Memory) and ROM (Read Only Memory),
  • an auxiliary storage device (secondary storage) 43 such as a hard disk drive (HDD) or semiconductor memory, and an output device 44 such as a monitor (an LCD (liquid crystal display) or CRT (cathode ray tube)),
  • an input device 45 consisting of a keyboard, a mouse, and an imaging sensor such as a CCD (Charge Coupled Device) or CMOS sensor of an image scanner or the like, together with input/output interfaces for these devices.
  • these are connected by various internal and external buses 47, such as a processor bus, a memory bus, a system bus, and an input/output bus (for example, an ISA (Industry Standard Architecture) bus).
  • various control programs and data, supplied on a storage medium such as a CD-ROM, DVD-ROM, or flexible disk (FD), or via a communication network (LAN, WAN, the Internet, etc.),
  • are installed in the auxiliary storage device 43 and the like, and the programs and data are loaded into the main storage device 41 as needed.
  • the CPU 40 makes full use of the various resources according to the program loaded in the main storage device 41, performs the prescribed control and arithmetic processing, outputs the processing results (processed data) to the output device 44 via the bus 47, and displays them.
  • the data are also stored and saved (updated) as necessary in a database formed by the auxiliary storage device 43.
  • FIG. 3 is a flowchart showing an example of a chin detection method for a face image G to be actually detected.
  • first, in step S101, a human face contained in the face image G to be the chin detection target, read in advance by the image reading means 10, is detected by the above-described face detection means 12, and the face detection frame F is then set to identify the detected face.
  • note that the image targeted for chin detection by the present invention is limited to an image in which one person's face is shown.
  • the position of the person's face is first determined by the face detection means 12.
  • a rectangular face detection frame F is set on the person's face as shown in FIG.
  • the size (area) is such that it includes both the eyes and lips around the nose of the human face and does not include the chin of the human face.
  • the face detection frame F does not include the chin portion of the person's face, it is not always necessary to stick to the size and shape as exemplified.
  • this indicates a state in which the size of the person's face and its horizontal position within the frame are within the standard, but the position of the chin is too low and has not reached the standard position.
  • next, in step S103, a rectangular chin detection window W is set below the face detection frame as shown in FIG. 8, and the position of the chin of the person's face is specified.
  • the size and shape of the chin detection window W are not strict, and are not particularly limited as long as the size and shape are below the lower lip of the person's face and always include the lower bottom of the chin.
  • if the chin detection window W is too large, many lines confusable with the chin contour, such as chin shadows, neck wrinkles, and shirt collars, will appear in it, and subsequent edge detection will take considerably more time. Conversely, if it is too small, the lower bottom of the chin to be detected may fall outside it owing to individual differences.
  • in the figure, the chin detection window W is set in close contact with the lower side of the face detection frame F.
  • the chin detection window W does not necessarily need to be in close contact with the face detection frame F. In short, it is only necessary that the chin detection window W keeps a predetermined positional relationship with respect to the face detection frame F.
  • the process proceeds to the next step S105, in which the luminance of each pixel in the chin detection window W is determined.
  • that is, the first-derivative edge intensity distribution in the chin detection window W is obtained using a first-order differential (difference-type) edge detection operator, typified by the Sobel edge detection operator.
  • FIGS. 12 (a) and 12 (b) show this “Sobel edge detection operator”.
  • these operators detect vertical and horizontal edges by weighting the three neighboring pixel values in the rows above and below, or in the columns to the left and right, of the pixel of interest.
  • Figure 4 shows the relationship between the luminance (vertical axis) and the pixel position (horizontal axis) in the face image G. Since the brightness changes greatly at edge portions of the image such as the chin contour, applying a first-order differential (difference-type) edge detection operator such as the Sobel operator to those portions yields an edge intensity distribution that can be approximated by a parabola-like curve, as shown in Fig. 5(a).
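As a small illustration of this first-derivative idea (the luminance values below are synthetic, not taken from the patent's figures), the discrete difference of a one-dimensional luminance profile peaks exactly where the brightness changes most sharply:

```python
import numpy as np

# Synthetic luminance along one image column: dark above, a sharp rise
# at the contour, then bright below.
luminance = np.array([40.0, 42.0, 45.0, 85.0, 120.0, 125.0, 126.0])

# First-order difference approximates the derivative; its magnitude
# serves as the edge intensity at each position.
edge_intensity = np.abs(np.diff(luminance))
peak = int(np.argmax(edge_intensity))  # position of the strongest edge
```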
  • the process then proceeds to step S107, where a threshold value is obtained from the edge intensity distribution. As described above, because edge strength is strongly affected by shooting conditions (illumination conditions) and the like, it is difficult to distinguish the edge corresponding to the chin contour from edges in other areas on the basis of absolute edge strength alone.
  • the threshold for selecting pixels is not particularly limited; for example, one-tenth of the maximum edge intensity detected in the chin detection window W may be set as the threshold, and pixels having edges stronger than this threshold are selected as candidate pixels for obtaining the lower bottom of the chin.
  • when the threshold for selecting pixels has been determined in this way, the process proceeds to step S111, and all the pixels constituting the upper side of the chin detection window W are taken as base points and scanned in the vertical direction; only pixels having an edge intensity exceeding the threshold are selected, and pixels below the threshold are excluded.
  • Fig. 10 shows the pixel distribution selected in this way (exceeding the threshold) in an easy-to-understand manner.
  • specifically, scanning starts from the upper left of the chin detection window W, proceeding in the X direction and then sequentially in the Y direction, moving pixel by pixel so that the pixels of each row are scanned in a non-interlaced manner, and pixels having an edge intensity equal to or higher than the threshold are identified and displayed.
  • the search from the upper left of the chin detection window W is performed so that, for each column, the earliest-appearing pixel in the Y direction whose edge intensity is equal to or greater than the threshold is selected, which makes it possible to detect pixels corresponding to the effective chin contour. In other words, edges confusable with the chin contour, such as neck wrinkles and a shirt collar, appear more prominently below the actual chin contour than above it, and scanning from the top reduces the priority of those edges.
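A minimal sketch of this top-down column scan, using a synthetic window in which a stronger wrinkle/collar edge lies below the true chin edge (all values are illustrative assumptions):

```python
import numpy as np

# Rows grow downward (Y direction). Row 1 plays the chin contour; row 2
# plays a stronger neck-wrinkle or collar edge below it.
window = np.array([
    [0.0, 0.0, 0.0],
    [0.6, 0.0, 0.7],   # chin contour (weaker, but higher up)
    [0.9, 0.8, 0.9],   # wrinkle / collar (stronger, but lower)
])
threshold = 0.5

# For each column, keep only the first pixel from the top whose edge
# strength meets the threshold, de-prioritizing edges further down.
first_hit = {}
h, w = window.shape
for col in range(w):
    for row in range(h):
        if window[row, col] >= threshold:
            first_hit[col] = row
            break
```

Columns 0 and 2 stop at the weaker chin edge in row 1 even though row 2 is stronger; column 1 falls through to row 2 because it has no earlier hit.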
  • when pixels having an edge strength exceeding the threshold have been selected in this way, the process proceeds to step S113, and, in order to narrow each column (Y direction) of selected pixels down to the pixel with the highest edge strength, the sign of the second-derivative edge is detected for each column.
  • specifically, a second-derivative edge detection filter (Laplacian filter) as shown in FIG. 13 is used to detect the sign inversion of the edge, and one pixel per column is thereby determined (FIG. 11). For example, assuming that, as shown in FIG. 10, a plurality of pixels were selected in each of the columns "a" to "g" as a result of searching for pixels with an edge strength equal to or greater than the threshold, then, as a result of detecting the sign inversion of the edge, in FIG. 11 the uppermost pixel is selected as a candidate pixel constituting the chin contour in the "a", "b", "d", "f", and "g" columns.
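As a hedged sketch of this narrowing step, a discrete one-dimensional second difference (a minimal Laplacian) along a column's luminance profile changes sign where the first-derivative edge peaks, so the sign inversion singles out one pixel per column. The profile below is synthetic.

```python
import numpy as np

# Synthetic luminance down one column: flat, a sharp rise, flat again.
profile = np.array([10.0, 10.0, 11.0, 20.0, 60.0, 90.0, 95.0, 95.0])

# Discrete second derivative (1-D Laplacian kernel [1, -2, 1]).
lap = profile[:-2] - 2 * profile[1:-1] + profile[2:]

# Find the first positive-to-negative sign inversion; the pixel at that
# inversion is kept as the single candidate for this column.
picked = None
for i in range(len(lap) - 1):
    if lap[i] > 0 and lap[i + 1] < 0:
        picked = i + 2   # index into `profile` where the sign turns negative
        break
```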
  • finally, in step S115, the approximate curve described above is fitted to the distribution of the pixels found in this way, as in FIG. 11, and the lower bottom of the chin is thereby determined.
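As a sketch of this final step (synthetic contour points; image coordinates with y increasing downward), the fitted parabola's extremum at x = -b/(2a) gives the candidate for the lower bottom of the chin:

```python
import numpy as np

# Synthetic candidate contour pixels: the chin dips lowest (largest y,
# since y grows downward in image coordinates) near x = 2.
xs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([5.0, 8.0, 9.0, 8.0, 5.0])

# Quadratic least-squares fit, then the vertex of the parabola.
a, b, c = np.polyfit(xs, ys, 2)
x_bottom = -b / (2.0 * a)                        # vertex x-coordinate
y_bottom = a * x_bottom ** 2 + b * x_bottom + c  # lowest bottom of the chin
```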
  • when the bottom of the chin has been detected in this way, a marker M is placed on it as shown in FIGS. 9(a) and 9(b), and the entire human face is moved so that the position of the marker M comes to the same height as the specified chin-bottom position.
  • in Fig. 9(a), the lower bottom of the chin of the person's face is located considerably below the specified position, so moving the person's face vertically upward as shown in Fig. 9(b) brings the lower bottom of the chin to the specified position.
  • in FIG. 9(a) and similar figures, the image below the person's neck is cut off, but it is assumed that the image of the hidden part actually exists as it is.
  • as described above, the present invention sets the chin detection window using a known face detection method and then detects the lower bottom of the chin based on the intensity distribution of the edges in that window, so that even for a face image in which the chin outline is difficult to detect, the lower bottom of the chin can be detected accurately, at high speed, and robustly.

Abstract

A person's face is detected and a jaw detection window is set at the lower portion of the face. The edge intensity distribution within the jaw detection window is calculated. Pixels having an edge intensity equal to or above a threshold value are detected from the edge intensity distribution. An approximation curve is calculated so as to best match the distribution of the detected pixels. The lowermost portion of the approximation curve is taken to be the bottom of the person's jaw. It is thus possible to detect the bottom portion of the jaw of a person's face accurately and rapidly.

Description

CHIN DETECTION METHOD, CHIN DETECTION SYSTEM, AND CHIN DETECTION PROGRAM FOR A HUMAN FACE

TECHNICAL FIELD

The present invention relates to pattern recognition and object recognition technology, and more particularly to a chin detection method, chin detection system, and chin detection program for accurately detecting the position of the chin of a person's face from a face image in which that face appears.

TECHNICAL BACKGROUND
With the recent advances in pattern recognition technology and the higher performance of information processing devices such as computers, the accuracy of character and speech recognition has improved dramatically. However, in pattern recognition of images showing people, objects, scenery, and the like, for example images captured by a digital still camera, it is known to be still an extremely difficult task to identify accurately and at high speed whether a human face appears in the image. Nevertheless, automatically and accurately identifying with a computer or the like whether a human face appears in an image, and furthermore who that person is, has become an extremely important theme for establishing biometric recognition technology, improving security, speeding up criminal investigations, and organizing and rapidly searching image data, and many proposals concerning such themes have been made.
For example, in Japanese Patent Application Laid-Open No. 9-50528 and elsewhere, for a given input image, the presence or absence of a skin-color region is first determined, a mosaic size is automatically determined for that region and the region is converted into a mosaic, the presence or absence of a human face is judged by calculating the distance between the mosaic region and a human-face dictionary, and the human face is then cut out. Erroneous extraction due to the influence of the background and the like is thereby reduced, and human faces are found in images automatically and efficiently. In Japanese Patent Application Laid-Open No. 8-77334 and elsewhere, feature points of a face image used to distinguish individuals or groups (for example, racial groups) are extracted automatically, quickly, and simply using a predetermined algorithm.
Incidentally, for the facial photograph (face image) of a person that is indispensable for a passport, identification card, or the like, the size of the photograph and the orientation, size, and position of the person's face are often specified in detail. For example, besides conditions such as having no background and wearing no accessories such as a hat, it is specified in detail that the face in the photograph must face the front, that the face must be in the center of the photograph, that the position of the chin of the face must lie within a certain range from the lower frame of the photograph, and so on, and in principle photographs (face images) that deviate from the standard are not accepted. However, setting aside cases where the person's face is not facing the front or the person is wearing an accessory such as a hat, it is unreasonable to have to retake the photograph merely because the size or position of the face is slightly off, and this imposes considerable labor and cost on the user. Therefore, methods of solving the above problems using digital image processing technology, a technical field that has developed remarkably in recent years, are being studied.
For example, the required face image of a person is acquired directly as digital image data by a digital still camera or the like using an electronic imaging element such as a CCD or CMOS sensor, or an analog photograph (silver-halide photograph) of the person's face taken in advance is acquired as digital image data using an electro-optical image reading device such as a scanner. It is then contemplated that the above problem can be solved by applying simple image processing, such as enlargement, reduction, or translation, to the face image as appropriate, without impairing the person's original facial features, using an image processing system consisting of a general-purpose computer such as a PC and general-purpose software. If the number of images to be processed is small, this processing can be performed directly by a human using general-purpose input/output devices such as a mouse, keyboard, and monitor; when the number is enormous, however, it becomes necessary to perform the processing automatically using conventional techniques such as those described above. To automate image processing of human faces in this way, it is necessary to accurately recognize the outline of the face, in particular the outline of the chin, but depending on the lighting conditions at the time of shooting, the features of the individual's face, and other conditions, the chin outline often cannot be read clearly with a conventional edge detection filter alone. For example, depending on scattered light and the direction of illumination, the chin outline may become indistinct; depending on the facial features, a relatively strong edge may be detected between the lips and the bottom of the chin; and depending on the clothing worn, a strong edge may be detected at the boundary between the collar and the neck. Moreover, depending on age and body type, a stronger edge often arises at the wrinkles of the neck than at the chin outline, and these may be erroneously detected as the chin outline.
The present invention has been devised to solve such problems effectively, and its object is to provide a novel chin detection method, chin detection system, and chin detection program capable of detecting the bottom of the chin robustly, accurately, and at high speed even for a face image in which the chin outline is difficult to detect as described above.

DISCLOSURE OF THE INVENTION
To solve the above problems, the human-face chin detection method of invention 1 is a method for detecting the lower bottom of the chin of a person's face from an image containing that face, characterized in that a face image covering a range that includes both eyes and the lips of the person's face but not the chin is detected; a chin detection window of a size that includes the chin is set below the detected face image; the edge intensity distribution within the chin detection window is obtained; pixels having an edge intensity equal to or greater than a threshold value are detected from the edge intensity distribution; an approximation curve is then obtained so as to best fit the distribution of the detected pixels; and the lowest point of the approximation curve is taken as the lower bottom of the chin of the person's face.
Thus, according to the present invention, a portion that is extremely likely to contain the chin of the person's face is first selected, a chin detection window is set on that portion, and the edge intensity distribution within the window is obtained. In general, the shading along the contour that includes the lower bottom of the chin changes sharply compared with the surroundings, so the edge intensity there is high. By obtaining the edge intensity distribution within the chin detection window, candidate regions that may form the contour including the lower bottom of the chin can therefore be selected easily and reliably. Next, once the edge intensity distribution has been obtained, pixels having an edge intensity equal to or greater than a threshold value are detected from it. Since the contour including the lower bottom of the chin generally has high edge intensity, selecting pixels whose edge intensity is at or above a certain threshold and excluding all others leaves only pixels that are highly likely to correspond to that contour. Finally, an approximation curve is obtained so as to best fit the distribution of the pixels thus detected, and the lowest point of this approximation curve is treated and detected as the lower bottom of the chin of the person's face. As a result, even for a face image in which the chin outline is difficult to detect, that portion can be detected accurately and at high speed, and robust detection of the bottom of the chin can be performed.
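As a non-authoritative sketch of the selection stage just described, the following uses a simple vertical luminance difference in place of a full edge operator and derives the threshold as a fraction of the peak strength; both choices are illustrative assumptions, not the patent's specified filter or threshold rule:

```python
def edge_candidates(window, threshold_ratio=0.5):
    """Compute a vertical-difference edge strength inside the chin
    detection window, derive a threshold from the strength
    distribution, and return the (x, y) of pixels at or above it."""
    h, w = len(window), len(window[0])
    strength = [[abs(window[y + 1][x] - window[y][x]) for x in range(w)]
                for y in range(h - 1)]
    threshold = max(max(row) for row in strength) * threshold_ratio
    return [(x, y) for y in range(h - 1) for x in range(w)
            if strength[y][x] >= threshold]

# Toy luminance window: a bright face region (200) above a darker
# background (40); the step between rows 2 and 3 mimics the chin contour.
win = [[200] * 5, [200] * 5, [200] * 5, [40] * 5, [40] * 5]
cand = edge_candidates(win)
```

Only the pixels along the brightness step survive the threshold, which is the candidate set the later curve-fitting stage operates on.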
The human-face chin detection method of invention 2 is a method for detecting the lower bottom of the chin of a person's face from an image containing that face, characterized in that a face image covering a range that includes both eyes and the lips of the person's face but not the chin is detected; a chin detection window of a size that includes the chin is set below the detected face image; the intensity distribution of first-derivative edges within the chin detection window is obtained; a threshold value is derived from that edge intensity distribution and pixels having an edge intensity equal to or greater than the threshold are detected; the pixels to be used are then narrowed down from among those pixels using the sign inversion of second-derivative edges; an approximation curve is obtained by the least-squares method so as to best fit the distribution of the narrowed-down pixels; and the lowest point of the approximation curve is taken as the lower bottom of the chin of the person's face. That is, relative to the method of invention 1, the present invention further specifies the method of calculating the edge intensity distribution (first-derivative type), the method of selecting pixels (second-derivative type), and the method of calculating the approximation curve (least-squares method), so that the lower bottom of the chin can be detected even more accurately and at higher speed than in invention 1.
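The second-derivative narrowing of invention 2 can be sketched per pixel column as follows; the discrete second difference stands in for the Laplacian filter of the disclosure, and the column profile and candidate rows are illustrative assumptions:

```python
def second_diff(column, y):
    # discrete second derivative of the luminance along the column
    return column[y - 1] - 2 * column[y] + column[y + 1]

def narrow_by_sign_inversion(column, candidates):
    """Keep, from the thresholded candidates in one pixel column,
    the topmost row at which the second derivative flips sign,
    i.e. where the shading change passes its steepest point."""
    for y in sorted(candidates):
        if 1 <= y < len(column) - 2:
            if second_diff(column, y) * second_diff(column, y + 1) <= 0:
                return y
    return None

# Toy column: luminance falling from face (200) to background (40);
# rows 2 and 3 both passed the first-derivative threshold.
col = [200, 200, 160, 80, 40, 40]
picked = narrow_by_sign_inversion(col, [2, 3])
```

Scanning from the top reproduces the behaviour described for FIG. 11, where the uppermost qualifying pixel in a column becomes the candidate.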
The human-face chin detection method of invention 3 is characterized in that, in the chin detection method according to invention 1 or 2, the chin detection window used is a horizontally long rectangle whose width is greater than the face width of the person's face and whose height is less than that width. As a result, the lower bottom of the chin of the target face can be reliably captured within the chin detection window, so the lower bottom of the chin can be detected more accurately.
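A minimal sketch of setting such a window from a detected face frame follows; the widening and aspect factors are illustrative assumptions, since the patent specifies only that the window be wider than the face and less tall than wide:

```python
def chin_window(face_left, face_top, face_width, face_height,
                widen=1.4, aspect=0.5):
    """Place a landscape chin detection window W directly below the
    face detection frame F: wider than the face, and with a height
    smaller than its own width, as invention 3 requires."""
    w = int(face_width * widen)
    h = int(w * aspect)                     # height < width
    x = face_left - (w - face_width) // 2   # keep it centred on the face
    y = face_top + face_height              # directly below the frame
    return (x, y, w, h)

rect = chin_window(face_left=100, face_top=50, face_width=80, face_height=100)
```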
The human-face chin detection method of invention 4 is characterized in that, in the chin detection method according to invention 2 or 3, the first-derivative edge intensity distribution is obtained using the Sobel edge detection operator. The most typical way to detect abrupt shading changes in an image is to take the derivative of the shading; since differentiation of a digital image is approximated by differences, taking the first derivative of the original image within the chin detection window effectively detects the edge portions where the shading changes abruptly. As this first-derivative edge detection operator (filter), the present invention uses the well-known Sobel edge detection operator, which has excellent detection performance, so that the edge portions within the chin detection window can be detected reliably.
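For reference, the Sobel operator combines two 3x3 first-derivative kernels (cf. FIG. 12); a minimal per-pixel sketch, with a toy image introduced purely as an assumed input:

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal derivative
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical derivative

def sobel_strength(img, x, y):
    """Edge strength at interior pixel (x, y): magnitude of the
    gradient estimated by the two Sobel kernels."""
    gx = gy = 0
    for dy in range(-1, 2):
        for dx in range(-1, 2):
            p = img[y + dy][x + dx]
            gx += SOBEL_X[dy + 1][dx + 1] * p
            gy += SOBEL_Y[dy + 1][dx + 1] * p
    return (gx * gx + gy * gy) ** 0.5

# Horizontal brightness step: only the vertical kernel responds,
# which is the behaviour wanted along a roughly horizontal chin contour.
img = [[10, 10, 10], [10, 10, 10], [90, 90, 90]]
s = sobel_strength(img, 1, 1)
```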
The human-face chin detection method of invention 5 is characterized in that, in the chin detection method according to any one of inventions 2 to 4, the Laplacian edge detection operator is used for the second-derivative edges. The second-derivative edges can thereby be detected accurately. The human-face chin detection method of invention 6 is
characterized in that, in the chin detection method according to any one of inventions 1 to 5, the approximation curve is obtained by the least-squares method using a quadratic function. That is, as the method of obtaining the approximation curve within the chin detection window that can be treated as the chin outline of the person's face, the present invention uses the least-squares method with a quadratic function, so the chin outline within the chin detection window can be obtained at high speed. Here, the "least-squares method" employed in the present invention is, as generally understood, a method of finding the coefficients that minimize the sum of squared errors between a set of samples and the function being fitted. For example, for experimental data that behave quadratically a quadratic expression may be used, and if exponential behavior is expected the calculation can be performed after taking logarithms. The calculation of an approximation curve by the least-squares method can easily be realized by directly using software (programs) of the kind already built into many scientific calculators and spreadsheet applications.
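A least-squares quadratic fit of this kind reduces to solving the 3x3 normal equations. The following sketch fits y = ax^2 + bx + c to candidate pixel coordinates and returns the vertex of the parabola (with image y increasing downward, the vertex corresponds to the lower bottom of the chin); the toy data are an assumption for illustration:

```python
def fit_quadratic(points):
    """Least-squares fit of y = a*x**2 + b*x + c by solving the
    3x3 normal equations with Gauss-Jordan elimination."""
    s = [sum(x ** k for x, _ in points) for k in range(5)]   # sums of x^0..x^4
    t = [sum(y * x ** k for x, y in points) for k in range(3)]
    m = [[s[4], s[3], s[2], t[2]],
         [s[3], s[2], s[1], t[1]],
         [s[2], s[1], s[0], t[0]]]
    for i in range(3):                      # Gauss-Jordan elimination
        piv = m[i][i]
        m[i] = [v / piv for v in m[i]]
        for j in range(3):
            if j != i:
                f = m[j][i]
                m[j] = [v - f * w for v, w in zip(m[j], m[i])]
    return m[0][3], m[1][3], m[2][3]        # a, b, c

def chin_bottom(points):
    """Vertex of the fitted parabola; with image y growing downward
    this is taken as the lower bottom of the chin."""
    a, b, c = fit_quadratic(points)
    x0 = -b / (2 * a)
    return x0, a * x0 * x0 + b * x0 + c

# Candidate pixels lying on y = -0.5*x**2 + 4*x (vertex at x = 4, y = 8)
pts = [(x, -0.5 * x * x + 4 * x) for x in range(9)]
x0, y0 = chin_bottom(pts)
```

In practice the same fit is available ready-made (for example as a degree-2 polynomial fit in spreadsheet software), consistent with the remark above.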
The human-face chin detection system of invention 7 is a system for detecting the lower bottom of the chin of a person's face from an image containing that face, characterized by comprising: image reading means for reading the image containing the person's face; face detection means for detecting, from the image read by the image reading means, a range that includes both eyes and the lips of the person's face but not the chin, and setting a face detection frame on the detected range; chin detection window setting means for setting, below the detection frame, a chin detection window of a size that includes the chin of the person's face; edge calculation means for obtaining the edge intensity distribution within the chin detection window; pixel selection means for selecting, from the edge intensity distribution obtained by the edge calculation means, pixels having an edge intensity equal to or greater than a threshold value; curve approximation means for obtaining an approximation curve that best fits the distribution of the pixels selected by the pixel selection means; and chin detection means for detecting the lowest point of the approximation curve obtained by the curve approximation means as the lower bottom of the chin of the person's face. As with invention 1, this makes it possible, even for a face image in which the chin outline is difficult to detect, to detect that portion accurately and at high speed and to perform robust detection of the bottom of the chin. Furthermore, by implementing each of these means with dedicated hardware or a computer system, these operations and effects can be exhibited automatically.
The human-face chin detection system of invention 8 is characterized in that, in the system according to invention 7, the pixel selection means obtains a threshold value from the first-derivative edge intensity distribution calculated by the edge calculation means, detects pixels having an edge intensity equal to or greater than the threshold, and selects the pixels to be used from among those pixels using the sign inversion of second-derivative edges. As with inventions 2 and 7, the lower bottom of the chin can thereby be detected accurately and at high speed, and by implementing each of these means with dedicated hardware or a computer system, these operations and effects can be exhibited automatically.
The human-face chin detection program of invention 9 is a program for detecting the lower bottom of the chin of a person's face from an image containing that face, characterized by causing a computer to execute: an image reading step of reading the image containing the person's face; a face detection step of detecting, from the image read in the image reading step, a range that includes both eyes and the lips of the person's face but not the chin, and setting a face detection frame on the detected range; a chin detection window setting step of setting, below the detection frame, a chin detection window of a size that includes the chin of the person's face; an edge calculation step of obtaining the edge intensity distribution within the chin detection window; a pixel selection step of selecting, from the edge intensity distribution obtained in the edge calculation step, pixels having an edge intensity equal to or greater than a threshold value; a curve approximation step of obtaining an approximation curve that best fits the distribution of the pixels selected in the pixel selection step; and a chin detection step of detecting the lowest point of the approximation curve obtained in the curve approximation step as the lower bottom of the chin of the person's face. The same effects as in inventions 1 and 7 are thereby obtained, and since the functions can be realized in software on a general-purpose computer (hardware) such as a personal computer (PC), they can be realized more economically and easily than by building a dedicated device; in many cases, changes and improvements to the functions can be achieved simply by rewriting the program.
The human-face chin detection program of invention 10 is characterized in that, in the program according to claim 9, the pixel selection step obtains a threshold value from the first-derivative edge intensity distribution calculated in the edge calculation step, detects pixels having an edge intensity equal to or greater than the threshold, and selects the pixels to be used from among those pixels using the sign inversion of second-derivative edges. The same effects as in inventions 2 and 8 are thereby obtained, and, as in invention 9, the functions can be realized in software, so they can be realized economically and easily and their modification and improvement can easily be achieved.

BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram showing an embodiment of a chin detection system according to the present invention.
FIG. 2 is a configuration diagram showing the hardware constituting the chin detection system.
FIG. 3 is a flowchart showing an embodiment of a chin detection method according to the present invention.
FIG. 4 is a graph showing the relationship between luminance and pixel position in a face image.
FIG. 5 is a graph showing the relationship between edge intensity and pixel position in a face image.
FIG. 6 is a diagram showing an example of a face image subject to chin detection.
FIG. 7 is a diagram showing a state in which a face detection frame is set on the face image.
FIG. 8 is a diagram showing a state in which a chin detection window is set below the face detection frame.
FIG. 9 is a diagram showing a state in which the lower bottom of the chin is detected and its position is corrected.
FIG. 10 is a diagram showing the chin detection window displaying only pixels with an edge intensity equal to or greater than the threshold value.
FIG. 11 is a diagram showing the chin detection window displaying only the pixels selected as a result of the sign inversion.
FIG. 12 is a diagram showing the Sobel edge detection filter.
BEST MODE FOR CARRYING OUT THE INVENTION

Hereinafter, the best mode for carrying out the present invention will be described in detail with reference to the accompanying drawings.
FIG. 1 shows an embodiment of a human-face chin detection system 100 according to the present invention. As shown in the figure, this chin detection system 100 is mainly composed of: image reading means 10 for reading a face image G containing a person's face; face detection means 12 for detecting the person's face from the face image G read by the image reading means 10 and setting a face detection frame F for the person's face; chin detection window setting means 14 for setting, below the face detection frame F, a chin detection window W of a size that includes the chin of the person's face; edge calculation means 16 for obtaining the edge intensity distribution within the chin detection window W; pixel selection means 18 for selecting, from the edge intensity distribution obtained by the edge calculation means 16, pixels having an edge intensity equal to or greater than a threshold value; curve approximation means 20 for obtaining an approximation curve that best fits the distribution of the pixels selected by the pixel selection means 18; and chin detection means 22 for detecting the lowest point of the approximation curve obtained by the curve approximation means 20 as the lower bottom of the chin of the person's face.

First, the image reading means 10 provides a function of acquiring a facial photograph for visual identification, attached to an official identification document such as a passport or driver's license or to a private identification document such as an employee ID, student ID, or membership card (that is, a face image G with no background in which only the person's front-facing face appears large), as digital image data consisting of R (red), G (green), and B (blue) pixel data, using an imaging sensor such as a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) sensor. Specifically, it is a CCD or CMOS camera such as a digital still camera or digital video camera, a vidicon camera, an image scanner, a drum scanner, or the like, and provides a function of A/D-converting the face image G read optically by the imaging sensor and sequentially sending the digital image data to the face detection means 12.

This image reading means 10 also has a data storage function, and the read face image data can be stored as appropriate in a storage device such as a hard disk drive (HDD) or on a storage medium such as a DVD-ROM. When the face image is supplied as digital image data via a network, a storage medium, or the like, the image reading means 10 becomes unnecessary, or functions as communication means, an interface (I/F), or the like.
次に、 顔検出手段 1 2は、 この画像読取手段 1 0で読み取った顔画像 G 中から人物顔を検出して当該部分に顔検出枠 Fを設定するようになってい る。  Next, the face detection means 12 detects a human face from the face image G read by the image reading means 10 and sets a face detection frame F in the relevant part.
この顔検出枠 Fは、 後述するように、 人物顔の鼻を中心に両目と唇部分 を含み、 当該人物顔のあごの部分は含まない大きさ (領域) となっている。 なお、 このような顔検出手段 1 2による人物顔の検出アルゴリズムは、 特 に限定するものではないが、 例えば、 以下の文献等に示すような従来の手 法をそのまま利用することができる。  As will be described later, the face detection frame F has a size (area) including both eyes and lips around the nose of the human face and not including the chin of the human face. The algorithm for detecting a human face by the face detection means 12 is not particularly limited, but, for example, a conventional method as shown in the following literature or the like can be used as it is.
H. A. Rowley, S. Baluja, and T. Kanade, "Neural network-based face detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 1, pp. 23-38, 1998.
According to this technique, face images of a region that includes both eyes and the lips but not the chin are created, a neural network is trained on these images, and the trained neural network is then used to detect human faces. As disclosed, the region from the eyes down to the lips is detected as the face image region. The size of the face detection frame F is not fixed; it is enlarged or reduced as appropriate according to the size of the target face image.
The chin detection window setting means 14 provides a function of setting, below the face detection frame F set by the face detection means 12, a chin detection window W large enough to contain the chin of the human face.
That is, the chin detection window W is used to select, from the face image G, the target region in which the subsequent means accurately detect the contour including the bottom of the chin. The edge calculation means 16 provides a function of obtaining the edge intensity distribution of the image inside the chin detection window W; for example, as described later, it computes a first-derivative edge intensity distribution using the Sobel edge detection operator or the like.
The pixel selection means 18 provides a function of selecting, from the edge intensity distribution obtained by the edge calculation means 16, the pixels whose edge intensity is at or above a threshold. As described later, the candidate pixels obtained with the Sobel edge detection operator or the like are then narrowed down by detecting the sign inversion of the edge with a second-derivative filter (a Laplacian filter) or the like.
The curve approximation means 20 provides a function of finding the approximation curve that best fits the distribution of the pixels selected by the pixel selection means 18. Specifically, as described later, the contour of the chin of the human face is obtained as a curve by the least squares method with a quadratic function of the form

y = a(x - x0)^2 + b    (1)
where
y: vertical coordinate
x: horizontal coordinate
x0: horizontal center of the chin detection window
When "a" and "b" are obtained by the least squares method using equation (1), "b" represents the bottom of the chin (where a < 0).
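As a concrete illustration of equation (1), the two unknowns can be recovered with an ordinary linear least-squares solve. The sketch below is not part of the patent; the function name and the synthetic data are illustrative. It uses NumPy, with y growing downward as in image coordinates, so the vertex value b is the chin bottom:

```python
import numpy as np

def fit_chin_curve(xs, ys, x0):
    """Least-squares fit of y = a*(x - x0)**2 + b to candidate pixels.

    xs, ys: coordinates of the selected edge pixels (y grows downward,
    as in image coordinates); x0: horizontal center of the chin window.
    Returns (a, b); with a < 0, b is the y-coordinate of the vertex,
    i.e. the bottom of the chin.
    """
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    # Design matrix for the two unknowns a and b.
    A = np.column_stack([(xs - x0) ** 2, np.ones_like(xs)])
    (a, b), *_ = np.linalg.lstsq(A, ys, rcond=None)
    return a, b

# Synthetic chin-like arc: vertex (deepest point) at y = 50, center x0 = 5.
xs = np.arange(11)
ys = -0.5 * (xs - 5.0) ** 2 + 50.0
a, b = fit_chin_curve(xs, ys, x0=5.0)
print(round(a, 3), round(b, 3))  # → -0.5 50.0
```

On noisy real candidate pixels the fit is no longer exact, but b still gives a robust estimate of the chin-bottom row.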
The chin detection means 22 provides a function of detecting the lowest bottom of the approximation curve obtained by the curve approximation means 20 as the bottom of the chin of the human face; as shown in Fig. 9, an easily visible marker M or the like may be applied to the detected chin bottom to indicate it explicitly. The means 10 to 22 constituting this chin detection system 100 are in practice realized by a computer system such as a personal computer (PC), consisting of hardware including a CPU and RAM and a dedicated computer program (software) as shown in Fig. 3.

That is, as shown for example in Fig. 2, the hardware realizing this chin detection system 100 comprises a CPU (Central Processing Unit) 40, the central arithmetic unit responsible for the various control and arithmetic processes; a RAM (Random Access Memory) 41 used as the main storage; a ROM (Read Only Memory) 42, a read-only storage device; an auxiliary storage (secondary storage) 43 such as a hard disk drive (HDD) or semiconductor memory; an output device 44 such as a monitor (an LCD (liquid crystal display) or a CRT (cathode ray tube)); an input device 45 comprising an image scanner, a keyboard, a mouse, and an imaging sensor such as a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) sensor; and an input/output interface (I/F) 46 for these, all interconnected by various internal and external buses 47 such as a processor bus, a memory bus, a system bus, and an input/output bus, including a PCI (Peripheral Component Interconnect) bus and an ISA (Industrial Standard Architecture) bus.
Then, for example, the various control programs and data supplied on a storage medium such as a CD-ROM, DVD-ROM, or flexible disk (FD), or via a communication network (LAN, WAN, the Internet, etc.) N, are installed in the auxiliary storage 43 and the like, and those programs and data are loaded into the main storage 41 as needed. In accordance with the program loaded into the main storage 41, the CPU 40 makes full use of the various resources to perform the prescribed control and arithmetic processing, outputs the processing results (processed data) to the output device 44 via the bus 47 for display, and stores and saves (updates) the data as appropriate in the database formed by the auxiliary storage 43. Next, an example of a chin detection method using the chin detection system 100 configured in this way is described with reference to Figs. 3 to 13.
Fig. 3 is a flowchart showing an example of the chin detection method applied to a face image G actually subject to detection.
First, as shown in step S101, the face detection means 12 detects the face contained in the face image G, which was read in advance by the image reading means 10 as the target of chin detection, and then sets a face detection frame F identifying the detected human face.
For example, as shown in Fig. 6, the images subject to chin detection in the present invention are limited to those showing a single human face; therefore, the face detection means 12 first identifies the position of that face, and then sets a rectangular face detection frame F on it as shown in Fig. 7.
In the face detection frame F illustrated here, the size (region) includes both eyes and the lips around the nose of the human face but not the chin; however, as long as the frame F does not include the chin, it need not have exactly the illustrated size and shape. In each of the face images G in Figs. 6 to 9(a), the size of the depicted face and its horizontal position within the display frame Y are within the standard, but the chin sits too low and does not reach the standard position.
Next, once the face detection frame F has been set in this way, the process moves to step S103, and as shown in Fig. 8 the chin detection window setting means 14 sets a horizontally long rectangular chin detection window W below the frame F to localize the chin of the human face.
The size and shape of the chin detection window W are not strict; any size and shape will do as long as the window lies below the lower lip of the face and always contains the bottom of the chin. If it is too large, however, many lines easily confused with the chin contour, such as the shadow of the chin, wrinkles of the neck, or a shirt collar, appear inside the window W, and the subsequent edge detection takes much more time; conversely, if it is too small, the bottom of the chin to be detected may, owing to individual differences, fall outside it.
Therefore, if, for example, a horizontally long rectangle is used whose width is greater than the face width of the human face and whose height is smaller than that width, as shown in the figure, the chin contour including its bottom can be captured reliably while excluding confusing parts such as a shirt collar. In the example of Fig. 8 the chin detection window W is set flush against the lower side of the face detection frame F, but the window W need not touch the frame F; it suffices that the window W keeps a predetermined positional relationship to the frame F.
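A window satisfying these constraints might be derived from the face frame as sketched below. The specific scale factors are assumptions for illustration, not values from the patent:

```python
def chin_window(face_frame, width_scale=1.5, aspect=0.5):
    """Derive a chin-search window W from a face-detection frame F.

    face_frame: (left, top, width, height) of frame F in pixels.
    width_scale > 1 makes the window wider than the face, and
    aspect < 1 keeps its height smaller than its width, as the text
    requires; the actual ratios here are illustrative guesses.
    Returns (left, top, width, height) of W, placed flush against
    the frame's lower side and centered horizontally on the face.
    """
    fl, ft, fw, fh = face_frame
    ww = fw * width_scale
    wh = ww * aspect
    wl = fl + (fw - ww) / 2.0   # centered horizontally on the face
    wt = ft + fh                # flush with the frame's lower side
    return (wl, wt, ww, wh)

w = chin_window((100.0, 80.0, 60.0, 70.0))
print(w)  # → (85.0, 150.0, 90.0, 45.0)
```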
Next, once the chin detection window W has been set on the target image in this way, the process moves to step S105, where the luminance (Y) of each pixel in the window W is calculated, and, based on those luminance values, the first-derivative edge intensity distribution inside the window W is obtained using a first-derivative (difference-type) edge detection operator typified by the Sobel edge detection operator.
Figs. 12(a) and (b) show this Sobel edge detection operator. The operator (filter) in Fig. 12(a) emphasizes edges in the horizontal direction by weighting, of the eight pixels surrounding the pixel of interest, the three pixel values in the left column and the three in the right column; the operator in Fig. 12(b) emphasizes edges in the vertical direction by weighting the three pixel values in the upper row and the three in the lower row. Together they detect both vertical and horizontal edges.
The edge intensity is then obtained by summing the squares of the results produced by these operators and taking the square root. As noted above, other first-derivative edge detection operators such as Roberts or Prewitt may be applied in place of the Sobel operator.
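The magnitude computation described here (square the two directional responses, sum, take the square root) can be sketched in plain NumPy as follows. The kernel values are the standard 3x3 Sobel coefficients and are assumed to match Fig. 12:

```python
import numpy as np

# Standard 3x3 Sobel kernels: KX weights the left and right columns,
# KY (its transpose) weights the upper and lower rows.
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
KY = KX.T

def sobel_magnitude(gray):
    """Edge-intensity map: sqrt(gx^2 + gy^2) at every interior pixel."""
    h, w = gray.shape
    mag = np.zeros((h, w))
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            patch = gray[y - 1:y + 2, x - 1:x + 2]
            gx = np.sum(KX * patch)
            gy = np.sum(KY * patch)
            mag[y, x] = np.hypot(gx, gy)  # sqrt of the sum of squares
    return mag

# A step edge: dark upper half, bright lower half.
img = np.zeros((6, 6))
img[3:, :] = 100.0
mag = sobel_magnitude(img)
print(mag[2:4, 2])  # both transition rows respond with 400
```

A production implementation would use a vectorized convolution instead of the explicit loops; the loop form is kept here to mirror the per-pixel description.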
Fig. 4 shows the relationship between the luminance (vertical axis) and the pixel position (horizontal axis) of the face image G. Because the luminance changes greatly at edge portions of the image such as the chin contour, detecting these regions of large luminance change with a first-derivative (difference-type) edge detection operator such as the Sobel operator allows the contour to be computed as a parabola-like approximation curve as shown in Fig. 5(a).
Next, once the edge intensity distribution of the chin detection window W has been obtained in this way, the process moves to step S107, where a threshold is derived from that distribution. As mentioned above, the edge intensity depends strongly on the shooting conditions (illumination conditions), so it is difficult to determine the edge corresponding to the chin contour from edge intensities that also include other regions.
The threshold used to select pixels is not particularly limited; for example, 1/10 of the maximum edge intensity detected within the chin detection window W is set as the threshold, and the pixels with edges stronger than this threshold are selected as candidate pixels for finding the bottom of the chin.
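A minimal sketch of this thresholding step, assuming the 1/10-of-maximum rule given above (the fraction is left as a parameter, since the text says the threshold is not limited to this value):

```python
import numpy as np

def candidate_mask(edge_mag, fraction=0.1):
    """Mark pixels whose edge intensity reaches `fraction` of the
    window maximum (the text uses 1/10) as chin-contour candidates."""
    threshold = edge_mag.max() * fraction
    return edge_mag >= threshold

# Illustrative edge-intensity values; max is 400, so threshold is 40.
mag = np.array([[ 5.,  80., 400.],
                [30.,  39.,  41.],
                [ 0., 400.,  60.]])
mask = candidate_mask(mag)
print(mask.sum())  # → 5 pixels at or above 40.0
```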
Next, once the threshold for accepting or rejecting pixel values has been determined in this way, the process moves to step S111, and as shown in Fig. 10 the window is scanned vertically, taking every pixel on the upper side of the chin detection window W as a starting point; only the pixels whose edge intensity exceeds the threshold are selected, and those below it are discarded.
Fig. 10 displays the pixel distribution selected in this way (the pixels exceeding the threshold) for ease of understanding: starting from the upper left of the chin detection window W, the pixels of each row are scanned in a non-interlaced fashion, sweeping the window in the X direction and then moving successively in the Y direction, and the pixels with edge intensity at or above the threshold are identified and displayed.
The reason for searching from the upper left of the chin detection window W in this way is to make the first above-threshold candidate pixel encountered in the Y direction a strong candidate for the bottom of the chin, which makes it possible to detect the pixels corresponding to the chin contour efficiently. That is, the edges easily confused with the chin contour, such as the wrinkles of the neck and the shirt collar, appear more strongly below the actual contour than above it, so searching from the top lowers their priority.
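The scan order can be sketched as a plain row-by-row sweep from the top-left corner, so that candidates higher in the window are found first. This is an illustrative reading of the description, not code from the patent:

```python
import numpy as np

def scan_candidates(mask):
    """Walk the window from its upper-left corner, sweeping each row
    in the X direction and then moving down in Y (non-interlaced),
    and record the above-threshold pixels in the order found, so
    pixels higher in the window take priority over wrinkle/collar
    edges lower down."""
    found = []
    for y in range(mask.shape[0]):        # move down one row at a time
        for x in range(mask.shape[1]):    # sweep left to right
            if mask[y, x]:
                found.append((x, y))
    return found

mask = np.zeros((3, 3), dtype=bool)
mask[2, 0] = mask[0, 1] = mask[1, 1] = True
print(scan_candidates(mask))  # → [(1, 0), (1, 1), (0, 2)]
```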
Next, once the pixels with edge intensity exceeding the threshold have been selected in this way, the process moves to step S113, where, in order to narrow the selected pixels down to the pixel with the greatest edge strength in each pixel column (Y direction), the sign inversion of the second-derivative edge is detected column by column.
That is, when narrowing down the candidate pixels it is necessary to consider how sharp the luminance change is: when the luminance changes gently as in Fig. 4, the first-derivative Sobel edge intensity varies rather gradually as shown in Fig. 5(a), the span above the threshold becomes wide (many candidate pixels), and this becomes a source of error when determining the bottom of the chin.
Therefore, by detecting the sign inversion of the edge with a second-derivative edge detection filter (a Laplacian filter) such as the one shown in Fig. 13, one pixel is chosen from the multiple candidates in each column of Fig. 10 (Fig. 11). For example, suppose that, as a result of searching for pixels with edge intensity at or above the threshold as shown in Fig. 10, multiple pixels were selected in each of the columns "a" to "g". Fig. 11 shows the result of detecting the sign inversion of the second-derivative edge: in columns "a", "b", "d", "f", and "g" the uppermost pixel was selected as the candidate pixel forming the chin contour, while in columns "c" and "e" the lowest pixel was likewise selected.
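The per-column narrowing can be sketched with a 1-D second difference along y standing in for the Laplacian filter of Fig. 13. This is an interpretation of the step, not the patent's exact filter; the synthetic column below is illustrative:

```python
import numpy as np

def narrow_by_zero_crossing(gray, mask):
    """Keep one candidate per column: the topmost candidate at which
    the second derivative of luminance along y changes sign (a zero
    crossing, i.e. the center of the luminance transition).
    Returns a {column: row} mapping."""
    # 1-D second difference along the vertical axis.
    d2 = np.zeros_like(gray, dtype=float)
    d2[1:-1, :] = gray[:-2, :] - 2.0 * gray[1:-1, :] + gray[2:, :]
    chosen = {}
    h = gray.shape[0]
    for x in range(gray.shape[1]):
        for y in np.flatnonzero(mask[:, x]):       # top to bottom
            if y + 1 < h and d2[y, x] * d2[y + 1, x] < 0:
                chosen[x] = int(y)
                break
    return chosen

# One column with a gentle dark-to-bright ramp; the second difference
# flips sign between rows 2 and 3, the middle of the transition.
gray = np.array([[0.], [5.], [40.], [80.], [95.], [100.]])
mask = np.zeros_like(gray, dtype=bool)
mask[2:4, 0] = True          # two threshold-passing candidates
chosen = narrow_by_zero_crossing(gray, mask)
print(chosen)  # → {0: 2}
```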
Then, once the finally selected candidate pixels have been narrowed down from the many pixels exceeding the threshold in this way, the process moves to step S115, and the approximation curve described above is fitted to the distribution of the found pixels, as in Fig. 11, to obtain the bottom of the chin.
Once the bottom of the chin has been detected in this way, a marker M is applied to it as shown in Figs. 9(a) and (b), and the whole face is moved so that the position of the marker M comes to the same height as the prescribed chin-bottom position. In Fig. 9(a) the bottom of the chin sits at a considerably low position, so, as shown in Fig. 9(b), moving the face vertically upward as it is brings the chin bottom into coincidence with the prescribed position. Although the image below the person's neck is cut off in Fig. 9(a) and the like, the image of that hidden part is assumed to be actually present as it is.
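The final alignment, translating the face so the detected chin-bottom row lands on the prescribed row, can be sketched as a vertical shift with blank fill. The fill value and the interface are assumptions for illustration:

```python
import numpy as np

def align_to_chin(img, chin_y, target_y, fill=255):
    """Shift the image vertically so the detected chin bottom (row
    chin_y) lands on the prescribed row target_y; rows shifted in
    from outside the frame are filled with `fill` (white)."""
    shift = target_y - chin_y          # negative = move face upward
    out = np.full_like(img, fill)
    if shift >= 0:
        out[shift:, :] = img[:img.shape[0] - shift, :]
    else:
        out[:shift, :] = img[-shift:, :]
    return out

img = np.arange(25).reshape(5, 5)
out = align_to_chin(img, chin_y=4, target_y=2)
print(out[2, 0])  # row 4 of the original now sits at row 2 → 20
```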
As described above, the present invention sets a chin detection window using a known human-face detection method and then detects the bottom of the face from the edge intensity distribution inside that window; therefore, even for face images in which the chin contour is difficult to detect, that part can be detected accurately and quickly, enabling robust detection of the bottom of the chin.

Claims

1. A chin detection method for detecting the bottom of the chin of a human face from an image containing the human face, comprising:
detecting a face image of a range including both eyes and the lips of the human face but not the chin, and setting, below the detected face image, a chin detection window large enough to contain the chin of the human face;
obtaining the edge intensity distribution inside the chin detection window and detecting, from that distribution, the pixels whose edge intensity is at or above a threshold; and
thereafter finding the approximation curve that best fits the distribution of the detected pixels and taking the lowest bottom of that curve as the bottom of the chin of the human face.
2. A chin detection method for detecting the bottom of the chin of a human face from an image containing the human face, comprising:
detecting a face image of a range including both eyes and the lips of the human face but not the chin, and setting, below the detected face image, a chin detection window large enough to contain the chin of the human face;
obtaining the first-derivative edge intensity distribution inside the chin detection window, deriving a threshold from that distribution, and detecting the pixels whose edge intensity is at or above the threshold;
thereafter narrowing down the pixels to be used from among those pixels by using the sign inversion of the second-derivative edge; and
then finding, by the least squares method, the approximation curve that best fits the distribution of the narrowed pixels and taking the lowest bottom of that curve as the bottom of the chin of the human face.
3. The chin detection method according to claim 1 or 2, wherein the chin detection window is a horizontally long rectangle whose width is greater than the face width of the human face and whose height is smaller than that width.
4. The chin detection method according to claim 2 or 3, wherein the first-derivative edge intensity distribution is obtained using the Sobel edge detection operator.
5. The chin detection method according to any one of claims 2 to 4, wherein the second-derivative edge is obtained using the Laplacian edge detection operator.
6. The chin detection method according to any one of claims 1 to 5, wherein the approximation curve is obtained by the least squares method with a quadratic function.
7. A chin detection system for detecting the bottom of the chin of a human face from an image containing the human face, comprising:
image reading means for reading the image containing the human face;
face detection means for detecting, in the image read by the image reading means, a range including both eyes and the lips of the human face but not the chin, and setting a face detection frame on the detected range;
chin detection window setting means for setting, below the detection frame, a chin detection window large enough to contain the chin of the human face;
edge calculation means for obtaining the edge intensity distribution inside the chin detection window;
pixel selection means for selecting, from the edge intensity distribution obtained by the edge calculation means, the pixels whose edge intensity is at or above a threshold;
curve approximation means for finding the approximation curve that best fits the distribution of the pixels selected by the pixel selection means; and
chin detection means for detecting the lowest bottom of the approximation curve obtained by the curve approximation means as the bottom of the chin of the human face.
8. The chin detection system according to claim 7, wherein the pixel selection means derives a threshold from the first-derivative edge intensity distribution calculated by the edge calculation means, detects the pixels whose edge intensity is at or above the threshold, and selects the pixels to be used from among those pixels by using the sign inversion of the second-derivative edge.
9. A chin detection program for detecting the bottom of the chin of a human face from an image containing the human face, the program causing a computer to realize:
an image reading step of reading the image containing the human face;
a face detection step of detecting, in the image read in the image reading step, a range including both eyes and the lips of the human face but not the chin, and setting a face detection frame on the detected range;
a chin detection window setting step of setting, below the detection frame, a chin detection window large enough to contain the chin of the human face;
an edge calculation step of obtaining the edge intensity distribution inside the chin detection window;
a pixel selection step of selecting, from the edge intensity distribution obtained in the edge calculation step, the pixels whose edge intensity is at or above a threshold;
a curve approximation step of finding the approximation curve that best fits the distribution of the pixels selected in the pixel selection step; and
a chin detection step of detecting the lowest bottom of the approximation curve obtained in the curve approximation step as the bottom of the chin of the human face.
10. The human-face chin detection program according to claim 9, wherein the pixel selection step derives a threshold from the first-derivative edge intensity distribution calculated in the edge calculation step, detects the pixels whose edge intensity is at or above the threshold, and selects the pixels to be used from among those pixels by using the sign inversion of the second-derivative edge.
PCT/JP2004/018451 2003-12-05 2004-12-03 Person face jaw detection method, jaw detection system, and jaw detection program WO2005055144A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2003407911A JP2005165983A (en) 2003-12-05 2003-12-05 Method for detecting jaw of human face, jaw detection system, and jaw detection program
JP2003-407911 2003-12-05

Publications (1)

Publication Number Publication Date
WO2005055144A1 true WO2005055144A1 (en) 2005-06-16

Family

ID=34650325

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2004/018451 WO2005055144A1 (en) 2003-12-05 2004-12-03 Person face jaw detection method, jaw detection system, and jaw detection program

Country Status (4)

Country Link
US (1) US20060010582A1 (en)
JP (1) JP2005165983A (en)
TW (1) TW200527319A (en)
WO (1) WO2005055144A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5095182B2 (en) * 2005-12-01 2012-12-12 株式会社 資生堂 Face classification device, face classification program, and recording medium on which the program is recorded
US7953253B2 (en) * 2005-12-31 2011-05-31 Arcsoft, Inc. Face detection on mobile devices
US7643659B2 (en) * 2005-12-31 2010-01-05 Arcsoft, Inc. Facial feature detection on mobile devices
US8417033B2 (en) * 2007-04-27 2013-04-09 Hewlett-Packard Development Company, L.P. Gradient based background segmentation and enhancement of images
JP2009239394A (en) * 2008-03-26 2009-10-15 Seiko Epson Corp Coloring image generating apparatus and method
CN102914286B (en) * 2012-09-12 2014-09-10 福建网龙计算机网络信息技术有限公司 Method for automatically detecting user sitting posture based on handheld equipment
JP6307873B2 (en) * 2013-12-24 2018-04-11 富士通株式会社 Object line detection apparatus, method, and program
CN106156692B (en) 2015-03-25 2019-12-13 阿里巴巴集团控股有限公司 method and device for positioning human face edge feature points

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05151352A (en) * 1991-11-26 1993-06-18 Glory Ltd Edge detecting method and image recognizing method using the method
JPH0877334A (en) * 1994-09-09 1996-03-22 Konica Corp Automatic feature point extracting method for face image
JPH11306372A (en) * 1998-04-17 1999-11-05 Sharp Corp Method and device for picture processing and storage medium for storing the method
JP2001184512A (en) * 1999-12-24 2001-07-06 Sanyo Electric Co Ltd Device and method for processing image and recording medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3735893B2 (en) * 1995-06-22 2006-01-18 セイコーエプソン株式会社 Face image processing method and face image processing apparatus
US5642441A (en) * 1995-10-24 1997-06-24 Neopath, Inc. Separation apparatus and method for measuring focal plane
US6330348B1 (en) * 1999-01-21 2001-12-11 Resolution Sciences Corporation Method and apparatus for measurement of microtome performance


Also Published As

Publication number Publication date
JP2005165983A (en) 2005-06-23
US20060010582A1 (en) 2006-01-19
TW200527319A (en) 2005-08-16

Similar Documents

Publication Publication Date Title
TWI550549B (en) Image processing device and image processing method
KR100480781B1 (en) Method of extracting teeth area from teeth image and personal identification method and apparatus using teeth image
JP4505362B2 (en) Red-eye detection apparatus and method, and program
JP4121026B2 (en) Imaging apparatus and method, and program
US7460705B2 (en) Head-top detecting method, head-top detecting system and a head-top detecting program for a human face
US20050196044A1 (en) Method of extracting candidate human region within image, system for extracting candidate human region, program for extracting candidate human region, method of discerning top and bottom of human image, system for discerning top and bottom, and program for discerning top and bottom
US9858680B2 (en) Image processing device and imaging apparatus
JP4739870B2 (en) Sunglasses detection device and face center position detection device
US20050220346A1 (en) Red eye detection device, red eye detection method, and recording medium with red eye detection program
CN107368806B (en) Image rectification method, image rectification device, computer-readable storage medium and computer equipment
JP2007114029A (en) Face center position detector, face center position detection method, and program
JP2007272435A (en) Face feature extraction device and face feature extraction method
JP3459950B2 (en) Face detection and face tracking method and apparatus
CN112001853A (en) Image processing apparatus, image processing method, image capturing apparatus, and storage medium
WO2005055144A1 (en) Person face jaw detection method, jaw detection system, and jaw detection program
JP2005134966A (en) Face image candidate area retrieval method, retrieval system and retrieval program
JP2014102713A (en) Face component extraction device, face component extraction method, and program
RU2329535C2 (en) Method of automatic photograph framing
JP2005316958A (en) Red eye detection device, method, and program
EP2541469A2 (en) Image recognition device, image recognition method and image recognition program
JP2007048108A (en) Image evaluation system, image evaluation method and image evaluation program
JP3963789B2 (en) Eye detection device, eye detection program, recording medium for recording the program, and eye detection method
JP2007219899A (en) Personal identification device, personal identification method, and personal identification program
JP5995610B2 (en) Subject recognition device and control method therefor, imaging device, display device, and program
JP5985327B2 (en) Display device

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

122 Ep: pct application non-entry in european phase

Ref document number: 04819981

Country of ref document: EP

Kind code of ref document: A1

WWW Wipo information: withdrawn in national office

Ref document number: 4819981

Country of ref document: EP